Table 6 Best execution times of MapReduce and Spark with the TeraSort workload

From: A comprehensive performance analysis of Apache Hadoop and Apache Spark for large scale data sets using HiBench

 

Configuration                       Split sizes (MB)    Execution time (s)
MapReduce input splits (TeraSort)   256                 21,014
Spark input splits (TeraSort)       512 & 1024          3780 & 3439
MapReduce shuffle (TeraSort)        150 & 45            24,250
Spark shuffle (TeraSort)            128 & 192           6540