From: A comprehensive performance analysis of Apache Hadoop and Apache Spark for large scale data sets using HiBench
Configuration                       Split sizes (MB)    Execution time (s)
MapReduce input splits (TeraSort)   256                 21,014
Spark input splits (TeraSort)       512 & 1024          3780 & 3439
MapReduce shuffle (TeraSort)        150 & 45            24,250
Spark shuffle (TeraSort)            128 & 192           6540