Fig. 2
From: Experimenting sensitivity-based anonymization framework in apache spark

Comparison of how Hadoop and Spark use memory and disk: Hadoop is slower than Spark because it processes each job in two stages, map and reduce, and every pair of stages must write its output to disk. Spark, on the other hand, operates in memory and is therefore faster