Table 1 Features of the systems

From: Programming big data analysis: principles and solutions

| System | Programming model | Type of parallelism | Level of abstraction | Verbosity | Class of applications |
|--------|-------------------|---------------------|----------------------|-----------|-----------------------|
| Hadoop | MapReduce | Data | Low | High | General-purpose (batch processing) |
| Spark | Workflow | Data/Task | Low | Low | General-purpose (batch and stream processing, machine learning, graph analysis, structured data analysis) |
| Storm | Workflow | Data/Task/Pipeline | Medium | Medium | Stream processing (real-time) |
| Hama | BSP | Data | Low | Medium | Massive scientific computations (matrix computation, graph analysis, machine and deep learning) |
| MPI | Message passing | Data | Low | Low | General-purpose (iterative parallel applications) |
| Hive | SQL-Like | Data | High | Low | Data querying and reporting |
| Pig | SQL-Like | Data/Task | Medium | Low | Data querying and analysis |