From: A survey of open source tools for machine learning with big data in the Hadoop ecosystem
Processing engine | Current stable release (as of June 1, 2015) | Execution model | Supported languages | Associated ML tools | In-memory processing | Low latency | Fault tolerance | Enterprise support |
---|---|---|---|---|---|---|---|---|
MapReduce | 2.7.0 | Batch | Java | Mahout | × | × | ✓ | × |
Spark | 1.3.1 | Batch, streaming | Java, Python, R, Scala | MLlib, Mahout, H2O | ✓ | ✓ | ✓ | ✓ |
Flink | 0.8.1 | Batch, streaming | Java, Scala | Flink-ML, SAMOA | ✓ | ✓ | ✓ | × |
Storm | 0.9.4 | Streaming | Any | SAMOA | ✓ | ✓ | ✓ | × |
H2O | 3.0.0.12 | Batch | Java, Python, R, Scala | H2O, Mahout, MLlib | ✓ | ✓ | ✓ | ✓ |