From: Block size estimation for data partitioning in HPC applications using machine learning techniques
Algorithm | Dataset rows | Dataset columns | Dataset size (GB) | Infrastructure: # nodes | Infrastructure: # cores | Infrastructure: RAM | Best partitioning: \(p_r^*\) | Best partitioning: \(p_c^*\) |
---|---|---|---|---|---|---|---|---|
K-means | 500,000 | 1000 | 2.39 | 4 | 64 | 256 | 32 | 4 |
Random Forest | 1000 | 500,000 | 2.92 | 4 | 64 | 256 | 32 | 8 |
SVM | 10,000 | 10,000 | 1.1 | 4 | 64 | 256 | 16 | 16 |