Table 10 R-squared values for extrapolation on size

From: Runtime prediction of big data jobs: performance comparison of machine learning algorithms and analytical models

| Workload | Wordcount | SVM | Pagerank | Kmeans | Graph (NWeight) |
|---|---|---|---|---|---|
| f(S) | linear | quadratic | linear | linear | quadratic |
| Amdahl equation (1) | 0.998 ± 0.000 | 0.965 ± 0.001 | 0.994 ± 0.000 | 0.997 ± 0.001 | 0.937 ± 0.006 |
| Gustafson equation (2) | 0.996 ± 0.001 | 0.949 ± 0.004 | 0.994 ± 0.000 | 0.996 ± 0.001 | 0.913 ± 0.008 |
| ERNEST equation (3) | 0.996 ± 0.001 | 0.958 ± 0.002 | 0.990 ± 0.000 | 0.998 ± 0.001 | 0.921 ± 0.008 |
| 2D plate equation (4) | 0.997 ± 0.001 | 0.951 ± 0.003 | 0.993 ± 0.000 | 0.997 ± 0.001 | 0.940 ± 0.005 |
| Connected graph equation (5) | 0.257 ± 0.061 | 0.981 ± 0.001 | 0.996 ± 0.000 | 0.996 ± 0.001 | 0.940 ± 0.006 |
| Con. graph \(c=1\) equation (6) | 0.997 ± 0.001 | 0.978 ± 0.001 | 0.996 ± 0.000 | 0.998 ± 0.001 | 0.940 ± 0.006 |
| Kernel ridge regression | 0.836 ± 0.011 | 0.745 ± 0.004 | 0.836 ± 0.011 | 0.836 ± 0.011 | 0.904 ± 0.043 |
| Gradient boost regression | 0.875 ± 0.005 | 0.690 ± 0.003 | 0.875 ± 0.005 | 0.875 ± 0.005 | 0.775 ± 0.009 |
  1. Bold values in each column indicate the largest R-squared value in that column