Table 8 R-squared values for a different set of workloads and models

From: Runtime prediction of big data jobs: performance comparison of machine learning algorithms and analytical models

| Workload | Wordcount | SVM | Pagerank | Kmeans | Graph (NWeight) |
| --- | --- | --- | --- | --- | --- |
| f(S) | linear | quadratic | linear | linear | quadratic |
| Amdahl equation (1) | 0.995 ± 0.000 | 0.908 ± 0.005 | 0.990 ± 0.000 | 0.992 ± 0.002 | 0.901 ± 0.012 |
| Gustafson equation (2) | 0.995 ± 0.000 | 0.888 ± 0.002 | 0.988 ± 0.000 | 0.992 ± 0.000 | 0.898 ± 0.013 |
| ERNEST equation (3) | 0.994 ± 0.000 | 0.848 ± 0.001 | 0.987 ± 0.000 | 0.992 ± 0.002 | 0.916 ± 0.003 |
| 2D plate equation (4) | 0.995 ± 0.000 | 0.916 ± 0.005 | 0.990 ± 0.000 | 0.992 ± 0.002 | 0.918 ± 0.009 |
| Connected graph equation (5) | 0.995 ± 0.001 | 0.918 ± 0.005 | 0.989 ± 0.000 | 0.992 ± 0.002 | 0.911 ± 0.005 |
| Connected graph \(c=1\) equation (6) | 0.995 ± 0.001 | 0.914 ± 0.005 | 0.989 ± 0.000 | 0.992 ± 0.002 | 0.911 ± 0.005 |
| Kernel ridge regression | 0.974 ± 0.002 | 0.934 ± 0.001 | 0.977 ± 0.000 | 0.981 ± 0.005 | 0.945 ± 0.009 |
| Gradient boost regression | 0.998 ± 0.000 | 0.995 ± 0.001 | 0.999 ± 0.000 | 0.997 ± 0.001 | 0.986 ± 0.003 |
  1. The largest R-squared value in each column (set in bold in the original table) belongs to gradient boost regression in all five workloads
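
For readers less familiar with the metric, the R-squared (coefficient of determination) values above measure how closely predicted runtimes track measured ones. The following is a minimal sketch of the standard definition only; the function name `r_squared` is hypothetical, the runtimes are made-up illustrative numbers rather than data from the paper, and nothing here reproduces the authors' evaluation procedure or how the ± spread was obtained.

```python
def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(measured) != len(predicted):
        raise ValueError("measured and predicted must have the same length")
    mean = sum(measured) / len(measured)
    # Residual sum of squares: error left after the model's prediction.
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    # Total sum of squares: variance of the measurements around their mean.
    ss_tot = sum((m - mean) ** 2 for m in measured)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when all measurements are equal")
    return 1.0 - ss_res / ss_tot


# Illustrative (made-up) runtimes in seconds, NOT taken from the paper.
measured = [100.0, 60.0, 40.0, 30.0]
predicted = [98.0, 62.0, 41.0, 29.0]
score = r_squared(measured, predicted)
print(f"R-squared = {score:.3f}")
```

A value of 1 means the predictions match the measurements exactly, so the closer an entry in the table is to 1, the better that model fits the workload's runtimes.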