Table 10 R-squared values for extrapolation on size

From: Runtime prediction of big data jobs: performance comparison of machine learning algorithms and analytical models

| Workload | Wordcount | SVM | Pagerank | Kmeans | Graph (NWeight) |
| --- | --- | --- | --- | --- | --- |
| f(S) | linear | quadratic | linear | linear | quadratic |
| Amdahl equation (1) | 0.998 ± 0.000 | 0.965 ± 0.001 | 0.994 ± 0.000 | 0.997 ± 0.001 | 0.937 ± 0.006 |
| Gustafson equation (2) | 0.996 ± 0.001 | 0.949 ± 0.004 | 0.994 ± 0.000 | 0.996 ± 0.001 | 0.913 ± 0.008 |
| ERNEST equation (3) | 0.996 ± 0.001 | 0.958 ± 0.002 | 0.990 ± 0.000 | 0.998 ± 0.001 | 0.921 ± 0.008 |
| 2D plate equation (4) | 0.997 ± 0.001 | 0.951 ± 0.003 | 0.993 ± 0.000 | 0.997 ± 0.001 | 0.940 ± 0.005 |
| Connected graph equation (5) | 0.257 ± 0.061 | 0.981 ± 0.001 | 0.996 ± 0.000 | 0.996 ± 0.001 | 0.940 ± 0.006 |
| Con. graph \(c=1\) equation (6) | 0.997 ± 0.001 | 0.978 ± 0.001 | 0.996 ± 0.000 | 0.998 ± 0.001 | 0.940 ± 0.006 |
| Kernel ridge regression | 0.836 ± 0.011 | 0.745 ± 0.004 | 0.836 ± 0.011 | 0.836 ± 0.011 | 0.904 ± 0.043 |
| Gradient boost regression | 0.875 ± 0.005 | 0.690 ± 0.003 | 0.875 ± 0.005 | 0.875 ± 0.005 | 0.775 ± 0.009 |

  1. Bold values indicate the largest R-squared value in each column
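
For readers who want to reproduce a score of this kind, the following is a minimal sketch of the standard coefficient of determination, assuming the usual definition R² = 1 − SS_res / SS_tot. The table does not state which exact formula or library produced its values, so this may differ from the authors' procedure. The function name `r_squared` and the sample runtimes are hypothetical and are not taken from the paper.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumes the textbook definition; the source table does not
    specify how its R-squared values were computed.
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("observed and predicted must be non-empty and equal in length")

    mean_observed = sum(observed) / len(observed)
    # Residual sum of squares: squared error of the model's predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: squared deviation of observations from their mean.
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when all observations are identical")

    return 1.0 - ss_res / ss_tot


# Hypothetical runtimes in seconds (illustrative only, not from the paper).
observed_runtimes = [10.0, 20.0, 30.0, 40.0]
predicted_runtimes = [11.0, 19.0, 31.0, 39.0]
score = r_squared(observed_runtimes, predicted_runtimes)
print(f"R-squared: {score:.3f}")
```

A value close to 1 means the predicted runtimes track the observed ones closely. Under this definition the score can also fall below 0 when a model predicts worse than the mean of the observations.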