Table 8 R-squared values for a different set of workloads and models

From: Runtime prediction of big data jobs: performance comparison of machine learning algorithms and analytical models

| Workload | Wordcount | SVM | Pagerank | Kmeans | Graph (NWeight) |
| --- | --- | --- | --- | --- | --- |
| f(S) | linear | quadratic | linear | linear | quadratic |
| Amdahl equation (1) | 0.995 ± 0.000 | 0.908 ± 0.005 | 0.990 ± 0.000 | 0.992 ± 0.002 | 0.901 ± 0.012 |
| Gustafson equation (2) | 0.995 ± 0.000 | 0.888 ± 0.002 | 0.988 ± 0.000 | 0.992 ± 0.000 | 0.898 ± 0.013 |
| ERNEST equation (3) | 0.994 ± 0.000 | 0.848 ± 0.001 | 0.987 ± 0.000 | 0.992 ± 0.002 | 0.916 ± 0.003 |
| 2D plate equation (4) | 0.995 ± 0.000 | 0.916 ± 0.005 | 0.990 ± 0.000 | 0.992 ± 0.002 | 0.918 ± 0.009 |
| Connected graph equation (5) | 0.995 ± 0.001 | 0.918 ± 0.005 | 0.989 ± 0.000 | 0.992 ± 0.002 | 0.911 ± 0.005 |
| Connected graph, \(c=1\), equation (6) | 0.995 ± 0.001 | 0.914 ± 0.005 | 0.989 ± 0.000 | 0.992 ± 0.002 | 0.911 ± 0.005 |
| Kernel ridge regression | 0.974 ± 0.002 | 0.934 ± 0.001 | 0.977 ± 0.000 | 0.981 ± 0.005 | 0.945 ± 0.009 |
| Gradient boost regression | **0.998 ± 0.000** | **0.995 ± 0.001** | **0.999 ± 0.000** | **0.997 ± 0.001** | **0.986 ± 0.003** |

  1. The bold data in each column indicates the largest R-squared value in the corresponding column
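
For readers who want to see how a single R-squared value of the kind tabulated here is obtained, the following is a minimal sketch using the standard definition of the coefficient of determination, 1 - SS_res / SS_tot. The function name `r_squared` and the runtime numbers are hypothetical and chosen only for illustration; they are not data from the paper, and the sketch does not reproduce how the authors fitted their models or derived the ± terms.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equally sized sequences with at least two points")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: how far the predictions are from the observations.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: how far the observations are from their own mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R-squared is undefined")
    return 1.0 - ss_res / ss_tot


# Hypothetical job runtimes in seconds (illustrative only, not from the paper).
observed = [120.0, 65.0, 47.0, 38.0]
predicted = [118.0, 67.0, 46.0, 39.0]

score = r_squared(observed, predicted)
print(f"R-squared: {score:.3f}")
```

A value close to 1 means the model's predicted runtimes explain almost all of the variance in the measured runtimes, which is how the entries in the table can be read.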