Skip to main content

Table 4 Weak scaling: d2o’s relative performance to the single-process case when increasing both, the number of processes and the global array size proportionally

From: d2o: a distributed data object for parallel high-performance computing in Python

Process count

1 (%)

2 (%)

3 (%)

4 (%)

S (%)

16 (%)

32 (%)

64 (%)

128 (%)

256 (%)

initialization

100.0

90.9

87.9

87.8

74.6

67.6

54.9

45.7

34.6

19.9

copy .empty

100.0

97.5

96.2

97.5

97.6

103.6

97.8

97.7

97.6

95.1

max

100.0

97.5

96.6

95.6

90.9

84.0

72.1

56.2

39.1

24.3

sum

100.0

98.0

95.3

93.5

87.3

79.2

65.1

48.3

32.2

19.2

sum(axis\(\,=\,\)0)

100.0

100.2

96.7

96.5

90.9

78.1

74.6

58.0

42.7

28.2

sum(axis\(\,=\,\)1)

100.0

105.2

103.2

102.2

100.6

100.0

98.3

95.8

93.2

88.6

obj[::-2]

100.0

70.4

65.9

64.0

46.2

46.6

42.8

33.6

31.1

25.3

copy

100.0

104.7

103.1

101.3

101.3

105.3

101.4

101.2

101.3

101.5

obj \(+\) 0

100.0

105.1

102.6

100.6

99.9

103.5

100.2

100.0

99.7

100.1

obj \(+\) obj

100.0

105.2

102.5

100.1

100.0

103.7

100.1

100.1

99.8

100.2

obj \(+\) = obj

100.0

102.3

99.3

98.6

98.2

101.8

98.2

98.2

98.2

98.4

sqrt

100.0

102.0

100.6

100.1

99.6

99.1

99.2

99.2

8.6

98.0

bincount

100.0

103.0

101.2

99.9

98.8

97.6

94.1

88.3

79.4

65.8

  1. The arrays used for this tests had the global shape \((n*2048, 2048)\) with n being the number of processes. By this the local data size was fixed to \(2^{22}\) elements, which is equal to \(32~\mathrm {MiB}\). “100 %” in the table corresponds to the case were the speedup is equal to the number of processes. Example: the \(95.1\,\%\) for copy_empty on 256 processes correspond to a speedup-factor of 243.5. In order to guide the eye, values \(<30\,\%\) are printed italic, values \(\ge 90\,\%\) are printed bold-italics. Please see "Weak scaling: proportional number of processes and size of data" section for discussion