d2o: a distributed data object for parallel high-performance computing in Python

Journal of Big Data

Table 4 Weak scaling: d2o’s performance relative to the single-process case when increasing both the number of processes and the global array size proportionally

Process count	1 (%)	2 (%)	3 (%)	4 (%)	8 (%)	16 (%)	32 (%)	64 (%)	128 (%)	256 (%)
initialization	***100.0***	***90.9***	87.9	87.8	74.6	67.6	54.9	45.7	34.6	*19.9*
copy_empty	***100.0***	***97.5***	***96.2***	***97.5***	***97.6***	***103.6***	***97.8***	***97.7***	***97.6***	***95.1***
max	***100.0***	***97.5***	***96.6***	***95.6***	***90.9***	84.0	72.1	56.2	39.1	*24.3*
sum	***100.0***	***98.0***	***95.3***	***93.5***	87.3	79.2	65.1	48.3	32.2	*19.2*
sum(axis=0)	***100.0***	***100.2***	***96.7***	***96.5***	***90.9***	78.1	74.6	58.0	42.7	*28.2*
sum(axis=1)	***100.0***	***105.2***	***103.2***	***102.2***	***100.6***	***100.0***	***98.3***	***95.8***	***93.2***	88.6
obj[::-2]	***100.0***	70.4	65.9	64.0	46.2	46.6	42.8	33.6	31.1	*25.3*
copy	***100.0***	***104.7***	***103.1***	***101.3***	***101.3***	***105.3***	***101.4***	***101.2***	***101.3***	***101.5***
obj + 0	***100.0***	***105.1***	***102.6***	***100.6***	***99.9***	***103.5***	***100.2***	***100.0***	***99.7***	***100.1***
obj + obj	***100.0***	***105.2***	***102.5***	***100.1***	***100.0***	***103.7***	***100.1***	***100.1***	***99.8***	***100.2***
obj += obj	***100.0***	***102.3***	***99.3***	***98.6***	***98.2***	***101.8***	***98.2***	***98.2***	***98.2***	***98.4***
sqrt	***100.0***	***102.0***	***100.6***	***100.1***	***99.6***	***99.1***	***99.2***	***99.2***	***98.6***	***98.0***
bincount	***100.0***	***103.0***	***101.2***	***99.9***	***98.8***	***97.6***	***94.1***	88.3	79.4	65.8

The arrays used for these tests had the global shape \((n \cdot 2048, 2048)\), with \(n\) being the number of processes. The local data size was thereby fixed to \(2^{22}\) elements, which is equal to \(32~\mathrm{MiB}\). “100 %” in the table corresponds to the case where the speedup is equal to the number of processes. Example: the \(95.1\,\%\) for copy_empty on 256 processes corresponds to a speedup factor of 243.5. In order to guide the eye, values \(<30\,\%\) are printed in italics and values \(\ge 90\,\%\) are printed in bold italics. Please see the "Weak scaling: proportional number of processes and size of data" section for discussion.
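
The arithmetic behind this note can be reproduced with a few lines of Python. This is a minimal sketch rather than part of d2o: the helper names `global_shape`, `local_size_mib` and `speedup` are hypothetical, and the default of 8 bytes per element is an assumption (64-bit values), chosen because it is consistent with the stated \(32~\mathrm{MiB}\) per process.

```python
# Illustrative helpers only; none of these names belong to the d2o API.

ROWS_PER_PROCESS = 2048  # rows held by each process
COLUMNS = 2048           # columns of the global array


def global_shape(n_processes):
    """Global array shape (n * 2048, 2048) for n processes."""
    return (n_processes * ROWS_PER_PROCESS, COLUMNS)


def local_size_mib(bytes_per_element=8):
    """Per-process data size in MiB.

    bytes_per_element=8 is an assumption (64-bit elements).
    """
    n_elements = ROWS_PER_PROCESS * COLUMNS  # 2**22 elements
    return n_elements * bytes_per_element / 2**20


def speedup(relative_performance_percent, n_processes):
    """Convert a table entry (in percent) into a speedup factor.

    100 % means the speedup equals the number of processes.
    """
    return relative_performance_percent / 100.0 * n_processes


if __name__ == "__main__":
    print(global_shape(256))     # shape of the global array on 256 processes
    print(local_size_mib())      # local data size in MiB
    print(speedup(95.1, 256))    # about 243.46, quoted as 243.5 in the note
```

Read this way, an entry such as the 19.9 % for initialization on 256 processes still corresponds to a speedup factor of roughly 51 over the single-process case.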