Fig. 7 From: Large scale performance analysis of distributed deep learning frameworks for convolutional neural networks

Parallel efficiency of Horovod and PyTorch-DDP on up to 1024 GPUs training a ResNet101, using either the DALI data loader with the compressed ImageNet dataset or the native PyTorch data loader with the uncompressed ImageNet dataset, averaged over three runs. The black line denotes the ideal case. The variance between runs is small (in general \(<5\%\)) and is therefore not shown.
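The parallel efficiency plotted here can be sketched with the standard strong-scaling definition: measured speedup divided by ideal (linear) speedup relative to a baseline GPU count. This is an illustrative assumption, since the caption does not state the exact formula the authors used; the function name and example numbers below are hypothetical.

```python
def parallel_efficiency(time_baseline: float, time_n: float,
                        gpus_baseline: int, gpus_n: int) -> float:
    """Efficiency relative to ideal linear scaling from the baseline run.

    Assumes a fixed total workload (strong scaling), so speedup is the
    ratio of wall-clock times. An equivalent formulation uses throughput
    (images/s) instead of time.
    """
    speedup = time_baseline / time_n          # measured speedup
    ideal_speedup = gpus_n / gpus_baseline    # linear-scaling speedup
    return speedup / ideal_speedup

# Hypothetical example: 4x the GPUs but only 3.2x faster -> 80% efficiency
print(parallel_efficiency(100.0, 31.25, 256, 1024))
```

An efficiency of 1.0 corresponds to the ideal (black) line in the figure; data-loading overhead is one common reason curves fall below it at large GPU counts.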