
Table 1 Hyperparameters used to train ResNet-50 using the ImageNet-2012 dataset

From: Accelerating neural network training with distributed asynchronous and selective optimization (DASO)

Data Loader: DALI [37]
Local Optimizer: SGD
Local Optimizer Parameters: momentum 0.9, weight decay 0.0001
Epochs: 90
Learning Rate (LR) Decay: reduce on stable
LR Decay Parameters: stable epochs before change 5, decay factor 0.5
LR Warmup Phase: 5 epochs, see Goyal et al. [38]
Maximum LR: scaled by the number of GPUs [38]
Loss Function: cross entropy
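The learning-rate schedule in the table combines a 5-epoch linear warmup, a maximum LR scaled by GPU count (Goyal et al. [38]), and a "reduce on stable" decay (halve the LR after 5 epochs without improvement). A minimal plain-Python sketch of that schedule is below; `base_lr` and `world_size` are illustrative assumptions, not values taken from the paper.

```python
def warmup_lr(epoch, base_lr, world_size, warmup_epochs=5):
    """Linear warmup from base_lr toward base_lr * world_size over
    warmup_epochs, per Goyal et al. [38]; afterwards return the scaled
    maximum LR. base_lr is an assumed starting value, not from Table 1."""
    max_lr = base_lr * world_size  # maximum LR scaled by number of GPUs
    if epoch < warmup_epochs:
        # Linear ramp: epoch 0 starts at base_lr, max_lr reached at warmup end.
        return base_lr + (max_lr - base_lr) * epoch / warmup_epochs
    return max_lr

class ReduceOnStable:
    """'Reduce on stable' decay: multiply the LR by `factor` (0.5 in
    Table 1) once the monitored loss has not improved for `patience`
    (5 in Table 1) consecutive epochs."""
    def __init__(self, lr, patience=5, factor=0.5):
        self.lr = lr
        self.patience = patience
        self.factor = factor
        self.best = float("inf")
        self.stale = 0

    def step(self, val_loss):
        if val_loss < self.best:
            self.best = val_loss
            self.stale = 0
        else:
            self.stale += 1
            if self.stale >= self.patience:
                self.lr *= self.factor  # decay factor 0.5
                self.stale = 0
        return self.lr

# Illustrative usage with assumed base_lr=0.1 on 16 GPUs:
lr_epoch0 = warmup_lr(0, base_lr=0.1, world_size=16)  # 0.1
lr_epoch5 = warmup_lr(5, base_lr=0.1, world_size=16)  # 1.6 (= 0.1 * 16)
scheduler = ReduceOnStable(lr_epoch5)
```

This mirrors PyTorch's `ReduceLROnPlateau` behavior (the likely mechanism behind "Reduce on Stable", though the paper does not name the implementation) without depending on the framework.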