Fig. 2
From: Accelerating neural network training with distributed asynchronous and selective optimization (DASO)

Local synchronization. Schematic of the local synchronization step for a single node with four GPUs. The gradients from each GPU are averaged, then each GPU’s gradients are set to the result.
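
The step described in the caption can be illustrated with a minimal sketch in plain Python. It assumes each GPU's gradients are a flat list of floats; the function name `local_sync` and the sample values are hypothetical and do not come from the article. In practice this would likely be done with a collective operation such as an all-reduce across the node's GPUs rather than Python lists.

```python
def local_sync(per_gpu_grads):
    """Average gradients element-wise across GPUs, then give every GPU the result.

    per_gpu_grads: one list of floats per GPU, all of the same length
    (an assumption made for this sketch).
    """
    n_gpus = len(per_gpu_grads)
    # Element-wise mean over the GPUs on this node.
    averaged = [sum(values) / n_gpus for values in zip(*per_gpu_grads)]
    # Each GPU receives its own copy of the averaged gradients.
    return [list(averaged) for _ in range(n_gpus)]


# Hypothetical gradients for a single node with four GPUs.
grads = [
    [1.0, 2.0],
    [3.0, 4.0],
    [5.0, 6.0],
    [7.0, 8.0],
]
synced = local_sync(grads)
print(synced[0])  # [4.0, 5.0]
```

After the call, all four entries of `synced` hold the same values, which mirrors the caption: every GPU ends the step with the node-wide average.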