From: DLA-E: a deep learning accelerator for endoscopic images classification
DNN | MobileNet V2 | MobileNet V2 | MobileNet V2 | ResNet | ResNet | ResNet | VGG19 | VGG19 | VGG19 |
---|---|---|---|---|---|---|---|---|---|
Dataflow | kcp_ws | xp_ws | rs | kcp_ws | xp_ws | rs | kcp_ws | xp_ws | rs |
Number of MACs | 2.98 × 10^8 | 2.98 × 10^8 | 2.98 × 10^8 | 1.01 × 10^10 | 1.01 × 10^10 | 1.01 × 10^10 | 1.77 × 10^10 | 1.77 × 10^10 | 1.77 × 10^10 |
Avg L1 Size Requirement | 7 | 7 | 22 | 8 | 8 | 37 | 15 | 15 | 59 |
Max L1 Size Requirement | 18 | 18 | 96 | 98 | 98 | 224 | 18 | 18 | 96 |
Avg L2 Size Requirement | 977 | 887 | 1,087 | 765 | 79 | 289 | 1,577 | 367 | 705 |
Max L2 Size Requirement | 4,608 | 4,608 | 5,408 | 4,608 | 10,766 | 3,528 | 4,608 | 3,996 | 2,720 |
Avg Number of Utilized PEs | 208 | 76 | 77 | 254 | 19 | 23 | 243 | 54 | 99 |
Max Number of Utilized PEs | 256 | 240 | 240 | 256 | 110 | 191 | 256 | 222 | 222 |
Avg NoC Bandwidth | 49.23 | 44.91 | 43.16 | 30.01 | 13.66 | 13.91 | 40.48 | 6.66 | 10.91 |
Max NoC Bandwidth | 158.40 | 143.97 | 143.97 | 159 | 56 | 52.71 | 159.92 | 25 | 24.16 |
Avg Throughput (MACs / Cycle) | 33.92 | 22.38 | 20.38 | 41.75 | 18.32 | 28.71 | 39.21 | 30.27 | 96.97 |
Max Throughput (MACs / Cycle) | 48.00 | 56.00 | 146.67 | 42.67 | 56.00 | 197.73 | 42.67 | 54 | 214.47 |
Total Throughput (MACs / Cycle) | 3,596.52 | 2,373 | 2,418 | 13,026 | 5,717 | 8,959 | 1,490 | 1,150 | 3,685 |
Total Energy Consumption (× MAC Energy) | 4.56 × 10^9 | 1.31 × 10^10 | 8.40 × 10^9 | 1.21 × 10^11 | 3.18 × 10^11 | 2.10 × 10^11 | 1.75 × 10^11 | 2.26 × 10^11 | 1.93 × 10^11 |
Total Runtime (Cycles) | 1.73 × 10^7 | 4.48 × 10^7 | 4.75 × 10^7 | 5.00 × 10^8 | 1.41 × 10^9 | 1.01 × 10^9 | 8.85 × 10^8 | 1.30 × 10^9 | 6.12 × 10^8 |
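
The totals in the table can be compared across dataflows with a few lines of arithmetic. The sketch below is illustrative only and is not part of the original work: it transcribes the MAC count, total energy, and total runtime rows, reading the exponents as powers of ten (for example, 1.73e7 cycles for MobileNet V2 under kcp_ws). The helper names `best_dataflow`, `ratio_to_best`, and `energy_per_mac` are hypothetical.

```python
"""Compare dataflows per network using totals transcribed from the table."""

# Total runtime in cycles, keyed by network and then by dataflow.
RUNTIME_CYCLES = {
    "MobileNet V2": {"kcp_ws": 1.73e7, "xp_ws": 4.48e7, "rs": 4.75e7},
    "ResNet": {"kcp_ws": 5.00e8, "xp_ws": 1.41e9, "rs": 1.01e9},
    "VGG19": {"kcp_ws": 8.85e8, "xp_ws": 1.30e9, "rs": 6.12e8},
}

# Total energy, expressed as a multiple of the energy of one MAC operation.
ENERGY_X_MAC = {
    "MobileNet V2": {"kcp_ws": 4.56e9, "xp_ws": 1.31e10, "rs": 8.40e9},
    "ResNet": {"kcp_ws": 1.21e11, "xp_ws": 3.18e11, "rs": 2.10e11},
    "VGG19": {"kcp_ws": 1.75e11, "xp_ws": 2.26e11, "rs": 1.93e11},
}

# Number of MACs per network (identical for every dataflow in the table).
NUM_MACS = {"MobileNet V2": 2.98e8, "ResNet": 1.01e10, "VGG19": 1.77e10}


def best_dataflow(metric):
    """Return, for each network, the dataflow with the smallest value."""
    return {dnn: min(row, key=row.get) for dnn, row in metric.items()}


def ratio_to_best(metric):
    """Return each value divided by the smallest value for its network."""
    return {
        dnn: {df: value / min(row.values()) for df, value in row.items()}
        for dnn, row in metric.items()
    }


def energy_per_mac():
    """Return total energy divided by the MAC count, per network and dataflow."""
    return {
        dnn: {df: energy / NUM_MACS[dnn] for df, energy in row.items()}
        for dnn, row in ENERGY_X_MAC.items()
    }


if __name__ == "__main__":
    print("Lowest runtime:", best_dataflow(RUNTIME_CYCLES))
    print("Lowest energy: ", best_dataflow(ENERGY_X_MAC))
    for dnn, row in ratio_to_best(RUNTIME_CYCLES).items():
        print(dnn, {df: round(r, 2) for df, r in row.items()})
```

With the values as transcribed, kcp_ws has the lowest total runtime for MobileNet V2 and ResNet, rs has the lowest for VGG19, and kcp_ws has the lowest total energy for all three networks.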