Fig. 27 | Journal of Big Data

Fig. 27

From: Cumulative deviation of a subpopulation from the full population

Calibration of the full ImageNet-1000 training data set (the scores are the probabilities); \(n =\) 1,281,167; Kuiper’s statistic is \(0.03306 / \sigma = 111.7\), and the Kolmogorov-Smirnov statistic is likewise \(0.03306 / \sigma = 111.7\). Among the reliability diagrams, the one with 100 bins, each containing roughly the same number of observations, looks best; the reliability diagrams with 1000 bins each display spurious stochastic variations. Only the cumulative plot conveniently reveals that a third of all observations (specifically, those with probabilities of at least 0.97) are well-calibrated. The scalar summary statistics report highly statistically significant miscalibration, owing to the large number of observations; the actual effect size is more modest, as seen from the values before dividing by \(\sigma \) (namely, 0.03306).
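
To make the scalar summaries in the caption concrete, the following is a minimal sketch of how such statistics might be computed from predicted probabilities and binary outcomes. It assumes the usual cumulative-difference construction: sort observations by score, accumulate the differences between outcomes and scores divided by \(n\), take Kuiper's statistic as the range of the cumulative curve (including its starting value 0) and the Kolmogorov-Smirnov statistic as its maximum absolute value, and take \(\sigma\) as the standard deviation of the final cumulative sum under perfect calibration. The function name `cumulative_calibration_stats` and its arguments are hypothetical and are not taken from the article's software.

```python
import numpy as np


def cumulative_calibration_stats(scores, outcomes):
    """Return (kuiper, kolmogorov_smirnov, sigma) for binary outcomes.

    Assumed definitions (not confirmed by this page):
      C_k   = (1/n) * sum_{j<=k} (R_j - S_j), with observations sorted by score
      Kuiper = max_k C_k - min_k C_k, with C_0 = 0 included
      KS     = max_k |C_k|
      sigma  = sqrt(sum_j S_j * (1 - S_j)) / n
    """
    scores = np.asarray(scores, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    n = scores.size

    # Order the observations by predicted probability.
    order = np.argsort(scores, kind="stable")
    s = scores[order]
    r = outcomes[order]

    # Cumulative differences between outcomes and scores, starting from 0.
    c = np.concatenate(([0.0], np.cumsum(r - s) / n))

    kuiper = c.max() - c.min()
    kolmogorov_smirnov = np.abs(c).max()

    # Scale of random fluctuation expected if the scores were perfectly calibrated.
    sigma = np.sqrt(np.sum(s * (1.0 - s))) / n
    return kuiper, kolmogorov_smirnov, sigma


# Toy data only: four observations, all scored 0.5.
kuiper, ks, sigma = cumulative_calibration_stats(
    [0.5, 0.5, 0.5, 0.5], [1, 1, 0, 0]
)
```

Under these assumed definitions, the two statistics coincide whenever the cumulative curve stays on one side of zero, which would be consistent with the identical values reported in the caption. Dividing either statistic by `sigma` gives the significance-style ratio; the undivided value is the effect size.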