Fig. 1 | Journal of Big Data

Fig. 1

From: Cumulative deviation of a subpopulation from the full population

\(n = 5000\); Kuiper’s statistic is \(0.2037 / \sigma = 34.34\), and that of Kolmogorov and Smirnov is \(0.2033 / \sigma = 34.27\). Figure 2 displays the ground-truth reliability diagram. The reliability diagram with 50 bins, each containing the same number of scores from the subpopulation, detects the notch around scores of 0.25; however, the oscillation of the bin frequencies for the subpopulation makes it hard to disentangle real variations from statistical noise. The reliability diagrams with only 10 bins each exhibit fewer random oscillations but smear out the notch. In the reliability diagrams, the averages for the subpopulation are black, while the averages for the full population are gray. In the top row, the plot of cumulative deviation resolves the notch cleanly while displaying minimal random fluctuations across the full range of scores. The scalar summary statistics of Kuiper and of Kolmogorov and Smirnov both detect the statistically significant deviation of the subpopulation from the full population: each is more than 34 times \(\sigma\).
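As a rough illustration of how the two scalar summaries relate to a cumulative-deviation curve, the following minimal sketch computes both from a sequence of per-observation differences ordered by score. The function name `scalar_summaries`, the toy input, and the uniform \(1/n\) weighting are assumptions made for illustration; how the article forms the differences and the normalization by \(\sigma\) are not reproduced here.

```python
import numpy as np

def scalar_summaries(diffs):
    """Return (Kuiper, Kolmogorov-Smirnov) for a cumulative-deviation curve.

    diffs: hypothetical per-observation differences between the
    subpopulation and the full population, already sorted by score.
    """
    diffs = np.asarray(diffs, dtype=float)
    n = len(diffs)
    # Cumulative deviation, with a leading 0 so the curve starts at the origin.
    c = np.concatenate(([0.0], np.cumsum(diffs) / n))
    kuiper = c.max() - c.min()            # total range of the curve
    kolmogorov_smirnov = np.abs(c).max()  # largest absolute excursion from 0
    return kuiper, kolmogorov_smirnov

# Toy input (not data from the article).
kuiper, ks = scalar_summaries([0.1, -0.3, -0.2, 0.05, 0.15])
print(kuiper, ks)
```

Because the curve starts at zero, the range is never smaller than the largest absolute excursion, which is consistent with the caption reporting a Kuiper value slightly above the Kolmogorov and Smirnov value.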
