Table 10 False positive percentages

From: Addressing big data variety using an automated approach for data characterization

| Classification | Standard RegEx | “Boosted” RegEx, confidence level ≤ 50% | “Boosted” RegEx, confidence level > 50% |
| --- | --- | --- | --- |
| Cards | 18,422 | 7,337 (39.83%) | 11,085 (60.17%) |
| Lists | 5,394,547 | 1,565,160 (29.01%) | 3,829,387 (70.99%) |
| Total | 5,412,969 | 1,572,497 | 3,840,472 |
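
The percentages and totals in this table can be recomputed from the per-class counts. The following is a minimal sketch of that arithmetic in Python. The dictionary layout and the names `counts`, `share`, and `column_total` are illustrative assumptions, not part of the source article. It assumes each percentage is the "Boosted" count divided by the Standard RegEx count for the same class.

```python
# Counts copied from Table 10. The keys and structure are hypothetical,
# chosen only for this illustration.
counts = {
    "Cards": {"standard": 18_422, "le_50": 7_337, "gt_50": 11_085},
    "Lists": {"standard": 5_394_547, "le_50": 1_565_160, "gt_50": 3_829_387},
}


def share(part, whole):
    """Return part as a percentage of whole, rounded to two decimals."""
    return round(100 * part / whole, 2)


def column_total(column):
    """Sum one column over all classification rows."""
    return sum(row[column] for row in counts.values())


for name, row in counts.items():
    # The two confidence bands partition the Standard RegEx count.
    assert row["le_50"] + row["gt_50"] == row["standard"]
    print(name, share(row["le_50"], row["standard"]),
          share(row["gt_50"], row["standard"]))

for column in ("standard", "le_50", "gt_50"):
    print("Total", column, column_total(column))
```

Under these assumptions, the two confidence bands in each row add up to the Standard RegEx count, and each Total entry is the sum of the Cards and Lists rows in its column.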