Table 14 External metric: purity (commonly known as “accuracy”), K-means performance

From: A set theory based similarity measure for text clustering and classification

| Similarity measure/metric | K = 4: Reuters (18308 features) | K = 4: Web-KB (33025 features) | K = 8: Reuters (18308 features) | K = 8: Web-KB (33025 features) |
|---|---|---|---|---|
| Euclidean | 0.6745546742946301 | 0.4420100023815194 | 0.6651930828240801 | 0.5363181709930936 |
| Cosine | 0.6300871148095176 | 0.5651345558466302 | 0.5161877519178261 | 0.5710883543700881 |
| Jaccard | 0.5418021063580809 | 0.4060490592998333 | 0.5631257313743336 | 0.42462491069302216 |
| Bhattacharya | 0.6602522428812898 | 0.550845439390331 | 0.6573917565986218 | 0.3908073350797809 |
| Kullback–Leibler | 0.5103367572487323 | 0.39104548702071923 | 0.5103367572487323 | 0.39175994284353416 |
| Manhattan | 0.528799895982317 | 0.3912836389616575 | 0.5342608243401378 | 0.40128602048106693 |
| PDSM | 0.6628526849564426 | 0.4165277447011193 | 0.6329476010921856 | 0.40533460347701833 |
| STB-SM | 0.626706540111819 | 0.6110978804477256 | 0.6059030035105968 | 0.571802810192903 |

  1. Italic values indicate the highest values that top measures achieved for corresponding evaluation metrics
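
For readers unfamiliar with the metric named in the caption, purity is commonly computed by giving each cluster its majority class and taking the fraction of documents that fall in their cluster's majority class. The sketch below assumes that standard definition; the table itself does not state the exact formula used, and the function name `purity` and the toy labels are hypothetical, not taken from the source.

```python
from collections import Counter, defaultdict


def purity(true_labels, cluster_ids):
    """Fraction of items that belong to the majority class of their cluster."""
    if len(true_labels) != len(cluster_ids):
        raise ValueError("label and cluster sequences must have equal length")
    if not true_labels:
        raise ValueError("at least one item is required")

    # Group the true class labels by the cluster each item was assigned to.
    members = defaultdict(list)
    for label, cluster in zip(true_labels, cluster_ids):
        members[cluster].append(label)

    # Each cluster contributes the size of its largest class.
    majority_total = sum(
        Counter(labels).most_common(1)[0][1] for labels in members.values()
    )
    return majority_total / len(true_labels)


# Toy data (hypothetical): six documents, two classes, two clusters.
truth = ["earn", "earn", "earn", "acq", "acq", "acq"]
clusters = [0, 0, 1, 1, 1, 1]
score = purity(truth, clusters)
```

In this toy case, cluster 0 holds two "earn" documents and cluster 1 holds one "earn" and three "acq" documents, so five of the six documents sit in their cluster's majority class.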