Table 17 Internal metric: Calinski-Harabasz index, K-means performance

From: A set theory based similarity measure for text clustering and classification

| Similarity measure/metric | K = 4, Reuters (18308 features) | K = 4, Web-KB (33025 features) | K = 8, Reuters (18308 features) | K = 8, Web-KB (33025 features) |
|---|---|---|---|---|
| Euclidean | 282.616242361247 | 733.8545918925699 | 175.69520099100788 | 376.55783811163883 |
| Cosine | 158.059649581219 | 41.1205304663345 | 36.67050378019272 | 22.764647326363253 |
| Jaccard | 10.76166549332125 | 5.452237604387431 | 12.590273044103254 | 4.818955513623545 |
| Bhattacharya | 142.1019959257336 | 18.169867898545604 | 125.70926440159229 | 7.864068171394914 |
| Kullback–Leibler | 0.6047659734858815 | 0.16973494233360648 | 0.3996948922364028 | 0.21630702269179736 |
| Manhattan | 222.98925272777933 | 687.1305760147841 | 96.46528179432781 | 338.1631503564861 |
| PDSM | 132.7979522566384 | 6.947310633285274 | 87.3385616551679 | 7.265095316386779 |
| STB-SM | 127.69695325126817 | 37.83093841884232 | 92.69694747153109 | 36.67724243309815 |

  1. Italic values indicate the highest values that top measures achieved for corresponding evaluation metrics
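
For readers who want to see how a Calinski-Harabasz index of this kind is obtained, the following is a minimal sketch, not the authors' pipeline. It assumes scikit-learn and NumPy are available, uses synthetic blob data as a stand-in for the Reuters and Web-KB feature matrices, and relies on scikit-learn's `KMeans`, which clusters with Euclidean distance only; the other similarity measures in the table would need a custom clustering routine. The helper names `ch_index_manual` and `cluster_and_score` are hypothetical. The index is the ratio of between-cluster dispersion (divided by K - 1) to within-cluster dispersion (divided by n - K), so higher values indicate more compact, better separated clusters.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import calinski_harabasz_score


def ch_index_manual(features, labels):
    """Calinski-Harabasz index computed directly from its definition."""
    n = features.shape[0]
    cluster_ids = np.unique(labels)
    k = cluster_ids.size
    overall_mean = features.mean(axis=0)
    between = 0.0  # between-cluster dispersion
    within = 0.0   # within-cluster dispersion
    for c in cluster_ids:
        members = features[labels == c]
        centroid = members.mean(axis=0)
        between += members.shape[0] * np.sum((centroid - overall_mean) ** 2)
        within += np.sum((members - centroid) ** 2)
    return (between / (k - 1)) / (within / (n - k))


def cluster_and_score(features, k, seed=0):
    """Run Euclidean K-means and return (labels, Calinski-Harabasz index)."""
    model = KMeans(n_clusters=k, n_init=10, random_state=seed)
    labels = model.fit_predict(features)
    return labels, calinski_harabasz_score(features, labels)


# Synthetic stand-in data (assumption: NOT the Reuters or Web-KB corpora).
X, _ = make_blobs(n_samples=400, n_features=20, centers=4, random_state=0)

results = {k: cluster_and_score(X, k) for k in (4, 8)}
for k, (labels, score) in results.items():
    print(f"K = {k}: CH index = {score:.3f} "
          f"(manual check: {ch_index_manual(X, labels):.3f})")
```

Because the index depends on the feature space and on the number of clusters, scores are only comparable within a single column of the table, that is, for the same dataset and the same K.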