Journal of Big Data

Table 16 External metric (Rand index): K-means performance

From: A set theory based similarity measure for text clustering and classification

Similarity measure	K = 4, Reuters (18,308 features)	K = 4, Web-KB (33,025 features)	K = 8, Reuters (18,308 features)	K = 8, Web-KB (33,025 features)
Euclidean	0.18579386266996	0.07259267580718086	0.134417241084232	0.13997725114650594
Cosine	0.13679798891326	0.19584148736739818	0.038271439259564	0.1388214135942175
Jaccard	0.03156196525375	0.03337323105349306	0.03325438109622	0.02868306344839796
Bhattacharyya	0.10437193466823	0.21881970871431122	0.037753839600227	-0.00837183862883132
Kullback–Leibler	-6.969300641e-05	-0.0003339640498897	-0.00089845850765	-0.00025681169419640
Manhattan	-0.0456795771624	0.05064565361647291	-0.02973126512374	0.003861173313610305
PDSM	0.10879494053512	0.00359089370980009	0.116230029219350	-0.00999313066207844
STB-SM	0.17377955245469	0.23459410623220853	0.108893811930251	0.17012239790902425

Italic values indicate the highest values that top measures achieved for corresponding evaluation metrics
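
For readers who want to reproduce this kind of score, the following is a minimal pair-counting sketch, not the authors' code. The plain Rand index always lies in [0, 1], so the negative entries in the table would be consistent with a chance-corrected variant such as the adjusted Rand index; that reading is an assumption, since the table itself does not say which variant was used. The function names and the toy labels are hypothetical.

```python
from collections import Counter
from math import comb


def pair_counts(labels_true, labels_pred):
    """Count sample pairs grouped together in both, in truth, in prediction, and in total."""
    contingency = Counter(zip(labels_true, labels_pred))
    same_both = sum(comb(c, 2) for c in contingency.values())
    same_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    same_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    total = comb(len(labels_true), 2)
    return same_both, same_true, same_pred, total


def rand_index(labels_true, labels_pred):
    """Plain Rand index: fraction of pairs on which the two partitions agree."""
    same_both, same_true, same_pred, total = pair_counts(labels_true, labels_pred)
    # Agreements = pairs together in both + pairs separated in both.
    agreements = total + 2 * same_both - same_true - same_pred
    return agreements / total


def adjusted_rand_index(labels_true, labels_pred):
    """Chance-corrected Rand index; can be negative for worse-than-chance agreement."""
    same_both, same_true, same_pred, total = pair_counts(labels_true, labels_pred)
    expected = same_true * same_pred / total
    max_index = (same_true + same_pred) / 2
    if max_index == expected:
        # Degenerate case (e.g. both partitions are a single cluster).
        return 1.0
    return (same_both - expected) / (max_index - expected)


if __name__ == "__main__":
    # Hypothetical toy labels: the predicted clusters cut across the true classes.
    truth = [0, 0, 1, 1]
    predicted = [0, 1, 0, 1]
    print(rand_index(truth, predicted))
    print(adjusted_rand_index(truth, predicted))
```

On the toy labels the plain index stays positive while the adjusted index drops below zero, which is the behaviour that makes negative table entries possible under the chance-corrected definition.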
