From: A set theory based similarity measure for text clustering and classification
Similarity measure/metric | K = 4: Reuters–18308 features | K = 4: Web-KB-33025 features | K = 8: Reuters–18308 features | K = 8: Web-KB-33025 features |
---|---|---|---|---|
Euclidean | 282.616242361247 | 733.8545918925699 | 175.69520099100788 | 376.55783811163883 |
Cosine | 158.059649581219 | 41.1205304663345 | 36.67050378019272 | 22.764647326363253 |
Jaccard | 10.76166549332125 | 5.452237604387431 | 12.590273044103254 | 4.818955513623545 |
Bhattacharyya | 142.1019959257336 | 18.169867898545604 | 125.70926440159229 | 7.864068171394914 |
Kullback–Leibler | 0.6047659734858815 | 0.16973494233360648 | 0.3996948922364028 | 0.21630702269179736 |
Manhattan | 222.98925272777933 | 687.1305760147841 | 96.46528179432781 | 338.1631503564861 |
PDSM | 132.7979522566384 | 6.947310633285274 | 87.3385616551679 | 7.265095316386779 |
STB-SM | 127.69695325126817 | 37.83093841884232 | 92.69694747153109 | 36.67724243309815 |