From: A set theory based similarity measure for text clustering and classification
Similarity measure/Metric | K = 4, Reuters–18308 features | K = 4, Web-KB-33025 features | K = 8, Reuters–18308 features | K = 8, Web-KB-33025 features |
---|---|---|---|---|
Euclidean | 0.18579386266996 | 0.07259267580718086 | 0.134417241084232 | 0.13997725114650594 |
Cosine | 0.13679798891326 | 0.19584148736739818 | 0.038271439259564 | 0.1388214135942175 |
Jaccard | 0.03156196525375 | 0.03337323105349306 | 0.03325438109622 | 0.02868306344839796 |
Bhattacharyya | 0.10437193466823 | 0.21881970871431122 | 0.037753839600227 | −0.00837183862883132 |
Kullback–Leibler | −6.969300641e-05 | −0.0003339640498897 | −0.00089845850765 | −0.00025681169419640 |
Manhattan | −0.0456795771624 | 0.05064565361647291 | −0.02973126512374 | 0.003861173313610305 |
PDSM | 0.10879494053512 | 0.00359089370980009 | 0.116230029219350 | −0.00999313066207844 |
STB-SM | 0.17377955245469 | 0.23459410623220853 | 0.108893811930251 | 0.17012239790902425 |