From: A set theory based similarity measure for text clustering and classification
Similarity measure/Metric | K = 4, Reuters (18308 features) | K = 4, Web-KB (33025 features) | K = 8, Reuters (18308 features) | K = 8, Web-KB (33025 features) |
---|---|---|---|---|
Euclidean | 0.199511647995826 | 0.0673834489462676 | 0.179814979894203 | 0.15452274170402275 |
Cosine | 0.176068659196536 | 0.2225985559938000 | 0.049343505745102 | 0.1665642553582776 |
Jaccard | 0.055007933247357 | 0.0318290213093328 | 0.050929724394223 | 0.03093748911398607 |
Bhattacharyya | 0.161670093558948 | 0.3071187147856372 | 0.212872136220606 | 0.04952824171446496 |
Kullback–Leibler | 0.138798782038987 | 0.0661952142325757 | 0.08238615904440288 | 0.09274073186745294 |
Manhattan | 0.140585609897670 | 0.0776936270894379 | 0.162146390404297 | 0.2204324456530784 |
PDSM | 0.224510552205581 | 0.0377046283518796 | 0.199091523319667 | 0.07172116609953971 |
STB-SM | 0.267581280659119 | 0.2935685999112778 | 0.220934861375524 | 0.17402803377334777 |