Table 12. Performance of the similarity measures on each data set, averaged across all clustering learners

From: Improved sqrt-cosine similarity measurement

Metric    Data set        ISC               Cosine            Gaussian
                          Mean     HSD      Mean     HSD      Mean     HSD
Accuracy  WEBKB           0.4798   A        0.4434   A        0.3824   A
          R8              0.4384   A        0.4291   A        0.4472   A
          R52             0.2320   A        0.2283   A        0.2395   A
          NEWS            0.1659   A        0.1544   A        0.1179   A
          DBLP            0.3886   A        0.3574   A        0.2640   A
          CSTR            0.4332   A        0.4095   A        0.3182   A
          Average / #A's  0.3563   6        0.3370   6        0.2948   6
Purity    WEBKB           0.6248   A        0.6091   A        0.5548   A
          R8              0.5769   A        0.5790   A        0.6446   A
          R52             0.4440   A        0.4225   A        0.4478   A
          NEWS            0.4234   A        0.4410   A        0.3948   A
          DBLP            0.7980   A        0.6363   A        0.6531   A
          CSTR            0.7026   A        0.6704   A        0.6700   A
          Average / #A's  0.59495  6        0.55971  6        0.56085  6
NMI       WEBKB           0.1500   A        0.1177   A        0.0879   A
          R8              0.1978   A        0.1912   A        0.2030   A
          R52             0.1376   A        0.1321   A        0.1179   A
          NEWS            0.0855   A        0.0761   A        0.0731   A
          DBLP            0.2439   A        0.1940   A        0.0948   A
          CSTR            0.1396   A        0.1069   A        0.0172   A
          Average / #A's  0.1590   6        0.1363   6        0.0990   6
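
The "Average" and "#A's" entries can be recomputed from the per-data-set rows. The following is a minimal sketch, not the authors' code: it assumes the average is the unweighted mean of the six data-set means and that "#A's" counts how many data sets fall in HSD group A. The names `accuracy` and `summarize` are hypothetical, and only the Accuracy block of the table is used.

```python
# Accuracy rows copied from Table 12:
# data set -> similarity measure -> (mean, HSD group).
accuracy = {
    "WEBKB": {"ISC": (0.4798, "A"), "Cosine": (0.4434, "A"), "Gaussian": (0.3824, "A")},
    "R8":    {"ISC": (0.4384, "A"), "Cosine": (0.4291, "A"), "Gaussian": (0.4472, "A")},
    "R52":   {"ISC": (0.2320, "A"), "Cosine": (0.2283, "A"), "Gaussian": (0.2395, "A")},
    "NEWS":  {"ISC": (0.1659, "A"), "Cosine": (0.1544, "A"), "Gaussian": (0.1179, "A")},
    "DBLP":  {"ISC": (0.3886, "A"), "Cosine": (0.3574, "A"), "Gaussian": (0.2640, "A")},
    "CSTR":  {"ISC": (0.4332, "A"), "Cosine": (0.4095, "A"), "Gaussian": (0.3182, "A")},
}


def summarize(rows):
    """Return {measure: (unweighted mean of the means, number of 'A' groups)}.

    Assumption: the table's summary row is a plain average over data sets.
    """
    measures = list(next(iter(rows.values())).keys())
    summary = {}
    for measure in measures:
        means = [rows[data_set][measure][0] for data_set in rows]
        groups = [rows[data_set][measure][1] for data_set in rows]
        summary[measure] = (sum(means) / len(means), groups.count("A"))
    return summary


summary = summarize(accuracy)
for measure, (average, n_a) in summary.items():
    print(f"{measure}: average={average:.4f}, #A's={n_a}")
```

Under these assumptions the recomputed ISC and Cosine averages match the table to four decimals. The Gaussian average comes out at about 0.29487, which rounds to 0.2949 rather than the printed 0.2948. That gap is consistent with the published value having been truncated or computed from unrounded means.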