Table 12 Performance of the similarity measures in data sets averaged across all clustering learners

From: Improved sqrt-cosine similarity measurement

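| Metric | Data set | ISC Mean | ISC HSD | Cosine Mean | Cosine HSD | Gaussian Mean | Gaussian HSD |
|---|---|---|---|---|---|---|---|
| Accuracy | WEBKB | 0.4798 | A | 0.4434 | A | 0.3824 | A |
| | R8 | 0.4384 | A | 0.4291 | A | 0.4472 | A |
| | R52 | 0.2320 | A | 0.2283 | A | 0.2395 | A |
| | NEWS | 0.1659 | A | 0.1544 | A | 0.1179 | A |
| | DBLP | 0.3886 | A | 0.3574 | A | 0.2640 | A |
| | CSTR | 0.4332 | A | 0.4095 | A | 0.3182 | A |
| Average / #A’s | | 0.3563 | 6 | 0.3370 | 6 | 0.2948 | 6 |
| Purity | WEBKB | 0.6248 | A | 0.6091 | A | 0.5548 | A |
| | R8 | 0.5769 | A | 0.5790 | A | 0.6446 | A |
| | R52 | 0.4440 | A | 0.4225 | A | 0.4478 | A |
| | NEWS | 0.4234 | A | 0.4410 | A | 0.3948 | A |
| | DBLP | 0.7980 | A | 0.6363 | A | 0.6531 | A |
| | CSTR | 0.7026 | A | 0.6704 | A | 0.6700 | A |
| Average / #A’s | | 0.59495 | 6 | 0.55971 | 6 | 0.56085 | 6 |
| NMI | WEBKB | 0.1500 | A | 0.1177 | A | 0.0879 | A |
| | R8 | 0.1978 | A | 0.1912 | A | 0.2030 | A |
| | R52 | 0.1376 | A | 0.1321 | A | 0.1179 | A |
| | NEWS | 0.0855 | A | 0.0761 | A | 0.0731 | A |
| | DBLP | 0.2439 | A | 0.1940 | A | 0.0948 | A |
| | CSTR | 0.1396 | A | 0.1069 | A | 0.0172 | A |
| Average / #A’s | | 0.1590 | 6 | 0.1363 | 6 | 0.0990 | 6 |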
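
The summary rows of the table above can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the article. It assumes that each summary row holds the plain mean of the six per-data-set values in the Mean column and the number of data sets labelled "A" in the HSD column. The names `accuracy`, `hsd`, and `summarize` are hypothetical, and only the Accuracy block is shown.

```python
# Accuracy means per data set, copied from Table 12
# (row order: WEBKB, R8, R52, NEWS, DBLP, CSTR).
accuracy = {
    "ISC": [0.4798, 0.4384, 0.2320, 0.1659, 0.3886, 0.4332],
    "Cosine": [0.4434, 0.4291, 0.2283, 0.1544, 0.3574, 0.4095],
    "Gaussian": [0.3824, 0.4472, 0.2395, 0.1179, 0.2640, 0.3182],
}

# HSD group letters for the same rows; every Accuracy entry in the table is "A".
hsd = {name: ["A"] * 6 for name in accuracy}


def summarize(values, groups):
    """Return (plain average of values, number of rows in HSD group 'A')."""
    return sum(values) / len(values), groups.count("A")


# Assumption: the summary row is an unweighted mean over the six data sets.
summary = {name: summarize(accuracy[name], hsd[name]) for name in accuracy}

for name, (avg, n_a) in summary.items():
    print(f"{name}: average={avg:.4f}, #A={n_a}")
```

Under that assumption the recomputed averages agree with the reported ones to within 0.0001. A difference in the last digit may simply reflect how the published values were rounded.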