Table 5 Performance evaluation of all measures when NF = 10: the averaged results (K = 1–120; + 2)

From: A set theory based similarity measure for text clustering and classification

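| No | Similarity | Reuters-8 ACC | Reuters-8 PRE | Reuters-8 REC | Reuters-8 FM | Reuters-8 GM | Reuters-8 AMP | Web-KB ACC | Web-KB PRE | Web-KB REC | Web-KB FM | Web-KB GM | Web-KB AMP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Euclidean | 0.713 | 0.317 | 0.293 | 0.286 | 0.527 | 0.217 | 0.605 | 0.607 | 0.515 | 0.524 | 0.661 | 0.429 |
| 2 | Cosine | 0.694 | 0.328 | 0.311 | 0.281 | 0.542 | 0.218 | 0.621 | 0.610 | 0.548 | 0.562 | 0.687 | 0.451 |
| 3 | Jaccard | 0.689 | 0.299 | 0.258 | 0.251 | 0.492 | 0.202 | 0.544 | 0.617 | 0.438 | 0.433 | 0.560 | 0.371 |
| 4 | Bhattacharya | 0.654 | 0.173 | 0.204 | 0.180 | 0.435 | 0.174 | 0.458 | 0.545 | 0.435 | 0.381 | 0.595 | 0.373 |
| 5 | Kullback–Leibler | 0.689 | 0.383 | 0.329 | 0.292 | 0.557 | 0.228 | 0.613 | 0.625 | 0.525 | 0.526 | 0.670 | 0.436 |
| 6 | Manhattan | 0.648 | 0.327 | 0.284 | 0.273 | 0.516 | 0.205 | 0.605 | 0.623 | 0.515 | 0.524 | 0.661 | 0.432 |
| 7 | PDSM | 0.651 | 0.339 | 0.301 | 0.267 | 0.533 | 0.216 | 0.626 | 0.655 | 0.533 | 0.539 | 0.676 | 0.448 |
| 8 | STB-SM | 0.699 | 0.334 | 0.333 | 0.303 | 0.562 | 0.234 | 0.609 | 0.590 | 0.539 | 0.544 | 0.679 | 0.436 |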

  1. Italic values indicate the highest values that top measures achieved for corresponding evaluation metrics
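
The italic marking mentioned in the note is not visible in this plain-text rendering. As a rough aid, the following Python sketch finds the column-wise maximum for each metric from the values transcribed from Table 5. It assumes the note's "highest values" means the per-column maximum within each dataset. The names `TABLE5`, `METRICS`, and `best_per_metric` are illustrative and do not come from the article.

```python
METRICS = ["ACC", "PRE", "REC", "FM", "GM", "AMP"]

# Values transcribed from Table 5 (NF = 10).
# Each entry maps a measure to (Reuters-8 row, Web-KB row).
TABLE5 = {
    "Euclidean": ([0.713, 0.317, 0.293, 0.286, 0.527, 0.217],
                  [0.605, 0.607, 0.515, 0.524, 0.661, 0.429]),
    "Cosine": ([0.694, 0.328, 0.311, 0.281, 0.542, 0.218],
               [0.621, 0.610, 0.548, 0.562, 0.687, 0.451]),
    "Jaccard": ([0.689, 0.299, 0.258, 0.251, 0.492, 0.202],
                [0.544, 0.617, 0.438, 0.433, 0.560, 0.371]),
    "Bhattacharya": ([0.654, 0.173, 0.204, 0.180, 0.435, 0.174],
                     [0.458, 0.545, 0.435, 0.381, 0.595, 0.373]),
    "Kullback-Leibler": ([0.689, 0.383, 0.329, 0.292, 0.557, 0.228],
                         [0.613, 0.625, 0.525, 0.526, 0.670, 0.436]),
    "Manhattan": ([0.648, 0.327, 0.284, 0.273, 0.516, 0.205],
                  [0.605, 0.623, 0.515, 0.524, 0.661, 0.432]),
    "PDSM": ([0.651, 0.339, 0.301, 0.267, 0.533, 0.216],
             [0.626, 0.655, 0.533, 0.539, 0.676, 0.448]),
    "STB-SM": ([0.699, 0.334, 0.333, 0.303, 0.562, 0.234],
               [0.609, 0.590, 0.539, 0.544, 0.679, 0.436]),
}


def best_per_metric(dataset_index):
    """Return {metric: (measure, value)} holding each column's maximum.

    dataset_index is 0 for Reuters-8 and 1 for Web-KB.
    Assumption: "highest value" is read as the plain column maximum.
    """
    best = {}
    for col, metric in enumerate(METRICS):
        name = max(TABLE5, key=lambda m: TABLE5[m][dataset_index][col])
        best[metric] = (name, TABLE5[name][dataset_index][col])
    return best


reuters_best = best_per_metric(0)
webkb_best = best_per_metric(1)

for label, best in (("Reuters-8", reuters_best), ("Web-KB", webkb_best)):
    for metric in METRICS:
        name, value = best[metric]
        print(f"{label:10s} {metric}: {name} ({value:.3f})")
```

Under this reading, the top entry in each column can be checked against the table directly. Whether it matches the italic marking in the published table cannot be confirmed from this text alone.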