Table 9 Comparison of ClusTop algorithm against various baselines, in terms of Precision (Pre), Recall (Rec) and F-score (FS) for the top 20 keywords/unigrams of each topic

From: A clustering-based topic model using word networks and word embeddings

Top 20 Keywords/Unigrams

| Algorithm | Dataset A Precision | Dataset A Recall | Dataset A F-score | Dataset B Precision | Dataset B Recall | Dataset B F-score | Dataset C Precision | Dataset C Recall | Dataset C F-score | Average Rank@20 |
|---|---|---|---|---|---|---|---|---|---|---|
| ClusTop-Word-NA | .617 ± .002 (24) | .034 ± .000 (6) | .064 ± .000 (6) | .709 ± .003 (23) | .055 ± .000 (2) | .100 ± .001 (3) | .669 ± .009 (22) | .038 ± .001 (4) | .070 ± .002 (4) | (10.4) |
| ClusTop-BiG-NA | .646 ± .002 (21) | .035 ± .000 (3) | .065 ± .000 (3) | .722 ± .004 (22) | .056 ± .001 (1) | .102 ± .001 (1) | .679 ± .010 (20) | .040 ± .001 (2) | .073 ± .002 (2) | (8.3) |
| ClusTop-TriG-NA | .636 ± .002 (22) | .035 ± .000 (3) | .065 ± .000 (3) | .726 ± .004 (20) | .055 ± .001 (2) | .101 ± .001 (2) | .662 ± .009 (24) | .041 ± .001 (1) | .075 ± .002 (1) | (8.7) |
| ClusTop-BiHa-NA | .635 ± .002 (23) | .035 ± .000 (3) | .065 ± .000 (3) | .702 ± .004 (24) | .054 ± .000 (4) | .099 ± .001 (4) | .679 ± .009 (20) | .040 ± .001 (2) | .073 ± .002 (2) | (9.4) |
| ClusTop-Hash-NA | .831 ± .004 (9) | .024 ± .000 (23) | .045 ± .000 (18) | .873 ± .008 (9) | .031 ± .001 (17) | .060 ± .001 (17) | .911 ± .012 (8) | .023 ± .001 (17) | .044 ± .001 (17) | (15.0) |
| ClusTop-Noun-NA | .666 ± .002 (17) | .033 ± .000 (9) | .061 ± .000 (11) | .748 ± .003 (18) | .051 ± .000 (5) | .094 ± .001 (5) | .719 ± .010 (15) | .036 ± .001 (6) | .067 ± .001 (6) | (10.2) |
| ClusTop-H2VG-NA | .891 ± .004 (2) | .024 ± .000 (23) | .046 ± .000 (17) | .952 ± .005 (1) | .031 ± .001 (17) | .060 ± .001 (17) | .962 ± .008 (2) | .023 ± .001 (17) | .044 ± .001 (17) | (12.6) |
| ClusTop-H2VW-NA | .869 ± .001 (5) | .027 ± .001 (20) | .027 ± .001 (22) | .927 ± .001 (4) | .030 ± .001 (20) | .030 ± .001 (20) | .983 ± .001 (1) | .020 ± .001 (23) | .020 ± .001 (23) | (15.3) |
| ClusTop-H2VF-NA | .856 ± .001 (7) | .028 ± .001 (16) | .028 ± .001 (19) | .894 ± .001 (7) | .030 ± .001 (20) | .030 ± .001 (20) | .942 ± .001 (5) | .021 ± .001 (20) | .021 ± .001 (20) | (14.9) |
| ClusTop-Word-AH | .660 ± .002 (19) | .029 ± .000 (14) | .055 ± .000 (14) | .724 ± .004 (21) | .044 ± .001 (12) | .080 ± .001 (13) | .703 ± .010 (17) | .033 ± .001 (7) | .061 ± .001 (8) | (13.9) |
| ClusTop-Hash-AH | .757 ± .002 (10) | .033 ± .000 (9) | .063 ± .000 (8) | .801 ± .004 (13) | .044 ± .000 (12) | .082 ± .001 (11) | .836 ± .009 (12) | .026 ± .001 (15) | .050 ± .001 (15) | (11.7) |
| ClusTop-Noun-AH | .700 ± .002 (14) | .029 ± .000 (14) | .054 ± .000 (15) | .774 ± .004 (16) | .042 ± .000 (15) | .079 ± .001 (15) | .743 ± .010 (14) | .033 ± .001 (7) | .062 ± .001 (7) | (13.0) |
| ClusTop-H2VG-AH | .743 ± .002 (12) | .030 ± .000 (13) | .058 ± .000 (13) | .809 ± .003 (11) | .043 ± .000 (14) | .080 ± .001 (13) | .858 ± .009 (11) | .028 ± .001 (10) | .053 ± .001 (11) | (12.0) |
| ClusTop-H2VW-AH | .894 ± .001 (1) | .027 ± .001 (20) | .027 ± .001 (22) | .902 ± .001 (6) | .029 ± .001 (22) | .029 ± .001 (22) | .962 ± .001 (3) | .020 ± .001 (23) | .020 ± .001 (23) | (15.8) |
| ClusTop-H2VF-AH | .879 ± .001 (3) | .028 ± .001 (16) | .028 ± .001 (19) | .905 ± .001 (5) | .030 ± .001 (20) | .030 ± .001 (20) | .939 ± .001 (6) | .021 ± .001 (20) | .021 ± .001 (20) | (14.3) |
| ClusTop-Word-AM | .660 ± .001 (19) | .036 ± .000 (1) | .067 ± .000 (1) | .778 ± .004 (15) | .050 ± .000 (6) | .091 ± .001 (6) | .664 ± .012 (23) | .027 ± .001 (13) | .052 ± .001 (13) | (10.8) |
| ClusTop-Hash-AM | .691 ± .002 (15) | .033 ± .000 (9) | .062 ± .000 (9) | .796 ± .004 (14) | .047 ± .000 (9) | .087 ± .001 (9) | .745 ± .011 (13) | .030 ± .001 (9) | .058 ± .001 (9) | (10.7) |
| ClusTop-Noun-AM | .841 ± .001 (8) | .027 ± .000 (18) | .051 ± .000 (16) | .883 ± .003 (8) | .041 ± .000 (16) | .078 ± .001 (16) | .859 ± .009 (10) | .026 ± .000 (15) | .049 ± .001 (16) | (13.7) |
| ClusTop-H2VG-AM | .736 ± .002 (13) | .032 ± .000 (11) | .061 ± .000 (11) | .856 ± .003 (10) | .045 ± .000 (10) | .084 ± .001 (10) | .875 ± .008 (9) | .027 ± .000 (13) | .052 ± .001 (13) | (11.1) |
| ClusTop-H2VW-AM | .876 ± .001 (4) | .027 ± .001 (20) | .027 ± .001 (22) | .936 ± .001 (2) | .029 ± .001 (22) | .029 ± .001 (22) | .955 ± .001 (4) | .021 ± .001 (20) | .021 ± .001 (20) | (15.1) |
| ClusTop-H2VF-AM | .859 ± .001 (6) | .027 ± .001 (20) | .027 ± .001 (22) | .930 ± .001 (3) | .028 ± .001 (24) | .028 ± .001 (24) | .928 ± .001 (7) | .021 ± .001 (20) | .021 ± .001 (20) | (16.2) |
| LDA-Orig | .752 ± .002 (11) | .032 ± .000 (11) | .061 ± .000 (11) | .802 ± .003 (12) | .044 ± .000 (12) | .082 ± .001 (11) | .706 ± .010 (16) | .028 ± .001 (10) | .054 ± .001 (10) | (11.6) |
| LDA-Hash | .689 ± .002 (16) | .035 ± .000 (3) | .065 ± .000 (3) | .751 ± .003 (17) | .048 ± .000 (8) | .088 ± .001 (8) | .702 ± .010 (18) | .038 ± .001 (4) | .070 ± .001 (4) | (9.0) |
| LDA-Ment | .666 ± .002 (17) | .034 ± .000 (6) | .064 ± .000 (6) | .730 ± .004 (19) | .049 ± .000 (7) | .090 ± .001 (7) | .698 ± .012 (19) | .027 ± .001 (13) | .052 ± .001 (13) | (11.9) |

  1. The rank of each algorithm’s performance on each metric is given in brackets
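
The source does not state how the Average Rank@20 column is derived. A minimal sketch, assuming it is the arithmetic mean of the nine bracketed ranks in a row (three metrics on each of three datasets) and that the F-score is the usual harmonic mean of precision and recall. The function names `average_rank` and `f_score` are illustrative and do not come from the paper. Because the reported values are averages with an error term, recomputing an F-score from the averaged precision and recall is only approximate.

```python
from statistics import mean


def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1, assumed here)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def average_rank(ranks: list[int]) -> float:
    """Mean of the per-metric ranks, rounded to one decimal place."""
    return round(mean(ranks), 1)


# Bracketed ranks copied from the ClusTop-Word-NA and ClusTop-BiG-NA rows.
word_na_ranks = [24, 6, 6, 23, 2, 3, 22, 4, 4]
big_na_ranks = [21, 3, 3, 22, 1, 1, 20, 2, 2]

print(average_rank(word_na_ranks))        # 10.4, matching the table
print(average_rank(big_na_ranks))         # 8.3, matching the table

# Dataset A precision and recall for ClusTop-Word-NA.
print(round(f_score(0.617, 0.034), 3))    # 0.064
```

Under these assumptions the two recomputed averages agree with the values reported for those rows, which suggests, but does not confirm, that the column is a plain mean of ranks.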