
Table 6 Comparison of ClusTop algorithm against various baselines, in terms of Precision (Pre), Recall (Rec) and F-score (FS) for the top 5 keywords/unigrams of each topic

From: A clustering-based topic model using word networks and word embeddings

Top 5 Keywords/Unigrams

| Algorithm | Dataset A Precision | Dataset A Recall | Dataset A F-score | Dataset B Precision | Dataset B Recall | Dataset B F-score | Dataset C Precision | Dataset C Recall | Dataset C F-score | Average Rank@5 |
|---|---|---|---|---|---|---|---|---|---|---|
| ClusTop-Word-NA | .754 ± .002 (22) | .031 ± .000 (6) | .059 ± .000 (6) | .866 ± .003 (21) | .043 ± .000 (5) | .082 ± .001 (5) | .840 ± .009 (18) | .027 ± .001 (7) | .052 ± .001 (7) | (10.8) |
| ClusTop-BiG-NA | .786 ± .002 (18) | .034 ± .000 (2) | .064 ± .000 (2) | .857 ± .003 (22) | .046 ± .000 (2) | .086 ± .001 (2) | .833 ± .010 (20) | .029 ± .001 (4) | .056 ± .001 (3) | (8.3) |
| ClusTop-TriG-NA | .791 ± .002 (17) | .034 ± .000 (2) | .064 ± .000 (2) | .871 ± .003 (20) | .046 ± .000 (2) | .087 ± .001 (1) | .822 ± .009 (21) | .031 ± .001 (2) | .058 ± .001 (2) | (7.7) |
| ClusTop-BiHa-NA | .784 ± .002 (19) | .032 ± .000 (4) | .060 ± .000 (4) | .886 ± .003 (18) | .045 ± .000 (4) | .084 ± .001 (4) | .820 ± .009 (22) | .029 ± .001 (4) | .055 ± .001 (5) | (9.3) |
| ClusTop-Hash-NA | .898 ± .004 (10) | .023 ± .000 (21) | .044 ± .000 (18) | .916 ± .007 (13) | .032 ± .001 (17) | .062 ± .001 (17) | .936 ± .011 (8) | .022 ± .000 (17) | .042 ± .001 (18) | (15.4) |
| ClusTop-Noun-NA | .761 ± .002 (21) | .028 ± .000 (8) | .054 ± .000 (8) | .836 ± .003 (24) | .043 ± .000 (5) | .081 ± .001 (6) | .888 ± .008 (14) | .032 ± .001 (1) | .062 ± .001 (1) | (9.8) |
| ClusTop-H2VG-NA | .924 ± .003 (3) | .023 ± .000 (21) | .045 ± .000 (17) | .978 ± .004 (2) | .032 ± .001 (17) | .062 ± .001 (17) | .976 ± .007 (4) | .023 ± .001 (14) | .044 ± .001 (14) | (12.1) |
| ClusTop-H2VW-NA | .906 ± .001 (9) | .023 ± .001 (21) | .023 ± .001 (22) | .963 ± .001 (6) | .028 ± .001 (20) | .028 ± .001 (20) | .986 ± .001 (1) | .019 ± .001 (23) | .019 ± .001 (23) | (16.1) |
| ClusTop-H2VF-NA | .910 ± .001 (7) | .025 ± .001 (15) | .025 ± .001 (19) | .960 ± .001 (7) | .029 ± .001 (19) | .029 ± .001 (19) | .955 ± .001 (6) | .019 ± .001 (23) | .019 ± .001 (23) | (15.3) |
| ClusTop-Word-AH | .741 ± .003 (24) | .025 ± .000 (15) | .049 ± .000 (13) | .845 ± .005 (23) | .040 ± .000 (9) | .075 ± .001 (9) | .844 ± .010 (17) | .027 ± .001 (7) | .052 ± .001 (7) | (13.8) |
| ClusTop-Hash-AH | .847 ± .002 (12) | .026 ± .000 (12) | .051 ± .000 (12) | .912 ± .004 (15) | .046 ± .001 (2) | .086 ± .001 (2) | .896 ± .008 (12) | .024 ± .000 (12) | .046 ± .001 (13) | (10.2) |
| ClusTop-Noun-AH | .802 ± .002 (16) | .024 ± .000 (18) | .046 ± .000 (16) | .873 ± .004 (19) | .037 ± .000 (13) | .071 ± .001 (13) | .872 ± .008 (15) | .029 ± .001 (4) | .056 ± .001 (3) | (13.0) |
| ClusTop-H2VG-AH | .919 ± .002 (5) | .025 ± .000 (15) | .049 ± .000 (13) | .902 ± .003 (16) | .042 ± .000 (7) | .080 ± .001 (7) | .891 ± .008 (13) | .025 ± .001 (10) | .048 ± .001 (11) | (10.8) |
| ClusTop-H2VW-AH | .927 ± .001 (1) | .023 ± .001 (21) | .023 ± .001 (22) | .972 ± .001 (4) | .028 ± .001 (20) | .028 ± .001 (20) | .979 ± .001 (3) | .020 ± .001 (21) | .020 ± .001 (21) | (14.8) |
| ClusTop-H2VF-AH | .918 ± .001 (6) | .025 ± .001 (15) | .025 ± .001 (19) | .965 ± .001 (5) | .027 ± .001 (23) | .027 ± .001 (23) | .948 ± .001 (7) | .021 ± .001 (19) | .021 ± .001 (19) | (15.1) |
| ClusTop-Word-AM | .748 ± .001 (23) | .034 ± .000 (2) | .065 ± .000 (1) | .929 ± .003 (11) | .036 ± .000 (15) | .069 ± .001 (15) | .758 ± .016 (24) | .022 ± .000 (17) | .043 ± .001 (16) | (13.8) |
| ClusTop-Hash-AM | .763 ± .002 (20) | .027 ± .000 (10) | .052 ± .000 (10) | .917 ± .003 (12) | .037 ± .000 (13) | .072 ± .001 (12) | .869 ± .011 (16) | .024 ± .000 (12) | .047 ± .001 (12) | (13.0) |
| ClusTop-Noun-AM | .842 ± .002 (13) | .025 ± .000 (15) | .048 ± .000 (15) | .950 ± .003 (9) | .039 ± .000 (10) | .074 ± .001 (10) | .923 ± .009 (9) | .022 ± .000 (17) | .043 ± .001 (16) | (12.7) |
| ClusTop-H2VG-AM | .864 ± .002 (11) | .028 ± .000 (8) | .054 ± .000 (8) | .930 ± .003 (10) | .041 ± .000 (8) | .078 ± .001 (8) | .900 ± .008 (10) | .025 ± .000 (10) | .049 ± .001 (9) | (9.1) |
| ClusTop-H2VW-AM | .924 ± .001 (3) | .023 ± .001 (21) | .023 ± .001 (22) | .976 ± .001 (3) | .027 ± .001 (23) | .027 ± .001 (23) | .981 ± .001 (2) | .020 ± .001 (21) | .020 ± .001 (21) | (15.4) |
| ClusTop-H2VF-AM | .910 ± .001 (7) | .022 ± .001 (24) | .022 ± .001 (24) | .985 ± .001 (1) | .027 ± .001 (23) | .027 ± .001 (23) | .971 ± .001 (5) | .020 ± .001 (21) | .020 ± .001 (21) | (16.6) |
| LDA-Orig | .925 ± .001 (2) | .027 ± .000 (10) | .052 ± .000 (10) | .956 ± .002 (8) | .037 ± .000 (13) | .070 ± .001 (14) | .898 ± .010 (11) | .025 ± .000 (10) | .049 ± .001 (9) | (9.7) |
| LDA-Hash | .821 ± .002 (15) | .031 ± .000 (6) | .059 ± .000 (6) | .916 ± .003 (13) | .036 ± .000 (15) | .069 ± .001 (15) | .837 ± .010 (19) | .028 ± .001 (6) | .054 ± .001 (6) | (11.2) |
| LDA-Ment | .830 ± .002 (14) | .031 ± .000 (6) | .060 ± .000 (4) | .900 ± .003 (17) | .039 ± .000 (10) | .074 ± .001 (10) | .814 ± .018 (23) | .023 ± .000 (14) | .044 ± .001 (14) | (12.4) |

  1. The rank of each algorithm's performance on each metric is given in brackets
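
For readers who want to work with these values programmatically, here is a minimal Python sketch that parses a cell of the form "mean ± deviation (rank)" and averages the bracketed ranks across a row. The assumption that the Average Rank@5 column is the unweighted mean of the nine per-metric ranks is inferred from the reported numbers, not stated on this page. The helper names `parse_cell` and `average_rank` are hypothetical and chosen only for illustration.

```python
import re

# Matches cells such as ".754 ± .002 (22)": value, deviation, bracketed rank.
CELL_PATTERN = re.compile(r"^\s*(\d*\.\d+)\s*±\s*(\d*\.\d+)\s*\((\d+)\)\s*$")


def parse_cell(cell):
    """Split one table cell into (value, deviation, rank)."""
    match = CELL_PATTERN.match(cell)
    if match is None:
        raise ValueError(f"unrecognised cell format: {cell!r}")
    return float(match.group(1)), float(match.group(2)), int(match.group(3))


def average_rank(cells):
    """Unweighted mean of the bracketed ranks, rounded to one decimal place.

    Assumption: this is how the Average Rank@5 column is derived.
    """
    ranks = [parse_cell(cell)[2] for cell in cells]
    return round(sum(ranks) / len(ranks), 1)


# The nine metric cells of the ClusTop-Word-NA row, copied from the table.
word_na = [
    ".754 ± .002 (22)", ".031 ± .000 (6)", ".059 ± .000 (6)",
    ".866 ± .003 (21)", ".043 ± .000 (5)", ".082 ± .001 (5)",
    ".840 ± .009 (18)", ".027 ± .001 (7)", ".052 ± .001 (7)",
]

print(average_rank(word_na))  # 10.8, matching the reported value for this row
```

The nine ranks in that row sum to 97, and 97 / 9 rounds to 10.8, which agrees with the table and is the basis for the assumption above.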