
Table 8 Comparison of ClusTop algorithm against various baselines, in terms of Precision (Pre), Recall (Rec) and F-score (FS) for the top 15 keywords/unigrams of each topic

From: A clustering-based topic model using word networks and word embeddings

Top 15 Keywords/Unigrams

| Algorithm | Dataset A Precision | Dataset A Recall | Dataset A F-score | Dataset B Precision | Dataset B Recall | Dataset B F-score | Dataset C Precision | Dataset C Recall | Dataset C F-score | Average Rank@15 |
|---|---|---|---|---|---|---|---|---|---|---|
| ClusTop-Word-NA | .648 ± .002 (24) | .034 ± .000 (5) | .063 ± .000 (7) | .743 ± .004 (23) | .053 ± .000 (3) | .098 ± .001 (3) | .717 ± .010 (22) | .038 ± .001 (4) | .070 ± .002 (4) | (10.6) |
| ClusTop-BiG-NA | .671 ± .002 (21) | .035 ± .000 (2) | .065 ± .000 (2) | .747 ± .004 (21) | .055 ± .001 (1) | .100 ± .001 (1) | .719 ± .010 (21) | .040 ± .001 (2) | .073 ± .002 (2) | (8.1) |
| ClusTop-TriG-NA | .667 ± .002 (22) | .035 ± .000 (2) | .065 ± .000 (2) | .757 ± .004 (20) | .055 ± .000 (1) | .100 ± .001 (1) | .677 ± .010 (24) | .041 ± .001 (1) | .075 ± .002 (1) | (8.2) |
| ClusTop-BiHa-NA | .667 ± .002 (22) | .034 ± .000 (5) | .064 ± .000 (5) | .729 ± .004 (24) | .052 ± .000 (4) | .096 ± .001 (4) | .722 ± .009 (20) | .040 ± .001 (2) | .073 ± .002 (2) | (9.8) |
| ClusTop-Hash-NA | .848 ± .004 (9) | .023 ± .000 (24) | .045 ± .000 (18) | .880 ± .008 (9) | .031 ± .001 (17) | .060 ± .001 (17) | .914 ± .012 (8) | .023 ± .001 (17) | .044 ± .001 (17) | (15.1) |
| ClusTop-Noun-NA | .698 ± .002 (18) | .033 ± .000 (8) | .061 ± .000 (9) | .773 ± .003 (19) | .051 ± .000 (5) | .093 ± .001 (5) | .753 ± .010 (15) | .036 ± .001 (6) | .067 ± .001 (6) | (10.1) |
| ClusTop-H2VG-NA | .895 ± .004 (2) | .024 ± .000 (23) | .046 ± .000 (17) | .961 ± .005 (1) | .031 ± .001 (17) | .060 ± .001 (17) | .962 ± .008 (3) | .023 ± .001 (17) | .044 ± .001 (17) | (12.7) |
| ClusTop-H2VW-NA | .889 ± .001 (5) | .026 ± .001 (21) | .026 ± .001 (24) | .935 ± .001 (3) | .030 ± .001 (20) | .030 ± .001 (20) | .983 ± .001 (1) | .020 ± .001 (23) | .020 ± .001 (23) | (15.6) |
| ClusTop-H2VF-NA | .862 ± .001 (7) | .028 ± .001 (14) | .028 ± .001 (19) | .903 ± .001 (8) | .030 ± .001 (20) | .030 ± .001 (20) | .942 ± .001 (6) | .021 ± .001 (20) | .021 ± .001 (20) | (14.9) |
| ClusTop-Word-AH | .677 ± .002 (20) | .025 ± .000 (22) | .048 ± .000 (16) | .744 ± .004 (22) | .043 ± .000 (12) | .079 ± .001 (14) | .731 ± .010 (19) | .033 ± .001 (7) | .061 ± .001 (8) | (15.6) |
| ClusTop-Hash-AH | .802 ± .002 (10) | .030 ± .000 (12) | .058 ± .000 (12) | .831 ± .004 (13) | .044 ± .000 (10) | .083 ± .001 (10) | .847 ± .009 (12) | .026 ± .001 (15) | .050 ± .001 (15) | (12.1) |
| ClusTop-Noun-AH | .699 ± .002 (17) | .027 ± .000 (16) | .051 ± .000 (14) | .791 ± .004 (16) | .042 ± .000 (14) | .078 ± .001 (15) | .781 ± .010 (13) | .033 ± .001 (7) | .062 ± .001 (7) | (13.2) |
| ClusTop-H2VG-AH | .762 ± .002 (12) | .029 ± .000 (13) | .055 ± .000 (13) | .832 ± .003 (12) | .043 ± .000 (12) | .080 ± .001 (12) | .859 ± .009 (11) | .028 ± .001 (10) | .053 ± .001 (11) | (11.8) |
| ClusTop-H2VW-AH | .902 ± .001 (1) | .027 ± .001 (18) | .027 ± .001 (22) | .913 ± .001 (5) | .029 ± .001 (22) | .029 ± .001 (22) | .964 ± .001 (2) | .020 ± .001 (23) | .020 ± .001 (23) | (15.3) |
| ClusTop-H2VF-AH | .893 ± .001 (3) | .028 ± .001 (14) | .028 ± .001 (19) | .913 ± .001 (5) | .030 ± .001 (20) | .030 ± .001 (20) | .945 ± .001 (5) | .021 ± .001 (20) | .021 ± .001 (20) | (14.0) |
| ClusTop-Word-AM | .687 ± .001 (19) | .036 ± .000 (1) | .067 ± .000 (1) | .790 ± .004 (17) | .046 ± .000 (7) | .085 ± .001 (7) | .689 ± .012 (23) | .027 ± .001 (13) | .052 ± .001 (13) | (11.2) |
| ClusTop-Hash-AM | .713 ± .002 (15) | .033 ± .000 (8) | .063 ± .000 (7) | .816 ± .004 (14) | .045 ± .000 (9) | .083 ± .001 (10) | .770 ± .011 (14) | .030 ± .001 (9) | .058 ± .001 (9) | (10.6) |
| ClusTop-Noun-AM | .860 ± .001 (8) | .026 ± .000 (20) | .050 ± .000 (15) | .905 ± .003 (7) | .041 ± .000 (16) | .077 ± .001 (16) | .892 ± .008 (9) | .026 ± .000 (15) | .049 ± .001 (16) | (13.6) |
| ClusTop-H2VG-AM | .758 ± .002 (13) | .031 ± .000 (11) | .060 ± .000 (10) | .863 ± .003 (10) | .044 ± .000 (10) | .083 ± .001 (10) | .876 ± .008 (10) | .027 ± .000 (13) | .052 ± .001 (13) | (11.1) |
| ClusTop-H2VW-AM | .892 ± .001 (4) | .027 ± .001 (18) | .027 ± .001 (22) | .942 ± .001 (2) | .028 ± .001 (23) | .028 ± .001 (23) | .960 ± .001 (4) | .021 ± .001 (20) | .021 ± .001 (20) | (15.1) |
| ClusTop-H2VF-AM | .877 ± .001 (6) | .027 ± .001 (18) | .027 ± .001 (22) | .935 ± .001 (3) | .027 ± .001 (24) | .027 ± .001 (24) | .931 ± .001 (7) | .021 ± .001 (20) | .021 ± .001 (20) | (16.0) |
| LDA-Orig | .799 ± .002 (11) | .032 ± .000 (10) | .060 ± .000 (10) | .840 ± .003 (11) | .042 ± .000 (14) | .080 ± .001 (12) | .753 ± .010 (15) | .028 ± .001 (10) | .054 ± .001 (10) | (11.4) |
| LDA-Hash | .720 ± .002 (14) | .034 ± .000 (5) | .064 ± .000 (5) | .795 ± .003 (15) | .046 ± .000 (7) | .085 ± .001 (7) | .733 ± .010 (18) | .038 ± .001 (4) | .070 ± .001 (4) | (8.8) |
| LDA-Ment | .703 ± .002 (16) | .034 ± .000 (5) | .064 ± .000 (5) | .774 ± .004 (18) | .048 ± .000 (6) | .088 ± .001 (6) | .743 ± .012 (17) | .027 ± .001 (13) | .052 ± .001 (13) | (11.0) |

  1. The rank of an algorithm’s performance on each metric is given in brackets
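
The footnote does not say how the Average Rank@15 column is obtained. For the rows checked here, the reported values are consistent with the arithmetic mean of the nine bracketed per-metric ranks in the row (Precision, Recall and F-score on Datasets A, B and C), rounded to one decimal place. The Python sketch below illustrates that reading only. The helper name `average_rank` is hypothetical and not taken from the paper; the ranks and reported averages are transcribed from three rows of Table 8.

```python
# Per-metric ranks transcribed from Table 8, in the order
# A-Pre, A-Rec, A-FS, B-Pre, B-Rec, B-FS, C-Pre, C-Rec, C-FS,
# paired with the Average Rank@15 value reported in the same row.
ROWS = {
    "ClusTop-Word-NA": ([24, 5, 7, 23, 3, 3, 22, 4, 4], 10.6),
    "ClusTop-BiG-NA": ([21, 2, 2, 21, 1, 1, 21, 2, 2], 8.1),
    "LDA-Hash": ([14, 5, 5, 15, 7, 7, 18, 4, 4], 8.8),
}


def average_rank(per_metric_ranks):
    """Mean of the per-metric ranks, rounded to one decimal place.

    Assumption: this is how the Average column is derived; the source
    footnote does not state the formula.
    """
    return round(sum(per_metric_ranks) / len(per_metric_ranks), 1)


if __name__ == "__main__":
    for name, (ranks, reported) in ROWS.items():
        computed = average_rank(ranks)
        print(f"{name}: computed {computed}, reported {reported}")
```

A lower average rank means better performance across all nine metric and dataset combinations. Under this reading, ClusTop-BiG-NA (8.1) has the best average rank in the table.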