Table 7 Comparison of ClusTop algorithm against various baselines, in terms of Precision (Pre), Recall (Rec) and F-score (FS) for the top 10 keywords/unigrams of each topic

From: A clustering-based topic model using word networks and word embeddings

Top 10 Keywords/Unigrams

| Algorithm | Dataset A Precision | Dataset A Recall | Dataset A F-score | Dataset B Precision | Dataset B Recall | Dataset B F-score | Dataset C Precision | Dataset C Recall | Dataset C F-score | Average Rank@10 |
|---|---|---|---|---|---|---|---|---|---|---|
| ClusTop-Word-NA | .690 ± .002 (24) | .033 ± .000 (7) | .062 ± .000 (7) | .804 ± .003 (20) | .051 ± .000 (3) | .095 ± .001 (3) | .764 ± .010 (22) | .033 ± .001 (6) | .062 ± .001 (6) | (10.9) |
| ClusTop-BiG-NA | .707 ± .002 (22) | .035 ± .000 (2) | .065 ± .000 (2) | .789 ± .004 (22) | .052 ± .000 (2) | .096 ± .001 (2) | .765 ± .010 (20) | .035 ± .001 (3) | .067 ± .002 (2) | (8.6) |
| ClusTop-TriG-NA | .717 ± .002 (21) | .034 ± .000 (4) | .064 ± .000 (4) | .811 ± .003 (19) | .053 ± .000 (1) | .098 ± .001 (1) | .746 ± .010 (23) | .036 ± .001 (1) | .068 ± .001 (1) | (8.3) |
| ClusTop-BiHa-NA | .719 ± .002 (20) | .034 ± .000 (4) | .064 ± .000 (4) | .782 ± .004 (24) | .051 ± .000 (3) | .095 ± .001 (3) | .765 ± .010 (20) | .035 ± .001 (3) | .065 ± .001 (3) | (9.3) |
| ClusTop-Hash-NA | .860 ± .004 (8) | .023 ± .000 (24) | .045 ± .000 (17) | .896 ± .008 (9) | .032 ± .001 (17) | .061 ± .001 (17) | .925 ± .012 (8) | .022 ± .000 (18) | .043 ± .001 (18) | (15.1) |
| ClusTop-Noun-NA | .736 ± .002 (16) | .033 ± .000 (7) | .062 ± .000 (7) | .803 ± .003 (21) | .049 ± .000 (5) | .091 ± .001 (5) | .802 ± .009 (16) | .035 ± .001 (3) | .065 ± .001 (3) | (9.2) |
| ClusTop-H2VG-NA | .904 ± .004 (3) | .024 ± .000 (21) | .046 ± .000 (16) | .973 ± .004 (1) | .032 ± .001 (17) | .061 ± .001 (17) | .964 ± .008 (3) | .023 ± .001 (16) | .044 ± .001 (16) | (12.2) |
| ClusTop-H2VW-NA | .894 ± .001 (5) | .026 ± .001 (17) | .026 ± .001 (22) | .944 ± .001 (4) | .029 ± .001 (20) | .029 ± .001 (20) | .983 ± .001 (1) | .020 ± .001 (23) | .020 ± .001 (23) | (15.0) |
| ClusTop-H2VF-NA | .892 ± .001 (6) | .026 ± .001 (17) | .026 ± .001 (22) | .931 ± .001 (8) | .029 ± .001 (20) | .029 ± .001 (20) | .943 ± .001 (7) | .021 ± .001 (20) | .021 ± .001 (20) | (15.6) |
| ClusTop-Word-AH | .692 ± .003 (23) | .025 ± .000 (19) | .049 ± .000 (14) | .784 ± .004 (23) | .043 ± .000 (8) | .080 ± .001 (8) | .778 ± .010 (18) | .030 ± .001 (8) | .057 ± .001 (8) | (14.3) |
| ClusTop-Hash-AH | .833 ± .002 (11) | .027 ± .000 (12) | .051 ± .000 (13) | .842 ± .004 (15) | .041 ± .000 (11) | .078 ± .001 (11) | .857 ± .009 (12) | .025 ± .000 (13) | .048 ± .001 (13) | (12.3) |
| ClusTop-Noun-AH | .735 ± .002 (17) | .024 ± .000 (21) | .045 ± .000 (17) | .827 ± .004 (17) | .041 ± .000 (11) | .076 ± .001 (14) | .828 ± .009 (13) | .031 ± .001 (7) | .059 ± .001 (7) | (13.8) |
| ClusTop-H2VG-AH | .824 ± .002 (12) | .027 ± .000 (12) | .053 ± .000 (12) | .850 ± .003 (12) | .043 ± .000 (8) | .080 ± .001 (8) | .869 ± .009 (11) | .028 ± .001 (9) | .053 ± .001 (9) | (10.3) |
| ClusTop-H2VW-AH | .908 ± .001 (1) | .024 ± .001 (21) | .024 ± .001 (24) | .944 ± .001 (4) | .027 ± .001 (23) | .027 ± .001 (23) | .964 ± .001 (4) | .020 ± .001 (23) | .020 ± .001 (23) | (16.2) |
| ClusTop-H2VF-AH | .905 ± .001 (2) | .027 ± .001 (15) | .027 ± .001 (20) | .934 ± .001 (6) | .029 ± .001 (20) | .029 ± .001 (20) | .949 ± .001 (5) | .021 ± .001 (20) | .021 ± .001 (20) | (14.2) |
| ClusTop-Word-AM | .734 ± .001 (18) | .036 ± .000 (1) | .068 ± .000 (1) | .828 ± .004 (16) | .039 ± .000 (16) | .073 ± .001 (16) | .709 ± .013 (24) | .023 ± .000 (16) | .044 ± .001 (16) | (13.8) |
| ClusTop-Hash-AM | .731 ± .002 (19) | .030 ± .000 (10) | .058 ± .000 (10) | .845 ± .004 (14) | .043 ± .000 (8) | .081 ± .001 (7) | .823 ± .011 (14) | .028 ± .001 (9) | .053 ± .001 (9) | (11.1) |
| ClusTop-Noun-AM | .860 ± .001 (8) | .024 ± .000 (21) | .047 ± .000 (15) | .932 ± .003 (7) | .040 ± .000 (14) | .076 ± .001 (14) | .908 ± .008 (9) | .024 ± .000 (14) | .047 ± .001 (14) | (12.9) |
| ClusTop-H2VG-AM | .800 ± .002 (13) | .031 ± .000 (9) | .060 ± .000 (9) | .875 ± .003 (11) | .041 ± .000 (11) | .078 ± .001 (11) | .882 ± .008 (10) | .027 ± .000 (11) | .051 ± .001 (11) | (10.7) |
| ClusTop-H2VW-AM | .904 ± .001 (3) | .027 ± .001 (15) | .027 ± .001 (20) | .964 ± .001 (2) | .028 ± .001 (22) | .028 ± .001 (22) | .972 ± .001 (2) | .021 ± .001 (20) | .021 ± .001 (20) | (14.0) |
| ClusTop-H2VF-AM | .891 ± .001 (7) | .027 ± .001 (15) | .027 ± .001 (20) | .951 ± .001 (3) | .027 ± .001 (23) | .027 ± .001 (23) | .944 ± .001 (6) | .020 ± .001 (23) | .020 ± .001 (23) | (15.9) |
| LDA-Orig | .848 ± .002 (10) | .029 ± .000 (11) | .056 ± .000 (11) | .885 ± .003 (10) | .040 ± .000 (14) | .076 ± .001 (14) | .808 ± .011 (15) | .026 ± .001 (12) | .051 ± .001 (11) | (12.0) |
| LDA-Hash | .759 ± .002 (14) | .034 ± .000 (4) | .064 ± .000 (4) | .847 ± .003 (13) | .041 ± .000 (11) | .078 ± .001 (11) | .778 ± .010 (18) | .034 ± .001 (5) | .064 ± .001 (5) | (9.4) |
| LDA-Ment | .752 ± .002 (15) | .033 ± .000 (7) | .063 ± .000 (6) | .820 ± .004 (18) | .044 ± .000 (6) | .082 ± .001 (6) | .787 ± .013 (17) | .024 ± .000 (14) | .047 ± .001 (14) | (11.4) |

  1. The rank of each algorithm’s performance on each metric is given in parentheses
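
The source does not say how the Average Rank@10 column is derived. The reported values look consistent with the arithmetic mean of the nine per-metric ranks in each row, rounded to one decimal place. The sketch below checks that reading on three rows; the helper name `average_rank` is hypothetical and the averaging rule is an inference from the numbers, not a statement from the paper.

```python
from statistics import mean

# Ranks copied from the table for three rows, in column order:
# Dataset A (Pre, Rec, FS), Dataset B (Pre, Rec, FS), Dataset C (Pre, Rec, FS).
ranks = {
    "ClusTop-Word-NA": [24, 7, 7, 20, 3, 3, 22, 6, 6],
    "ClusTop-BiG-NA": [22, 2, 2, 22, 2, 2, 20, 3, 2],
    "ClusTop-TriG-NA": [21, 4, 4, 19, 1, 1, 23, 1, 1],
}


def average_rank(per_metric_ranks):
    """Mean of the per-metric ranks, rounded to one decimal place.

    Assumption: this is how the Average Rank@10 column was computed.
    """
    return round(mean(per_metric_ranks), 1)


averages = {name: average_rank(r) for name, r in ranks.items()}

for name, value in averages.items():
    print(f"{name}: ({value})")
```

For these three rows the computed values agree with the Average column: (10.9), (8.6) and (8.3). A lower average rank indicates better overall performance across the nine metrics.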