
Table 6 Comparison of the ClusTop algorithm against various baselines in terms of Precision (Pre), Recall (Rec) and F-score (FS) for the top 5 keywords/unigrams of each topic

From: A clustering-based topic model using word networks and word embeddings

Top 5 Keywords/Unigrams

| Algorithm | Dataset A Precision | Dataset A Recall | Dataset A F-score | Dataset B Precision | Dataset B Recall | Dataset B F-score | Dataset C Precision | Dataset C Recall | Dataset C F-score | Average Rank@5 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ClusTop-Word-NA | .754 ± .002 (22) | .031 ± .000 (6) | .059 ± .000 (6) | .866 ± .003 (21) | .043 ± .000 (5) | .082 ± .001 (5) | .840 ± .009 (18) | .027 ± .001 (7) | .052 ± .001 (7) | (10.8) |
| ClusTop-BiG-NA | .786 ± .002 (18) | .034 ± .000 (2) | .064 ± .000 (2) | .857 ± .003 (22) | .046 ± .000 (2) | .086 ± .001 (2) | .833 ± .010 (20) | .029 ± .001 (4) | .056 ± .001 (3) | (8.3) |
| ClusTop-TriG-NA | .791 ± .002 (17) | .034 ± .000 (2) | .064 ± .000 (2) | .871 ± .003 (20) | .046 ± .000 (2) | .087 ± .001 (1) | .822 ± .009 (21) | .031 ± .001 (2) | .058 ± .001 (2) | (7.7) |
| ClusTop-BiHa-NA | .784 ± .002 (19) | .032 ± .000 (4) | .060 ± .000 (4) | .886 ± .003 (18) | .045 ± .000 (4) | .084 ± .001 (4) | .820 ± .009 (22) | .029 ± .001 (4) | .055 ± .001 (5) | (9.3) |
| ClusTop-Hash-NA | .898 ± .004 (10) | .023 ± .000 (21) | .044 ± .000 (18) | .916 ± .007 (13) | .032 ± .001 (17) | .062 ± .001 (17) | .936 ± .011 (8) | .022 ± .000 (17) | .042 ± .001 (18) | (15.4) |
| ClusTop-Noun-NA | .761 ± .002 (21) | .028 ± .000 (8) | .054 ± .000 (8) | .836 ± .003 (24) | .043 ± .000 (5) | .081 ± .001 (6) | .888 ± .008 (14) | .032 ± .001 (1) | .062 ± .001 (1) | (9.8) |
| ClusTop-H2VG-NA | .924 ± .003 (3) | .023 ± .000 (21) | .045 ± .000 (17) | .978 ± .004 (2) | .032 ± .001 (17) | .062 ± .001 (17) | .976 ± .007 (4) | .023 ± .001 (14) | .044 ± .001 (14) | (12.1) |
| ClusTop-H2VW-NA | .906 ± .001 (9) | .023 ± .001 (21) | .023 ± .001 (22) | .963 ± .001 (6) | .028 ± .001 (20) | .028 ± .001 (20) | .986 ± .001 (1) | .019 ± .001 (23) | .019 ± .001 (23) | (16.1) |
| ClusTop-H2VF-NA | .910 ± .001 (7) | .025 ± .001 (15) | .025 ± .001 (19) | .960 ± .001 (7) | .029 ± .001 (19) | .029 ± .001 (19) | .955 ± .001 (6) | .019 ± .001 (23) | .019 ± .001 (23) | (15.3) |
| ClusTop-Word-AH | .741 ± .003 (24) | .025 ± .000 (15) | .049 ± .000 (13) | .845 ± .005 (23) | .040 ± .000 (9) | .075 ± .001 (9) | .844 ± .010 (17) | .027 ± .001 (7) | .052 ± .001 (7) | (13.8) |
| ClusTop-Hash-AH | .847 ± .002 (12) | .026 ± .000 (12) | .051 ± .000 (12) | .912 ± .004 (15) | .046 ± .001 (2) | .086 ± .001 (2) | .896 ± .008 (12) | .024 ± .000 (12) | .046 ± .001 (13) | (10.2) |
| ClusTop-Noun-AH | .802 ± .002 (16) | .024 ± .000 (18) | .046 ± .000 (16) | .873 ± .004 (19) | .037 ± .000 (13) | .071 ± .001 (13) | .872 ± .008 (15) | .029 ± .001 (4) | .056 ± .001 (3) | (13.0) |
| ClusTop-H2VG-AH | .919 ± .002 (5) | .025 ± .000 (15) | .049 ± .000 (13) | .902 ± .003 (16) | .042 ± .000 (7) | .080 ± .001 (7) | .891 ± .008 (13) | .025 ± .001 (10) | .048 ± .001 (11) | (10.8) |
| ClusTop-H2VW-AH | .927 ± .001 (1) | .023 ± .001 (21) | .023 ± .001 (22) | .972 ± .001 (4) | .028 ± .001 (20) | .028 ± .001 (20) | .979 ± .001 (3) | .020 ± .001 (21) | .020 ± .001 (21) | (14.8) |
| ClusTop-H2VF-AH | .918 ± .001 (6) | .025 ± .001 (15) | .025 ± .001 (19) | .965 ± .001 (5) | .027 ± .001 (23) | .027 ± .001 (23) | .948 ± .001 (7) | .021 ± .001 (19) | .021 ± .001 (19) | (15.1) |
| ClusTop-Word-AM | .748 ± .001 (23) | .034 ± .000 (2) | .065 ± .000 (1) | .929 ± .003 (11) | .036 ± .000 (15) | .069 ± .001 (15) | .758 ± .016 (24) | .022 ± .000 (17) | .043 ± .001 (16) | (13.8) |
| ClusTop-Hash-AM | .763 ± .002 (20) | .027 ± .000 (10) | .052 ± .000 (10) | .917 ± .003 (12) | .037 ± .000 (13) | .072 ± .001 (12) | .869 ± .011 (16) | .024 ± .000 (12) | .047 ± .001 (12) | (13.0) |
| ClusTop-Noun-AM | .842 ± .002 (13) | .025 ± .000 (15) | .048 ± .000 (15) | .950 ± .003 (9) | .039 ± .000 (10) | .074 ± .001 (10) | .923 ± .009 (9) | .022 ± .000 (17) | .043 ± .001 (16) | (12.7) |
| ClusTop-H2VG-AM | .864 ± .002 (11) | .028 ± .000 (8) | .054 ± .000 (8) | .930 ± .003 (10) | .041 ± .000 (8) | .078 ± .001 (8) | .900 ± .008 (10) | .025 ± .000 (10) | .049 ± .001 (9) | (9.1) |
| ClusTop-H2VW-AM | .924 ± .001 (3) | .023 ± .001 (21) | .023 ± .001 (22) | .976 ± .001 (3) | .027 ± .001 (23) | .027 ± .001 (23) | .981 ± .001 (2) | .020 ± .001 (21) | .020 ± .001 (21) | (15.4) |
| ClusTop-H2VF-AM | .910 ± .001 (7) | .022 ± .001 (24) | .022 ± .001 (24) | .985 ± .001 (1) | .027 ± .001 (23) | .027 ± .001 (23) | .971 ± .001 (5) | .020 ± .001 (21) | .020 ± .001 (21) | (16.6) |
| LDA-Orig | .925 ± .001 (2) | .027 ± .000 (10) | .052 ± .000 (10) | .956 ± .002 (8) | .037 ± .000 (13) | .070 ± .001 (14) | .898 ± .010 (11) | .025 ± .000 (10) | .049 ± .001 (9) | (9.7) |
| LDA-Hash | .821 ± .002 (15) | .031 ± .000 (6) | .059 ± .000 (6) | .916 ± .003 (13) | .036 ± .000 (15) | .069 ± .001 (15) | .837 ± .010 (19) | .028 ± .001 (6) | .054 ± .001 (6) | (11.2) |
| LDA-Ment | .830 ± .002 (14) | .031 ± .000 (6) | .060 ± .000 (4) | .900 ± .003 (17) | .039 ± .000 (10) | .074 ± .001 (10) | .814 ± .018 (23) | .023 ± .000 (14) | .044 ± .001 (14) | (12.4) |
  1. The rank of each algorithm's performance on each metric is given in brackets
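
The Average Rank@5 column appears to be the arithmetic mean of the nine bracketed ranks in a row (three metrics across three datasets), rounded to one decimal place. The sketch below checks that reading against two rows of the table. It also computes an F-score as the harmonic mean of precision and recall, which is the standard F1 definition and an assumption here, since the table does not state the formula. The helper names `average_rank` and `f_score` are illustrative only and do not come from the source.

```python
from statistics import mean


def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1; assumed, not stated in the table)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def average_rank(ranks: list[int]) -> float:
    """Mean of the per-metric ranks in one table row, rounded to one decimal place."""
    return round(mean(ranks), 1)


# Bracketed ranks copied from the ClusTop-Word-NA row:
# Dataset A (Pre, Rec, FS), Dataset B (Pre, Rec, FS), Dataset C (Pre, Rec, FS).
word_na_ranks = [22, 6, 6, 21, 5, 5, 18, 7, 7]
print(average_rank(word_na_ranks))  # 10.8, matching the table

# Bracketed ranks copied from the ClusTop-TriG-NA row.
trig_na_ranks = [17, 2, 2, 20, 2, 1, 21, 2, 2]
print(average_rank(trig_na_ranks))  # 7.7, matching the table

# F-score from the reported mean precision and recall of ClusTop-Word-NA on Dataset A.
print(f_score(0.754, 0.031))
```

The F-score computed from the reported mean precision and recall lands within about .001 of the tabulated .059 but need not match it exactly, presumably because the published values are averages of per-topic scores rather than a score computed from averaged inputs.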