Table 9 Comparison of ClusTop algorithm against various baselines, in terms of Precision (Pre), Recall (Rec) and F-score (FS) for the top 20 keywords/unigrams of each topic

From: A clustering-based topic model using word networks and word embeddings

Top 20 Keywords/Unigrams
Algorithm | Dataset A: Precision, Recall, F-score | Dataset B: Precision, Recall, F-score | Dataset C: Precision, Recall, F-score | Average Rank@20
ClusTop-Word-NA .617 ± .002 (24) .034 ± .000 (6) .064 ± .000 (6) .709 ± .003 (23) .055 ± .000 (2) .100 ± .001 (3) .669 ± .009 (22) .038 ± .001 (4) .070 ± .002 (4) (10.4)
ClusTop-BiG-NA .646 ± .002 (21) .035 ± .000 (3) .065 ± .000 (3) .722 ± .004 (22) .056 ± .001 (1) .102 ± .001 (1) .679 ± .010 (20) .040 ± .001 (2) .073 ± .002 (2) (8.3)
ClusTop-TriG-NA .636 ± .002 (22) .035 ± .000 (3) .065 ± .000 (3) .726 ± .004 (20) .055 ± .001 (2) .101 ± .001 (2) .662 ± .009 (24) .041 ± .001 (1) .075 ± .002 (1) (8.7)
ClusTop-BiHa-NA .635 ± .002 (23) .035 ± .000 (3) .065 ± .000 (3) .702 ± .004 (24) .054 ± .000 (4) .099 ± .001 (4) .679 ± .009 (20) .040 ± .001 (2) .073 ± .002 (2) (9.4)
ClusTop-Hash-NA .831 ± .004 (9) .024 ± .000 (23) .045 ± .000 (18) .873 ± .008 (9) .031 ± .001 (17) .060 ± .001 (17) .911 ± .012 (8) .023 ± .001 (17) .044 ± .001 (17) (15.0)
ClusTop-Noun-NA .666 ± .002 (17) .033 ± .000 (9) .061 ± .000 (11) .748 ± .003 (18) .051 ± .000 (5) .094 ± .001 (5) .719 ± .010 (15) .036 ± .001 (6) .067 ± .001 (6) (10.2)
ClusTop-H2VG-NA .891 ± .004 (2) .024 ± .000 (23) .046 ± .000 (17) .952 ± .005 (1) .031 ± .001 (17) .060 ± .001 (17) .962 ± .008 (2) .023 ± .001 (17) .044 ± .001 (17) (12.6)
ClusTop-H2VW-NA .869 ± .001 (5) .027 ± .001 (20) .027 ± .001 (22) .927 ± .001 (4) .030 ± .001 (20) .030 ± .001 (20) .983 ± .001 (1) .020 ± .001 (23) .020 ± .001 (23) (15.3)
ClusTop-H2VF-NA .856 ± .001 (7) .028 ± .001 (16) .028 ± .001 (19) .894 ± .001 (7) .030 ± .001 (20) .030 ± .001 (20) .942 ± .001 (5) .021 ± .001 (20) .021 ± .001 (20) (14.9)
ClusTop-Word-AH .660 ± .002 (19) .029 ± .000 (14) .055 ± .000 (14) .724 ± .004 (21) .044 ± .001 (12) .080 ± .001 (13) .703 ± .010 (17) .033 ± .001 (7) .061 ± .001 (8) (13.9)
ClusTop-Hash-AH .757 ± .002 (10) .033 ± .000 (9) .063 ± .000 (8) .801 ± .004 (13) .044 ± .000 (12) .082 ± .001 (11) .836 ± .009 (12) .026 ± .001 (15) .050 ± .001 (15) (11.7)
ClusTop-Noun-AH .700 ± .002 (14) .029 ± .000 (14) .054 ± .000 (15) .774 ± .004 (16) .042 ± .000 (15) .079 ± .001 (15) .743 ± .010 (14) .033 ± .001 (7) .062 ± .001 (7) (13.0)
ClusTop-H2VG-AH .743 ± .002 (12) .030 ± .000 (13) .058 ± .000 (13) .809 ± .003 (11) .043 ± .000 (14) .080 ± .001 (13) .858 ± .009 (11) .028 ± .001 (10) .053 ± .001 (11) (12.0)
ClusTop-H2VW-AH .894 ± .001 (1) .027 ± .001 (20) .027 ± .001 (22) .902 ± .001 (6) .029 ± .001 (22) .029 ± .001 (22) .962 ± .001 (3) .020 ± .001 (23) .020 ± .001 (23) (15.8)
ClusTop-H2VF-AH .879 ± .001 (3) .028 ± .001 (16) .028 ± .001 (19) .905 ± .001 (5) .030 ± .001 (20) .030 ± .001 (20) .939 ± .001 (6) .021 ± .001 (20) .021 ± .001 (20) (14.3)
ClusTop-Word-AM .660 ± .001 (19) .036 ± .000 (1) .067 ± .000 (1) .778 ± .004 (15) .050 ± .000 (6) .091 ± .001 (6) .664 ± .012 (23) .027 ± .001 (13) .052 ± .001 (13) (10.8)
ClusTop-Hash-AM .691 ± .002 (15) .033 ± .000 (9) .062 ± .000 (9) .796 ± .004 (14) .047 ± .000 (9) .087 ± .001 (9) .745 ± .011 (13) .030 ± .001 (9) .058 ± .001 (9) (10.7)
ClusTop-Noun-AM .841 ± .001 (8) .027 ± .000 (18) .051 ± .000 (16) .883 ± .003 (8) .041 ± .000 (16) .078 ± .001 (16) .859 ± .009 (10) .026 ± .000 (15) .049 ± .001 (16) (13.7)
ClusTop-H2VG-AM .736 ± .002 (13) .032 ± .000 (11) .061 ± .000 (11) .856 ± .003 (10) .045 ± .000 (10) .084 ± .001 (10) .875 ± .008 (9) .027 ± .000 (13) .052 ± .001 (13) (11.1)
ClusTop-H2VW-AM .876 ± .001 (4) .027 ± .001 (20) .027 ± .001 (22) .936 ± .001 (2) .029 ± .001 (22) .029 ± .001 (22) .955 ± .001 (4) .021 ± .001 (20) .021 ± .001 (20) (15.1)
ClusTop-H2VF-AM .859 ± .001 (6) .027 ± .001 (20) .027 ± .001 (22) .930 ± .001 (3) .028 ± .001 (24) .028 ± .001 (24) .928 ± .001 (7) .021 ± .001 (20) .021 ± .001 (20) (16.2)
LDA-Orig .752 ± .002 (11) .032 ± .000 (11) .061 ± .000 (11) .802 ± .003 (12) .044 ± .000 (12) .082 ± .001 (11) .706 ± .010 (16) .028 ± .001 (10) .054 ± .001 (10) (11.6)
LDA-Hash .689 ± .002 (16) .035 ± .000 (3) .065 ± .000 (3) .751 ± .003 (17) .048 ± .000 (8) .088 ± .001 (8) .702 ± .010 (18) .038 ± .001 (4) .070 ± .001 (4) (9.0)
LDA-Ment .666 ± .002 (17) .034 ± .000 (6) .064 ± .000 (6) .730 ± .004 (19) .049 ± .000 (7) .090 ± .001 (7) .698 ± .012 (19) .027 ± .001 (13) .052 ± .001 (13) (11.9)
  1. The rank of an algorithm’s performance for each metric is provided in brackets
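
The relationship between the columns can be checked with a short sketch. It assumes, without confirmation from the source, that the F-score is the harmonic mean of precision and recall and that the final column is the plain mean of the nine bracketed ranks in a row. Both assumptions reproduce the ClusTop-Word-NA row. The helper names `f_score` and `average_rank` are hypothetical and are not taken from the paper.

```python
from statistics import mean


def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed F-score definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def average_rank(ranks: list[int]) -> float:
    """Plain mean of per-metric ranks (assumed definition of the last column)."""
    return mean(ranks)


# Bracketed ranks of ClusTop-Word-NA, read left to right from the table:
# Dataset A (Pre, Rec, FS), Dataset B (Pre, Rec, FS), Dataset C (Pre, Rec, FS).
word_na_ranks = [24, 6, 6, 23, 2, 3, 22, 4, 4]
word_na_avg = round(average_rank(word_na_ranks), 1)

# Dataset A precision and recall of ClusTop-Word-NA, as printed in the table.
word_na_f = round(f_score(0.617, 0.034), 3)

print(word_na_avg)  # compare with the (10.4) in the last column
print(word_na_f)    # compare with the .064 F-score for Dataset A
```

Because the table reports rounded values, recomputing an F-score from the printed precision and recall may differ from the printed F-score in the last digit for some rows.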