Table 7 Comparison of ClusTop algorithm against various baselines, in terms of Precision (Pre), Recall (Rec) and F-score (FS) for the top 10 keywords/unigrams of each topic

From: A clustering-based topic model using word networks and word embeddings

Columns (top 10 keywords/unigrams): Algorithm | Dataset A: Precision, Recall, F-score | Dataset B: Precision, Recall, F-score | Dataset C: Precision, Recall, F-score | Average Rank@10
ClusTop-Word-NA .690 ± .002 (24) .033 ± .000 (7) .062 ± .000 (7) .804 ± .003 (20) .051 ± .000 (3) .095 ± .001 (3) .764 ± .010 (22) .033 ± .001 (6) .062 ± .001 (6) (10.9)
ClusTop-BiG-NA .707 ± .002 (22) .035 ± .000 (2) .065 ± .000 (2) .789 ± .004 (22) .052 ± .000 (2) .096 ± .001 (2) .765 ± .010 (20) .035 ± .001 (3) .067 ± .002 (2) (8.6)
ClusTop-TriG-NA .717 ± .002 (21) .034 ± .000 (4) .064 ± .000 (4) .811 ± .003 (19) .053 ± .000 (1) .098 ± .001 (1) .746 ± .010 (23) .036 ± .001 (1) .068 ± .001 (1) (8.3)
ClusTop-BiHa-NA .719 ± .002 (20) .034 ± .000 (4) .064 ± .000 (4) .782 ± .004 (24) .051 ± .000 (3) .095 ± .001 (3) .765 ± .010 (20) .035 ± .001 (3) .065 ± .001 (3) (9.3)
ClusTop-Hash-NA .860 ± .004 (8) .023 ± .000 (24) .045 ± .000 (17) .896 ± .008 (9) .032 ± .001 (17) .061 ± .001 (17) .925 ± .012 (8) .022 ± .000 (18) .043 ± .001 (18) (15.1)
ClusTop-Noun-NA .736 ± .002 (16) .033 ± .000 (7) .062 ± .000 (7) .803 ± .003 (21) .049 ± .000 (5) .091 ± .001 (5) .802 ± .009 (16) .035 ± .001 (3) .065 ± .001 (3) (9.2)
ClusTop-H2VG-NA .904 ± .004 (3) .024 ± .000 (21) .046 ± .000 (16) .973 ± .004 (1) .032 ± .001 (17) .061 ± .001 (17) .964 ± .008 (3) .023 ± .001 (16) .044 ± .001 (16) (12.2)
ClusTop-H2VW-NA .894 ± .001 (5) .026 ± .001 (17) .026 ± .001 (22) .944 ± .001 (4) .029 ± .001 (20) .029 ± .001 (20) .983 ± .001 (1) .020 ± .001 (23) .020 ± .001 (23) (15.0)
ClusTop-H2VF-NA .892 ± .001 (6) .026 ± .001 (17) .026 ± .001 (22) .931 ± .001 (8) .029 ± .001 (20) .029 ± .001 (20) .943 ± .001 (7) .021 ± .001 (20) .021 ± .001 (20) (15.6)
ClusTop-Word-AH .692 ± .003 (23) .025 ± .000 (19) .049 ± .000 (14) .784 ± .004 (23) .043 ± .000 (8) .080 ± .001 (8) .778 ± .010 (18) .030 ± .001 (8) .057 ± .001 (8) (14.3)
ClusTop-Hash-AH .833 ± .002 (11) .027 ± .000 (12) .051 ± .000 (13) .842 ± .004 (15) .041 ± .000 (11) .078 ± .001 (11) .857 ± .009 (12) .025 ± .000 (13) .048 ± .001 (13) (12.3)
ClusTop-Noun-AH .735 ± .002 (17) .024 ± .000 (21) .045 ± .000 (17) .827 ± .004 (17) .041 ± .000 (11) .076 ± .001 (14) .828 ± .009 (13) .031 ± .001 (7) .059 ± .001 (7) (13.8)
ClusTop-H2VG-AH .824 ± .002 (12) .027 ± .000 (12) .053 ± .000 (12) .850 ± .003 (12) .043 ± .000 (8) .080 ± .001 (8) .869 ± .009 (11) .028 ± .001 (9) .053 ± .001 (9) (10.3)
ClusTop-H2VW-AH .908 ± .001 (1) .024 ± .001 (21) .024 ± .001 (24) .944 ± .001 (4) .027 ± .001 (23) .027 ± .001 (23) .964 ± .001 (4) .020 ± .001 (23) .020 ± .001 (23) (16.2)
ClusTop-H2VF-AH .905 ± .001 (2) .027 ± .001 (15) .027 ± .001 (20) .934 ± .001 (6) .029 ± .001 (20) .029 ± .001 (20) .949 ± .001 (5) .021 ± .001 (20) .021 ± .001 (20) (14.2)
ClusTop-Word-AM .734 ± .001 (18) .036 ± .000 (1) .068 ± .000 (1) .828 ± .004 (16) .039 ± .000 (16) .073 ± .001 (16) .709 ± .013 (24) .023 ± .000 (16) .044 ± .001 (16) (13.8)
ClusTop-Hash-AM .731 ± .002 (19) .030 ± .000 (10) .058 ± .000 (10) .845 ± .004 (14) .043 ± .000 (8) .081 ± .001 (7) .823 ± .011 (14) .028 ± .001 (9) .053 ± .001 (9) (11.1)
ClusTop-Noun-AM .860 ± .001 (8) .024 ± .000 (21) .047 ± .000 (15) .932 ± .003 (7) .040 ± .000 (14) .076 ± .001 (14) .908 ± .008 (9) .024 ± .000 (14) .047 ± .001 (14) (12.9)
ClusTop-H2VG-AM .800 ± .002 (13) .031 ± .000 (9) .060 ± .000 (9) .875 ± .003 (11) .041 ± .000 (11) .078 ± .001 (11) .882 ± .008 (10) .027 ± .000 (11) .051 ± .001 (11) (10.7)
ClusTop-H2VW-AM .904 ± .001 (3) .027 ± .001 (15) .027 ± .001 (20) .964 ± .001 (2) .028 ± .001 (22) .028 ± .001 (22) .972 ± .001 (2) .021 ± .001 (20) .021 ± .001 (20) (14.0)
ClusTop-H2VF-AM .891 ± .001 (7) .027 ± .001 (15) .027 ± .001 (20) .951 ± .001 (3) .027 ± .001 (23) .027 ± .001 (23) .944 ± .001 (6) .020 ± .001 (23) .020 ± .001 (23) (15.9)
LDA-Orig .848 ± .002 (10) .029 ± .000 (11) .056 ± .000 (11) .885 ± .003 (10) .040 ± .000 (14) .076 ± .001 (14) .808 ± .011 (15) .026 ± .001 (12) .051 ± .001 (11) (12.0)
LDA-Hash .759 ± .002 (14) .034 ± .000 (4) .064 ± .000 (4) .847 ± .003 (13) .041 ± .000 (11) .078 ± .001 (11) .778 ± .010 (18) .034 ± .001 (5) .064 ± .001 (5) (9.4)
LDA-Ment .752 ± .002 (15) .033 ± .000 (7) .063 ± .000 (6) .820 ± .004 (18) .044 ± .000 (6) .082 ± .001 (6) .787 ± .013 (17) .024 ± .000 (14) .047 ± .001 (14) (11.4)
  1. The rank of an algorithm’s performance for each metric is provided in brackets.
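
The final column can be reproduced from the bracketed ranks. The sketch below assumes that the average is the arithmetic mean of the nine per-metric ranks (three metrics on each of three datasets), rounded to one decimal place. That assumption is not stated in the table, but it matches the rows checked here. The names `ranks` and `average_rank` are illustrative only.

```python
from statistics import mean

# Bracketed ranks copied from three table rows, in column order:
# Dataset A (Pre, Rec, FS), Dataset B (Pre, Rec, FS), Dataset C (Pre, Rec, FS).
ranks = {
    "ClusTop-Word-NA": [24, 7, 7, 20, 3, 3, 22, 6, 6],
    "ClusTop-BiG-NA": [22, 2, 2, 22, 2, 2, 20, 3, 2],
    "ClusTop-TriG-NA": [21, 4, 4, 19, 1, 1, 23, 1, 1],
}


def average_rank(per_metric_ranks):
    """Mean of the per-metric ranks, rounded to one decimal (assumed rule)."""
    return round(mean(per_metric_ranks), 1)


for name, row in ranks.items():
    print(f"{name}: ({average_rank(row)})")
```

For these three rows the computed values are 10.9, 8.6 and 8.3, which agree with the last column of the table. A lower average rank means the algorithm placed nearer the top across all nine metrics.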