Table 8 Comparison of ClusTop algorithm against various baselines, in terms of Precision (Pre), Recall (Rec) and F-score (FS) for the top 15 keywords/unigrams of each topic

From: A clustering-based topic model using word networks and word embeddings

Top 15 Keywords/Unigrams (each cell: mean ± error, with rank in brackets)
Algorithm | Dataset A: Precision, Recall, F-score | Dataset B: Precision, Recall, F-score | Dataset C: Precision, Recall, F-score | Average Rank@15
ClusTop-Word-NA .648 ± .002 (24) .034 ± .000 (5) .063 ± .000 (7) .743 ± .004 (23) .053 ± .000 (3) .098 ± .001 (3) .717 ± .010 (22) .038 ± .001 (4) .070 ± .002 (4) (10.6)
ClusTop-BiG-NA .671 ± .002 (21) .035 ± .000 (2) .065 ± .000 (2) .747 ± .004 (21) .055 ± .001 (1) .100 ± .001 (1) .719 ± .010 (21) .040 ± .001 (2) .073 ± .002 (2) (8.1)
ClusTop-TriG-NA .667 ± .002 (22) .035 ± .000 (2) .065 ± .000 (2) .757 ± .004 (20) .055 ± .000 (1) .100 ± .001 (1) .677 ± .010 (24) .041 ± .001 (1) .075 ± .002 (1) (8.2)
ClusTop-BiHa-NA .667 ± .002 (22) .034 ± .000 (5) .064 ± .000 (5) .729 ± .004 (24) .052 ± .000 (4) .096 ± .001 (4) .722 ± .009 (20) .040 ± .001 (2) .073 ± .002 (2) (9.8)
ClusTop-Hash-NA .848 ± .004 (9) .023 ± .000 (24) .045 ± .000 (18) .880 ± .008 (9) .031 ± .001 (17) .060 ± .001 (17) .914 ± .012 (8) .023 ± .001 (17) .044 ± .001 (17) (15.1)
ClusTop-Noun-NA .698 ± .002 (18) .033 ± .000 (8) .061 ± .000 (9) .773 ± .003 (19) .051 ± .000 (5) .093 ± .001 (5) .753 ± .010 (15) .036 ± .001 (6) .067 ± .001 (6) (10.1)
ClusTop-H2VG-NA .895 ± .004 (2) .024 ± .000 (23) .046 ± .000 (17) .961 ± .005 (1) .031 ± .001 (17) .060 ± .001 (17) .962 ± .008 (3) .023 ± .001 (17) .044 ± .001 (17) (12.7)
ClusTop-H2VW-NA .889 ± .001 (5) .026 ± .001 (21) .026 ± .001 (24) .935 ± .001 (3) .030 ± .001 (20) .030 ± .001 (20) .983 ± .001 (1) .020 ± .001 (23) .020 ± .001 (23) (15.6)
ClusTop-H2VF-NA .862 ± .001 (7) .028 ± .001 (14) .028 ± .001 (19) .903 ± .001 (8) .030 ± .001 (20) .030 ± .001 (20) .942 ± .001 (6) .021 ± .001 (20) .021 ± .001 (20) (14.9)
ClusTop-Word-AH .677 ± .002 (20) .025 ± .000 (22) .048 ± .000 (16) .744 ± .004 (22) .043 ± .000 (12) .079 ± .001 (14) .731 ± .010 (19) .033 ± .001 (7) .061 ± .001 (8) (15.6)
ClusTop-Hash-AH .802 ± .002 (10) .030 ± .000 (12) .058 ± .000 (12) .831 ± .004 (13) .044 ± .000 (10) .083 ± .001 (10) .847 ± .009 (12) .026 ± .001 (15) .050 ± .001 (15) (12.1)
ClusTop-Noun-AH .699 ± .002 (17) .027 ± .000 (16) .051 ± .000 (14) .791 ± .004 (16) .042 ± .000 (14) .078 ± .001 (15) .781 ± .010 (13) .033 ± .001 (7) .062 ± .001 (7) (13.2)
ClusTop-H2VG-AH .762 ± .002 (12) .029 ± .000 (13) .055 ± .000 (13) .832 ± .003 (12) .043 ± .000 (12) .080 ± .001 (12) .859 ± .009 (11) .028 ± .001 (10) .053 ± .001 (11) (11.8)
ClusTop-H2VW-AH .902 ± .001 (1) .027 ± .001 (18) .027 ± .001 (22) .913 ± .001 (5) .029 ± .001 (22) .029 ± .001 (22) .964 ± .001 (2) .020 ± .001 (23) .020 ± .001 (23) (15.3)
ClusTop-H2VF-AH .893 ± .001 (3) .028 ± .001 (14) .028 ± .001 (19) .913 ± .001 (5) .030 ± .001 (20) .030 ± .001 (20) .945 ± .001 (5) .021 ± .001 (20) .021 ± .001 (20) (14.0)
ClusTop-Word-AM .687 ± .001 (19) .036 ± .000 (1) .067 ± .000 (1) .790 ± .004 (17) .046 ± .000 (7) .085 ± .001 (7) .689 ± .012 (23) .027 ± .001 (13) .052 ± .001 (13) (11.2)
ClusTop-Hash-AM .713 ± .002 (15) .033 ± .000 (8) .063 ± .000 (7) .816 ± .004 (14) .045 ± .000 (9) .083 ± .001 (10) .770 ± .011 (14) .030 ± .001 (9) .058 ± .001 (9) (10.6)
ClusTop-Noun-AM .860 ± .001 (8) .026 ± .000 (20) .050 ± .000 (15) .905 ± .003 (7) .041 ± .000 (16) .077 ± .001 (16) .892 ± .008 (9) .026 ± .000 (15) .049 ± .001 (16) (13.6)
ClusTop-H2VG-AM .758 ± .002 (13) .031 ± .000 (11) .060 ± .000 (10) .863 ± .003 (10) .044 ± .000 (10) .083 ± .001 (10) .876 ± .008 (10) .027 ± .000 (13) .052 ± .001 (13) (11.1)
ClusTop-H2VW-AM .892 ± .001 (4) .027 ± .001 (18) .027 ± .001 (22) .942 ± .001 (2) .028 ± .001 (23) .028 ± .001 (23) .960 ± .001 (4) .021 ± .001 (20) .021 ± .001 (20) (15.1)
ClusTop-H2VF-AM .877 ± .001 (6) .027 ± .001 (18) .027 ± .001 (22) .935 ± .001 (3) .027 ± .001 (24) .027 ± .001 (24) .931 ± .001 (7) .021 ± .001 (20) .021 ± .001 (20) (16.0)
LDA-Orig .799 ± .002 (11) .032 ± .000 (10) .060 ± .000 (10) .840 ± .003 (11) .042 ± .000 (14) .080 ± .001 (12) .753 ± .010 (15) .028 ± .001 (10) .054 ± .001 (10) (11.4)
LDA-Hash .720 ± .002 (14) .034 ± .000 (5) .064 ± .000 (5) .795 ± .003 (15) .046 ± .000 (7) .085 ± .001 (7) .733 ± .010 (18) .038 ± .001 (4) .070 ± .001 (4) (8.8)
LDA-Ment .703 ± .002 (16) .034 ± .000 (5) .064 ± .000 (5) .774 ± .004 (18) .048 ± .000 (6) .088 ± .001 (6) .743 ± .012 (17) .027 ± .001 (13) .052 ± .001 (13) (11.0)
  1. The rank of an algorithm's performance on each metric is given in brackets; the final column is the average of these nine ranks
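
The final bracketed value in each row appears to be the mean of the nine per-metric ranks that precede it. For ClusTop-Word-NA, (24 + 5 + 7 + 23 + 3 + 3 + 22 + 4 + 4) / 9 ≈ 10.6. The following is a minimal sketch of that check in Python, assuming rows are formatted exactly as in the table above; `average_rank` is a hypothetical helper name and not part of the original work.

```python
import re


def average_rank(row: str) -> float:
    """Return the mean of the integer ranks given in brackets in one table row."""
    # Per-metric ranks look like "(24)". The trailing average, e.g. "(10.6)",
    # contains a decimal point, so this integer-only pattern skips it.
    ranks = [int(r) for r in re.findall(r"\((\d+)\)", row)]
    if not ranks:
        raise ValueError("no bracketed integer ranks found in row")
    return sum(ranks) / len(ranks)


# Row copied from the table above.
row = (
    "ClusTop-Word-NA .648 ± .002 (24) .034 ± .000 (5) .063 ± .000 (7) "
    ".743 ± .004 (23) .053 ± .000 (3) .098 ± .001 (3) "
    ".717 ± .010 (22) .038 ± .001 (4) .070 ± .002 (4) (10.6)"
)

print(round(average_rank(row), 1))
```

Applied to other rows, the same function can be used to confirm that the reported average matches the individual ranks.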