Table 5 Comparison of the ClusTop algorithm against various baselines, in terms of Topic Coherence (TC) and Pointwise Mutual Information (PMI) for the top 15 and top 20 keywords

From: A clustering-based topic model using word networks and word embeddings

Columns headed "Top 15" and "Top 20" refer to the top 15 and top 20 keywords/unigrams; A, B and C denote Dataset A, Dataset B and Dataset C.

| Algorithm | Top 15: A TC | Top 15: A PMI | Top 15: B TC | Top 15: B PMI | Top 15: C TC | Top 15: C PMI | Average Rank@15 | Top 20: A TC | Top 20: A PMI | Top 20: B TC | Top 20: B PMI | Top 20: C TC | Top 20: C PMI | Average Rank@20 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ClusTop-Word-NA | −409.7 (21) | −157.4 (17) | −382.6 (19) | −115.5 (16) | −403.5 (18) | −147.6 (21) | (18.7) | −768.2 (21) | −318.1 (19) | −695.1 (19) | −234.8 (17) | −732.7 (18) | −247.8 (21) | (19.2) |
| ClusTop-BiG-NA | −357.2 (20) | −108.3 (15) | −360.3 (17) | −77.2 (12) | −447.9 (20) | −116.5 (18) | (17.0) | −645.5 (20) | −212.3 (16) | −633.7 (17) | −147.8 (13) | −794.7 (20) | −165.4 (18) | (17.3) |
| ClusTop-TriG-NA | −289.2 (16) | −83.1 (13) | −382.0 (18) | −86.2 (15) | −451.6 (21) | −145.4 (20) | (17.2) | −530.2 (16) | −170.0 (14) | −674.8 (18) | −183.1 (14) | −804.0 (21) | −222.8 (19) | (17.0) |
| ClusTop-BiHa-NA | −179.9 (13) | −24.2 (10) | −324.4 (15) | −82.8 (14) | −389.5 (17) | −98.6 (17) | (14.3) | −319.3 (13) | −64.1 (10) | −569.2 (15) | −189.6 (15) | −692.0 (17) | −141.3 (17) | (14.5) |
| ClusTop-Hash-NA | −38.4 (2) | −4.0 (5) | −116.4 (5) | −11.3 (7) | −98.6 (6) | 13.1 (4) | (4.8) | −63.7 (2) | −11.0 (6) | −195.2 (7) | −9.7 (6) | −148.8 (9) | 28.2 (4) | (5.7) |
| ClusTop-Noun-NA | −139.1 (8) | −23.3 (9) | −208.5 (11) | −31.4 (9) | −219.9 (15) | −29.6 (11) | (10.5) | −242.1 (8) | −63.7 (9) | −363.5 (11) | −86.9 (9) | −384.7 (15) | −36.4 (10) | (10.3) |
| ClusTop-H2VG-NA | −160.2 (10) | −77.7 (12) | −183.4 (9) | −70.6 (11) | −75.6 (3) | −24.8 (10) | (9.2) | −292.1 (11) | −127.5 (11) | −295.8 (10) | −107.8 (11) | −75.6 (3) | −24.8 (8) | (9.0) |
| ClusTop-H2VW-NA | −49.9 (3) | 72.7 (2) | −61.4 (2) | 96.8 (1) | −19.3 (1) | 52.5 (2) | (1.8) | −73.4 (3) | 111.3 (2) | −95.7 (2) | 161.4 (1) | −19.3 (1) | 52.5 (3) | (2.0) |
| ClusTop-H2VF-NA | −59.5 (4) | 87.5 (1) | −60.0 (1) | 94.2 (2) | −20.7 (2) | 54.8 (1) | (1.8) | −92.4 (4) | 145.3 (1) | −95.0 (1) | 157.7 (2) | −20.7 (2) | 54.8 (2) | (2.0) |
| ClusTop-Word-AH | −311.9 (18) | −172.7 (19) | −480.2 (21) | −306.9 (21) | −177.0 (14) | −14.8 (8) | (16.8) | −567.7 (17) | −336.0 (21) | −903.7 (21) | −562.7 (21) | −298.8 (14) | −50.2 (13) | (17.8) |
| ClusTop-Hash-AH | −32.3 (1) | −4.2 (6) | −99.8 (3) | −16.3 (8) | −90.7 (5) | 8.7 (5) | (4.7) | −54.8 (1) | −9.5 (5) | −169.0 (3) | −22.2 (7) | −143.6 (6) | 21.9 (5) | (4.5) |
| ClusTop-Noun-AH | −309.9 (17) | −173.6 (20) | −444.0 (20) | −242.1 (19) | −131.8 (9) | −17.5 (9) | (15.7) | −575.1 (18) | −316.4 (18) | −811.4 (20) | −425.6 (19) | −221.7 (12) | −35.0 (9) | (16.0) |
| ClusTop-H2VG-AH | −166.4 (11) | −96.1 (14) | −304.6 (14) | −127.6 (17) | −143.0 (12) | −45.3 (13) | (13.5) | −303.3 (12) | −166.6 (13) | −530.9 (14) | −195.1 (16) | −147.6 (8) | −46.9 (12) | (12.5) |
| ClusTop-H2VW-AH | −167.4 (12) | −166.8 (18) | −281.9 (13) | −237.7 (18) | −116.9 (8) | −79.1 (15) | (14.0) | −274.1 (10) | −268.2 (17) | −476.9 (12) | −390.7 (18) | −122.6 (5) | −82.5 (15) | (12.8) |
| ClusTop-H2VF-AH | −201.1 (14) | −197.3 (21) | −350.3 (16) | −299.5 (20) | −134.5 (10) | −95.8 (16) | (16.2) | −335.0 (14) | −329.3 (20) | −591.2 (16) | −494.6 (20) | −160.1 (10) | −115.3 (16) | (16.0) |
| ClusTop-Word-AM | −326.6 (19) | −57.3 (11) | −278.7 (12) | −42.5 (10) | −422.5 (19) | −143.4 (19) | (15.0) | −599.9 (19) | −155.4 (12) | −492.7 (13) | −122.3 (12) | −766.5 (19) | −239.6 (20) | (15.8) |
| ClusTop-Hash-AM | −152.7 (9) | −17.5 (8) | −102.1 (4) | −1.8 (5) | −350.0 (16) | −50.1 (14) | (9.3) | −262.2 (9) | −56.7 (8) | −169.8 (4) | −26.5 (8) | −615.2 (16) | −67.0 (14) | (9.8) |
| ClusTop-Noun-AM | −60.7 (5) | −6.0 (7) | −150.6 (8) | −10.5 (6) | −138.9 (11) | 29.7 (3) | (6.7) | −99.3 (5) | −11.4 (7) | −257.4 (8) | −2.4 (5) | −227.4 (13) | 58.0 (1) | (6.5) |
| ClusTop-H2VG-AM | −273.6 (15) | −123.9 (16) | −193.0 (10) | −78.1 (13) | −152.4 (13) | −43.1 (12) | (13.2) | −480.9 (15) | −202.2 (15) | −289.8 (9) | −103.9 (10) | −167.3 (11) | −46.4 (11) | (11.8) |
| ClusTop-H2VW-AM | −95.4 (6) | 4.1 (4) | −128.1 (7) | 25.2 (3) | −89.3 (4) | 5.5 (6) | (5.0) | −146.2 (6) | 9.5 (4) | −192.2 (6) | 53.3 (3) | −102.6 (4) | 12.2 (7) | (5.0) |
| ClusTop-H2VF-AM | −96.2 (7) | 4.3 (3) | −122.6 (6) | 14.9 (4) | −108.3 (7) | 4.1 (7) | (5.7) | −149.5 (7) | 11.3 (3) | −185.6 (5) | 30.2 (4) | −146.6 (7) | 13.0 (6) | (5.3) |
| LDA-Orig | −722.5 (24) | −659.1 (24) | −695.0 (24) | −619.0 (24) | −593.5 (24) | −442.7 (24) | (24.0) | −1279.6 (24) | −1133.6 (24) | −1262.1 (24) | −1102.0 (24) | −1083.2 (24) | −790.0 (24) | (24.0) |
| LDA-Hash | −607.9 (22) | −435.4 (22) | −611.4 (22) | −449.2 (22) | −500.7 (22) | −271.5 (22) | (22.0) | −1134.2 (22) | −794.2 (22) | −1120.9 (22) | −805.6 (22) | −930.7 (22) | −463.1 (22) | (22.0) |
| LDA-Ment | −611.6 (23) | −465.7 (23) | −613.9 (23) | −477.7 (23) | −540.9 (23) | −316.9 (23) | (23.0) | −1141.5 (23) | −850.2 (23) | −1130.6 (23) | −855.9 (23) | −981.9 (23) | −566.6 (23) | (23.0) |
  1. The rank of an algorithm’s performance on each metric is given in brackets
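
The table does not say how the Average Rank@15 and Rank@20 columns are derived. The reported values appear consistent with the arithmetic mean of the six bracketed ranks in each block (TC and PMI on Datasets A, B and C), rounded to one decimal place. The sketch below checks that reading against a few rows; the function name `average_rank` is hypothetical and the averaging rule is an assumption inferred from the numbers, not a statement from the source.

```python
def average_rank(ranks):
    """Arithmetic mean of per-metric ranks, rounded to one decimal place.

    Assumption: this is how the table's Average Rank columns were obtained;
    the source does not state the rule explicitly.
    """
    return round(sum(ranks) / len(ranks), 1)


# Bracketed ranks copied from the table, in column order:
# A TC, A PMI, B TC, B PMI, C TC, C PMI.
word_na_top15 = [21, 17, 19, 16, 18, 21]  # ClusTop-Word-NA, top 15
word_na_top20 = [21, 19, 19, 17, 18, 21]  # ClusTop-Word-NA, top 20
h2vw_na_top15 = [3, 2, 2, 1, 1, 2]        # ClusTop-H2VW-NA, top 15

print(average_rank(word_na_top15))  # table reports 18.7
print(average_rank(word_na_top20))  # table reports 19.2
print(average_rank(h2vw_na_top15))  # table reports 1.8
```

Under this reading, a lower average rank means the algorithm placed nearer the top across both metrics and all three datasets.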