Table 15 Comparison of evaluation measures on the best Hypert and BERT models for each subtask 

From: Hypert: hypernymy-aware BERT with Hearst pattern exploitation for hypernym discovery

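| Subtask | Evaluation measures | Hypert model | BERT model |
|---|---|---|---|
| 1A English | MRR | 38.67 ± 1.59 | 36.44 ± 2.12 |
| 1A English | MAP | 24.46 ± 0.89 | 23.29 ± 0.78 |
| 1A English | P@1 | 29.38 ± 1.82 | 26.38 ± 2.93 |
| 1A English | P@3 | 21.91 ± 1.08 | 20.90 ± 1.00 |
| 1A English | P@5 | 21.73 ± 0.80 | 20.68 ± 0.79 |
| 1A English | P@15 | 27.80 ± 0.91 | 26.63 ± 0.75 |
| 2A Medical | MRR | 66.52 ± 2.44 | 62.62 ± 3.20 |
| 2A Medical | MAP | 50.48 ± 1.63 | 48.85 ± 1.57 |
| 2A Medical | P@1 | 55.64 ± 3.43 | 49.94 ± 4.35 |
| 2A Medical | P@3 | 46.97 ± 1.56 | 45.45 ± 2.27 |
| 2A Medical | P@5 | 46.22 ± 1.86 | 45.66 ± 1.74 |
| 2A Medical | P@15 | 54.80 ± 1.69 | 53.36 ± 0.97 |
| 2B Music | MRR | 67.43 ± 2.37 | 63.19 ± 5.38 |
| 2B Music | MAP | 55.03 ± 1.98 | 49.70 ± 3.37 |
| 2B Music | P@1 | 56.68 ± 2.98 | 50.92 ± 7.31 |
| 2B Music | P@3 | 52.94 ± 2.05 | 46.88 ± 4.28 |
| 2B Music | P@5 | 52.92 ± 2.37 | 47.27 ± 3.59 |
| 2B Music | P@15 | 58.59 ± 2.05 | 53.97 ± 2.48 |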
  1. Boldface indicates the better performance of the two models
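
For readers unfamiliar with the measures in the table, the following is a minimal sketch of the textbook per-query definitions of reciprocal rank (averaged over queries to give MRR), average precision (averaged to give MAP), and P@k. It is an illustration only: the function names and the toy data are assumptions, and the scorer actually used for these subtasks may normalize differently (for example by truncating the candidate list or capping the denominator by the number of gold hypernyms), so this code is not expected to reproduce the reported numbers.

```python
from typing import Sequence, Set


def reciprocal_rank(ranked: Sequence[str], gold: Set[str]) -> float:
    """Return 1 / rank of the first correct candidate, or 0.0 if none is correct."""
    for rank, candidate in enumerate(ranked, start=1):
        if candidate in gold:
            return 1.0 / rank
    return 0.0


def precision_at_k(ranked: Sequence[str], gold: Set[str], k: int) -> float:
    """Textbook P@k: fraction of the top-k candidates that are correct."""
    hits = sum(1 for candidate in ranked[:k] if candidate in gold)
    return hits / k


def average_precision(ranked: Sequence[str], gold: Set[str]) -> float:
    """Textbook AP: mean of the precision values at each correct rank,
    normalized by the number of gold items."""
    if not gold:
        return 0.0
    hits = 0
    total = 0.0
    for rank, candidate in enumerate(ranked, start=1):
        if candidate in gold:
            hits += 1
            total += hits / rank
    return total / len(gold)


# Hypothetical toy query: predicted hypernyms (best first) and the gold set.
ranked = ["animal", "car", "mammal", "tree"]
gold = {"animal", "mammal", "organism"}

rr = reciprocal_rank(ranked, gold)
p3 = precision_at_k(ranked, gold, 3)
ap = average_precision(ranked, gold)
```

MRR and MAP are then the means of `reciprocal_rank` and `average_precision` over all queries; multiplying by 100 gives values on the same scale as the table.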