Table 2 Dataset statistics for each subtask

From: Hypert: hypernymy-aware BERT with Hearst pattern exploitation for hypernym discovery

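| Subtask | Corpus size | # of vocabulary | # of train | # of valid | # of test |
|---|---|---|---|---|---|
| 1A (English) | 16G | 218,753 | 1500 | 50 | 1500 |
| 2A (Medical) | 800M | 93,888 | 500 | 15 | 500 |
| 2B (Music) | 500M | 69,118 | 500 | 15 | 500 |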