Journal of Big Data

Table 2 Dataset statistics for each subtask

From: Hypert: hypernymy-aware BERT with Hearst pattern exploitation for hypernym discovery

Subtask	Corpus size	# of Vocabulary	# of Train	# of Valid	# of Test
1A (English)	16G	218,753	1500	50	1500
2A (Medical)	800M	93,888	500	15	500
2B (Music)	500M	69,118	500	15	500
