Table 1 Short description of the datasets

From: The performance of BERT as data representation of text clustering

| Dataset | Description | Number of classes | Total number of data |
| --- | --- | --- | --- |
| AG News | Contains news titles and content from AG News media, categorized by news topic | 4 | 4000 |
| Yahoo! Answers | Contains questions asked on Yahoo! Answers along with their answers, categorized by the topic of the question | 10 | 10000 |
| Reuters | Contains documents extracted from Reuters-21578, a collection of news documents from the Reuters mass media in 1987 | 2 | 5859 |