Table 1 Statistics of the CNN and DUC datasets

From: EXABSUM: a new text summarization approach for generating extractive and abstractive summaries

| Dataset | Number of clusters | Domain of documents | Number of documents | Number of sentences | Number of test documents | Avg. length (model summary) | Task |
|---|---|---|---|---|---|---|---|
| DUC01 | 30 | Multi-Domain | 309 | 269,990 | 309 | 100 | Single and Multi |
| DUC02 | 59 | Multi-Domain | 567 | 348,012 | 567 | 100 | Single and Multi |
| CNN | 0 | Multi-Domain | 3000 | 2,628,336 | 2000 | 90 | Single |