Table 2 Datasets used in the evaluation

From: SemLinker: automating big data integration for casual users

| Source | Type | Format | #Attr. | #Size | #Evol. |
|---|---|---|---|---|---|
| HAR-1 [66] | Scien. | CSV | 10 | 3,540,962 | 0 |
| HAR-2 [66] | Scien. | CSV | 10 | 3,205,431 | 0 |
| HAR-3 [57] | Scien. | CSV | 4 | 200,471 | 0 |
| Facebook [58] | Social | JSON | 17 | 19,770 | 4 |
| Twitter [59] | Social | JSON | 19 | 169,000 | 2 |
| Foursquare [60] | Social | JSON | 17 | 15,712 | 2 |
| Flickr [61] | Social | XML | 10 | 20,000 | 3 |
| TripAdvisor [62] | Social | Spreads | 13 | 19,998 | 4 |
| Tourpedia [63] | Social | JSON | 7 | 115,732 | 3 |
| EnglandPubs [64] | Public | CSV | 9 | 51,566 | 4 |
| OpenPostCode [65] | Public | CSV | 7 | 2,525,575 | 1 |

#Attr.: number of attributes; #Evol.: number of schema evolutions