Table 3 Related work on disaster dataset development

From: Automatic analysis of social media images to identify disaster type and infer appropriate emergency response

| Reference | Data source | Process | Domain |
| --- | --- | --- | --- |
| [26] | Flickr100K dataset; Wikimedia Commons; dataset size (3435 flood-related images and 275) | Collected the images; defined 3 tasks for relevant images; assigned a domain expert to annotate the images on the defined tasks; validated the quality of the annotation | Flood |
| [25] | Twitter; dataset size (16,097 images) | Collected only tweets containing image URLs, including tweets with more than one image; discarded all non-English tweets, as well as single-word and single-hashtag tweets; removed all duplicate tweets; the authors developed annotation guidelines and provided them to the paid crowdsourcing platform CrowdFlower; text and images were manually annotated separately | Earthquakes, Hurricane, Wildfires, Floods |
| [9] | Flickr API | Used the Flickr API method flickr.photos.search to collect publicly accessible photos within a time period, to avoid repetition between training and test data; stored the images with their metadata in a MySQL database; retrieved the images' geospatial data using the flickr.photos.getExif API method | Fire |
| [7] | Instagram images embedded in tweets; 11,964 Instagram images | Searched a sample of 142,768 geo-located tweets for embedded Instagram photos; the images were then manually coded into separate disaster-related motif categories (ad, animals, damage, drink, food, gear, macro, other, outside, people, and relief), and an image may fit multiple codes | Hurricane Sandy |
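
The filtering steps reported in Table 3 for the Twitter dataset [25] (keep tweets with image URLs, discard non-English tweets, discard single-word and single-hashtag tweets, remove duplicates) could be sketched roughly as below. This is a minimal illustration, not the authors' code: the field names `text`, `lang`, and `image_urls`, the choice to ignore URLs when counting words, and the definition of a duplicate as identical text after lowercasing and whitespace normalization are all assumptions made for the example.

```python
import re


def is_single_token(text):
    """Return True if the text, ignoring URLs, has at most one word or hashtag.

    Ignoring URLs is an assumption; the source does not say how words are counted.
    Empty text also counts as "at most one token" and is therefore discarded.
    """
    without_urls = re.sub(r"https?://\S+", "", text)
    return len(without_urls.split()) <= 1


def filter_tweets(tweets):
    """Apply the four filtering steps to a list of tweet dicts.

    Each tweet is a dict with hypothetical keys: "text", "lang", "image_urls".
    """
    seen_texts = set()
    kept = []
    for tweet in tweets:
        # Step 1: keep only tweets that carry at least one image URL.
        if not tweet.get("image_urls"):
            continue
        # Step 2: discard non-English tweets.
        if tweet.get("lang") != "en":
            continue
        # Step 3: discard single-word and single-hashtag tweets.
        if is_single_token(tweet["text"]):
            continue
        # Step 4: drop duplicates (assumed: same text after normalization).
        key = " ".join(tweet["text"].lower().split())
        if key in seen_texts:
            continue
        seen_texts.add(key)
        kept.append(tweet)
    return kept
```

Tweets with several image URLs pass through unchanged, which matches the note that tweets with more than one image were also collected. A production pipeline would likely need a more robust duplicate test (for example, handling retweets), which the source does not describe.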