Automatic analysis of social media images to identify disaster type and infer appropriate emergency response

Journal of Big Data

Table 3 The related work on disaster dataset development

Reference	Data source	Process	Domain
[26]	Flickr100K dataset; Wikimedia Commons. Dataset size (3435 flood-related images and 275)	Collected the images; defined 3 tasks for relevant images; assigned domain experts to annotate the images on the defined tasks; validated the quality of the annotation	Flood
[25]	Twitter. Dataset size (16,097 images)	Collected only tweets containing image URLs, including tweets with more than one image; discarded all non-English tweets; discarded single-word and single-hashtag tweets; removed all duplicate tweets; the authors developed annotation guidelines and provided them to the paid crowdsourcing platform CrowdFlower; text and images were manually annotated separately	Earthquakes, Hurricanes, Wildfires, Floods
[9]	Flickr API	The Flickr API method flickr.photos.search is used to collect publicly accessible photos within a time period, to avoid repetition between training and test data; the images are stored with their metadata in a MySQL database; the images' geospatial data is retrieved using the flickr.photos.getExif API method	Fire
[7]	Instagram images embedded in tweets (11,964 Instagram images)	Searched a sample of 142,768 geo-located tweets for embedded Instagram photos; the images were then manually coded to study separate disaster-related motif categories (ad, animals, damage, drink, food, gear, macro, other, outside, people, and relief); an image may fit multiple codes	Hurricane Sandy
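
The tweet-cleaning steps summarised for [25] in Table 3 can be illustrated with a minimal sketch. This is not the code used in [25]. The record fields (`id`, `text`, `lang`, `image_urls`) and the function name `filter_tweets` are hypothetical, and the duplicate rule (case-insensitive, whitespace-normalised text match) is an assumption, since the table does not state how duplicates were detected.

```python
from typing import Dict, List


def filter_tweets(tweets: List[Dict]) -> List[Dict]:
    """Apply the four filtering steps listed for [25] to a list of tweet records.

    Each record is assumed (hypothetically) to be a dict with the keys
    'text', 'lang' and 'image_urls'.
    """
    seen_texts = set()
    kept = []
    for tweet in tweets:
        text = tweet.get("text", "").strip()

        # Step 1: keep only tweets that carry at least one image URL.
        if not tweet.get("image_urls"):
            continue

        # Step 2: discard non-English tweets.
        if tweet.get("lang") != "en":
            continue

        # Step 3: discard single-word and single-hashtag tweets
        # (both reduce to a single whitespace-separated token).
        if len(text.split()) < 2:
            continue

        # Step 4: remove duplicates. Assumption: two tweets are duplicates
        # when their text matches after lowercasing and collapsing spaces.
        key = " ".join(text.lower().split())
        if key in seen_texts:
            continue
        seen_texts.add(key)

        kept.append(tweet)
    return kept
```

The steps are applied in the order given in the table. Retweet handling and near-duplicate image detection are not covered by this sketch.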