MEDIC Dataset

Data


MEDIC is the largest multi-task learning dataset for disaster image classification. It is an extended version of the crisis image benchmark dataset and consists of data from several sources, including CrisisMMD, AIDR, and the Damage Multimodal Dataset (DMD). The dataset contains 71,198 images.

Table of Contents:

 

Data format and directories


Directories


Format of the TSV file


Disaster response tasks


  1. Disaster types
    • Earthquake
    • Fire
    • Flood
    • Hurricane
    • Landslide
    • Not disaster
    • Other disaster
  2. Informativeness
    • Informative
    • Not informative
  3. Humanitarian categories
    • Affected, injured, or dead people
    • Infrastructure and utility damage
    • Not humanitarian
    • Rescue volunteering or donation effort
  4. Damage severity assessment
    • Little or no damage
    • Mild damage
    • Severe damage
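
The four tasks and their class labels listed above can be written down as a small lookup structure, which is a common first step when training one model on several tasks at once. The following is a minimal sketch, not part of the dataset release: the label strings are copied from the list above, while the task keys (`disaster_types`, `informativeness`, `humanitarian`, `damage_severity`) and the helper `encode_labels` are hypothetical names chosen for illustration and may not match the column names in the TSV files.

```python
# Class labels copied from the task list above.
# The task keys are hypothetical names, not necessarily the TSV column names.
TASK_LABELS = {
    "disaster_types": [
        "Earthquake", "Fire", "Flood", "Hurricane", "Landslide",
        "Not disaster", "Other disaster",
    ],
    "informativeness": ["Informative", "Not informative"],
    "humanitarian": [
        "Affected, injured, or dead people",
        "Infrastructure and utility damage",
        "Not humanitarian",
        "Rescue volunteering or donation effort",
    ],
    "damage_severity": ["Little or no damage", "Mild damage", "Severe damage"],
}

# Per-task lookup from label text to an integer class index.
LABEL_TO_INDEX = {
    task: {label: index for index, label in enumerate(labels)}
    for task, labels in TASK_LABELS.items()
}


def encode_labels(annotation):
    """Map one image's per-task label strings to integer class indices."""
    encoded = {}
    for task, label in annotation.items():
        if task not in LABEL_TO_INDEX:
            raise KeyError(f"unknown task: {task!r}")
        if label not in LABEL_TO_INDEX[task]:
            raise ValueError(f"unknown label {label!r} for task {task!r}")
        encoded[task] = LABEL_TO_INDEX[task][label]
    return encoded


if __name__ == "__main__":
    # A made-up annotation for one image, used only to exercise the helper.
    example = {
        "disaster_types": "Flood",
        "informativeness": "Informative",
        "humanitarian": "Infrastructure and utility damage",
        "damage_severity": "Severe damage",
    }
    print(encode_labels(example))
```

Keeping one index space per task, rather than one shared space, lets each task have its own output head with its own number of classes (7, 2, 4, and 3 respectively).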


Downloads:


Please cite the following papers if you use this dataset in your research.

  1. Firoj Alam, Tanvirul Alam, Md. Arid Hasan, Abul Hasnat, Muhammad Imran, Ferda Ofli, MEDIC: A Multi-Task Learning Dataset for Disaster Image Classification, 2021.
  2. Firoj Alam, Ferda Ofli, Muhammad Imran, Tanvirul Alam, Umair Qazi, Deep Learning Benchmarks and Datasets for Social Media Image Classification for Disaster Response, In 2020 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), 2020.
  3. Firoj Alam, Ferda Ofli, and Muhammad Imran, CrisisMMD: Multimodal Twitter Datasets from Natural Disasters. In Proceedings of the 12th International AAAI Conference on Web and Social Media (ICWSM), 2018, Stanford, California, USA.
  4. Hussein Mozannar, Yara Rizk, and Mariette Awad, Damage Identification in Social Media Posts using Multimodal Deep Learning, In Proc. of ISCRAM, May 2018, pp. 529–543.
  5. Dat Tien Nguyen, Ferda Ofli, Muhammad Imran, and Prasenjit Mitra, Damage assessment from social media imagery data during disasters, In Proc. of ASONAM, pages 1–8, Aug 2017.

License


The MEDIC dataset is published under the CC BY-NC-SA 4.0 license, which means everyone can use this dataset for non-commercial research purposes: https://creativecommons.org/licenses/by-nc/4.0/.

Terms of Use

Please see Terms of Use