Semantic Image-Text-Classes

This dataset was introduced in the paper "Understanding, Categorizing and Predicting Semantic Image-Text Relations".

If you use this dataset in your work, please cite:

@inproceedings{otto2019understanding,
  title={Understanding, Categorizing and Predicting Semantic Image-Text Relations},
  author={Otto, Christian and Springstein, Matthias and Anand, Avishek and Ewerth, Ralph},
  booktitle={In Proceedings of ACM International Conference on Multimedia Retrieval (ICMR 2019)},
  year={2019}
}

The training data is split into several part files. To reassemble the full tar archive, run the following command:

cat train.tar.part* > train_concat.tar

Then extract the archive with:

tar -xf train_concat.tar
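
On systems without cat and tar (for example a stock Windows shell), the same two steps can be sketched in Python with the standard library only. This is a minimal sketch and not part of the dataset's own tooling: the function names concat_parts and extract_tar are hypothetical, and it assumes the part files sort into the correct order by name, which is the same assumption the shell glob train.tar.part* relies on.

```python
import glob
import shutil
import tarfile


def concat_parts(pattern, out_path):
    """Concatenate all files matching `pattern`, in sorted name order, into `out_path`.

    Hypothetical helper; assumes the sorted order of the part names is the
    order in which the archive was split.
    """
    parts = sorted(glob.glob(pattern))
    if not parts:
        raise FileNotFoundError(f"no files match {pattern!r}")
    with open(out_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                # Stream in chunks so large parts never have to fit in memory.
                shutil.copyfileobj(src, out)
    return parts


def extract_tar(tar_path, dest="."):
    """Extract every member of the tar archive at `tar_path` into `dest`."""
    with tarfile.open(tar_path) as archive:
        archive.extractall(dest)
```

Calling concat_parts("train.tar.part*", "train_concat.tar") followed by extract_tar("train_concat.tar") mirrors the two shell commands above. As with tar itself, only extract archives from a source you trust.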

The jsonl files contain one metadata record per line, with the following fields:

id, origin, CMI, SC, STAT, ITClass, text, tagged text, image_path
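
A quick way to inspect these records is to parse a jsonl file line by line. The sketch below is illustrative only: the helper names read_metadata and count_by_field are hypothetical, and it assumes the JSON keys are spelled exactly as in the field list above (the exact key for "tagged text" in particular is not confirmed here) and that the counted field holds a simple value such as a string or number.

```python
import json


def read_metadata(jsonl_path):
    """Yield one dict per non-empty line of a JSON Lines file."""
    with open(jsonl_path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def count_by_field(records, field="ITClass"):
    """Count how often each value of `field` occurs.

    Records that lack the key are counted under None. The default key
    "ITClass" is taken from the field list; adjust it if the files differ.
    """
    counts = {}
    for record in records:
        value = record.get(field)
        counts[value] = counts.get(value, 0) + 1
    return counts
```

For example, count_by_field(read_metadata(path)) gives the number of samples per image-text class for the jsonl file at path, which is a simple sanity check after extraction.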

License Information:

This dataset is composed of various open access sources as described in the paper. We thank all the original authors for their work.

Cite this as

Christian Otto, Matthias Springstein, Avishek Anand, Ralph Ewerth (2019). Semantic Image-Text-Classes [Data set]. LUIS. https://doi.org/10.25835/0010577
Retrieved: 04:33 16 May 2026 (UTC)

Additional Information

Field: Value
Author: Christian Otto, Matthias Springstein, Avishek Anand, Ralph Ewerth
Maintainer: Christian Otto
Version: 1.0
Last updated: January 20, 2022, 14:14 (UTC)
Created: April 23, 2019, 10:18 (UTC)
License: Creative Commons Attribution-NonCommercial 3.0
Dataset size: 45.5 GByte