Multi30K: multilingual English-German image descriptions

File: 1605.00459v1.pdf
Description: Working paper
Size: 291.9 kB
Format: Adobe PDF
Title: Multi30K: multilingual English-German image descriptions
Authors: Elliott, D
Frank, S
Sima'an, K
Specia, L
Item Type: Working Paper
Abstract: We introduce the Multi30K dataset to stimulate multilingual multimodal research. Recent advances in image description have been demonstrated on English-language datasets almost exclusively, but image description should not be limited to English. This dataset extends the Flickr30K dataset with i) German translations created by professional translators over a subset of the English descriptions, and ii) descriptions crowdsourced independently of the original English descriptions. We outline how the data can be used for multilingual image description and multimodal machine translation, but we anticipate the data will be useful for a broader range of tasks.
Issue Date: 2-May-2016
Publisher: arXiv
Copyright Statement: © 2016 The Author(s).
Keywords: cs.CL
Publication Status: Published
Appears in Collections: Faculty of Engineering

