The role of image representations in vision to language tasks

File: paper.pdf
Description: Accepted version
Size: 3.06 MB
Format: Adobe PDF
Title: The role of image representations in vision to language tasks
Authors: Madhyastha, P
Wang, J
Specia, L
Item Type: Journal Article
Abstract: Tasks that require modeling of both language and visual information, such as image captioning, have become very popular in recent years. Most state-of-the-art approaches make use of image representations obtained from a deep neural network, which are used to generate language information in a variety of ways with end-to-end neural-network-based models. However, it is not clear how different image representations contribute to language generation tasks. In this paper, we probe the representational contribution of the image features in an end-to-end neural modeling framework and study the properties of different types of image representations. We focus on two popular vision to language problems: the task of image captioning and the task of multimodal machine translation. Our analysis provides interesting insights into the representational properties and suggests that end-to-end approaches implicitly learn a visual-semantic subspace and exploit the subspace to generate captions.
Issue Date: 1-May-2018
Date of Acceptance: 18-Feb-2018
URI: http://hdl.handle.net/10044/1/63807
DOI: https://dx.doi.org/10.1017/S1351324918000116
ISSN: 1351-3249
Publisher: Cambridge University Press (CUP)
Start Page: 415
End Page: 439
Journal / Book Title: Natural Language Engineering
Volume: 24
Issue: 3
Copyright Statement: © 2018 Cambridge University Press. This paper has been accepted for publication and will appear in a revised form, subsequent to peer-review and/or editorial input by Cambridge University Press.
Keywords: Science & Technology
Social Sciences
Technology
Computer Science, Artificial Intelligence
Linguistics
Language & Linguistics
Computer Science
MODELS
0801 Artificial Intelligence And Image Processing
1702 Cognitive Science
2004 Linguistics
Artificial Intelligence & Image Processing
Publication Status: Published
Online Publication Date: 2018-03-21
Appears in Collections: Computing
Faculty of Engineering