Workshop On Vision And Language 2014 (VL'14), Dublin, 23rd August 2014

The 3rd Annual Meeting Of The EPSRC Network On Vision & Language and The 1st Technical Meeting of the European Network on Integrating Vision and Language

A Workshop of the 25th International Conference on Computational Linguistics (COLING 2014)

Invited speaker: Alex Jaimes, Yahoo Inc.

Accepted papers have now been posted!

Registration is now open!

Call for Participation and Poster Abstracts

Fragments of natural language, in the form of tags, captions, subtitles, surrounding text or audio, can aid the interpretation of image and video data by adding context or disambiguating visual appearance. In addition, labelled images are essential for training object or activity classifiers. On the other hand, visual data can help resolve challenges in language processing such as word sense disambiguation. Studying language and vision together can also provide new insight into cognition and universal representations of knowledge and meaning. Meanwhile, sign language and gestures are languages that require visual interpretation.

We welcome papers describing original research combining language and vision. To encourage the sharing of novel and emerging ideas we also welcome papers describing new data-sets, grand challenges, open problems, benchmarks and work in progress as well as survey papers.

Topics of interest include (but are not limited to):

- Image and video labelling and annotation
- Image and video description
- Computational modelling of human vision and language
- Image and video retrieval
- Multimodal human-computer communication
- Automatic text illustration
- Language-driven animation
- Facial animation from speech
- Assistive methodologies
- Text-to-image generation


Poster Abstract Submission

In addition to the long papers to be presented at the VL’14 Workshop, we now invite abstracts for posters to be presented at the VL’14 Poster Session. Abstracts will also be included in the proceedings.

We invite 2-page abstracts describing original research combining language and vision. To encourage the sharing of novel and emerging ideas we also welcome abstracts describing new data-sets, grand challenges, open problems, benchmarks and work in progress. Submissions should adhere to the COLING 2014 format, using the COLING 2014 style files, and should be in PDF format.

Please make your submission via the START workshop submission pages.

Important Dates

Poster Abstracts Deadline: 10th July 2014
Camera-Ready Full Paper Deadline (for existing accepted papers): 11th July 2014
Poster Abstracts Author Notifications: 12th July 2014
Camera-Ready Poster Abstracts Deadline: 16th July 2014
Date of Workshop: 23rd August 2014


Organisers

Anja Belz, University of Brighton
Darren Cosker, University of Bath
Frank Keller, University of Edinburgh
William Smith, University of York
Kalina Bontcheva, University of Sheffield
Sien Moens, University of Leuven
Alan Smeaton, Dublin City University

Programme Committee

Yannis Aloimonos, University of Maryland, US
Dimitrios Makris, Kingston University, UK
Desmond Elliott, University of Edinburgh, UK
Tamara Berg, Stony Brook, US
Claire Gardent, CNRS/LORIA, France
Lewis Griffin, UCL, UK
Brian Mac Namee, Dublin Institute of Technology, Ireland
Margaret Mitchell, University of Aberdeen, UK
Ray Mooney, University of Texas at Austin, US
Chris Town, University of Cambridge, UK
David Windridge, University of Surrey, UK
Lucia Specia, University of Sheffield, UK
John Kelleher, Dublin Institute of Technology, Ireland
Sergio Escalera, Autonomous University of Barcelona, Spain
Erkut Erdem, Hacettepe University, Turkey
Isabel Trancoso, INESC-ID, Portugal
Julia Hockenmaier, University of Illinois, US
Jordi Gonzales, Universitat Autònoma de Barcelona, Spain



The EPSRC Network on Vision and Language (V&L Net) is a forum for researchers from the fields of Computer Vision and Language Processing to meet, exchange ideas, expertise and technology, and form new partnerships. Our aim is to create a lasting interdisciplinary research community situated at the language-vision interface, jointly working towards solutions for some of today's toughest computational challenges, including image and video search, description of visual content and text-to-image generation.


The explosive growth of visual and textual data (both on the World Wide Web and held in private repositories by diverse institutions and companies) has led to urgent requirements in terms of search, processing and management of digital content. Solutions for providing access to or mining such data depend on the semantic gap between vision and language being bridged, which in turn calls for expertise from two so far unconnected fields: Computer Vision (CV) and Natural Language Processing (NLP). The central goal of iV&L Net is to build a European CV/NLP research community, targeting four focus themes: (i) Integrated Modelling of Vision and Language for CV and NLP Tasks; (ii) Applications of Integrated Models; (iii) Automatic Generation of Image & Video Descriptions; and (iv) Semantic Image & Video Search. iV&L Net will organise annual conferences, technical meetings, partner visits, data/task benchmarking, and industry/end-user liaison. Europe has many of the world’s leading CV and NLP researchers. By tapping into this expertise, and bringing to bear the collaboration, networking and community building enabled by COST Actions, iV&L Net will have substantial impact in terms of advances in both theory/methodology and real-world technologies.