Conference Paper 2018

Punny Captions: Witty Wordplay in Image Descriptions


Wit is a form of rich interaction that is often grounded in a specific situation (e.g., a comment in response to an event). In this work, we attempt to build computational models that can produce witty descriptions for a given image. Inspired by a cognitive account of humor appreciation, we employ linguistic wordplay, specifically puns, in image descriptions. We develop two approaches which involve retrieving witty descriptions for a given image from a large corpus of sentences, or generating them via an encoder-decoder neural network architecture. We compare our approach against meaningful baseline approaches via human studies and show substantial improvements. We find that when a human is subject to similar constraints as the model regarding word usage and style, people vote the image descriptions generated by our model to be slightly wittier than human-written witty descriptions. Unsurprisingly, humans are almost always wittier than the model when they are free to choose the vocabulary, style, etc.
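The retrieval approach described above can be illustrated with a toy sketch. This is not the authors' code: the homophone list, corpus, and function names are illustrative stand-ins. The idea is to map words associated with an image to pun alternatives (homophones) and keep corpus sentences that contain such a pun word.

```python
# Minimal illustrative sketch (assumed, not the authors' implementation):
# pun-aware retrieval of witty sentences for an image.
HOMOPHONES = {
    "sun": ["son"],
    "bored": ["board"],
    "sell": ["cell"],
}

def pun_candidates(image_words):
    """Map each image-associated word to its pun (homophone) forms."""
    return {w: HOMOPHONES[w] for w in image_words if w in HOMOPHONES}

def retrieve_witty(image_words, corpus):
    """Return corpus sentences that contain a pun on an image word."""
    puns = pun_candidates(image_words)
    alternates = {alt for alts in puns.values() for alt in alts}
    return [s for s in corpus
            if any(a in s.lower().split() for a in alternates)]

corpus = [
    "a proud father with his son at the beach",
    "a man walking a dog in the park",
]
print(retrieve_witty(["sun", "beach"], corpus))
# -> ['a proud father with his son at the beach']
```

In the paper's actual setting the corpus is large and candidates are scored for relevance to the image; this sketch only shows the pun-matching step.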

Author(s): Arjun Chandrasekaran and Devi Parikh and Mohit Bansal
Book Title: Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP)
Year: 2018
Month: June
Bibtex Type: Conference Paper (inproceedings)
Event Name: EMNLP
Electronic Archiving: grant_archive

BibTex

@inproceedings{Chandrasekaran:EMNLP:2018,
  title = {Punny Captions: Witty Wordplay in Image Descriptions},
  booktitle = {Proceedings of the Conference on Empirical Methods in Natural Language Processing ({EMNLP})},
  abstract = {Wit is a form of rich interaction that is often grounded in a specific situation (e.g., a comment in response to an event). In this work, we attempt to build computational models that can produce witty descriptions for a given image. Inspired by a cognitive account of humor appreciation, we employ linguistic wordplay, specifically puns, in image descriptions. We develop two approaches which involve retrieving witty descriptions for a given image from a large corpus of sentences, or generating them via an encoder-decoder neural network architecture. We compare our approach against meaningful baseline approaches via human studies and show substantial improvements. We find that when a human is subject to similar constraints as the model regarding word usage and style, people vote the image descriptions generated by our model to be slightly wittier than human-written witty descriptions. Unsurprisingly, humans are almost always wittier than the model when they are free to choose the vocabulary, style, etc.},
  month = jun,
  year = {2018},
  slug = {punnycap-emnlp-2018},
  author = {Chandrasekaran, Arjun and Parikh, Devi and Bansal, Mohit},
  month_numeric = {6}
}