
dc.contributor.author: Achlioptas, Panos
dc.contributor.author: Ovsjanikov, Maks
dc.contributor.author: Haydarov, Kilichbek
dc.contributor.author: Elhoseiny, Mohamed
dc.contributor.author: Guibas, Leonidas
dc.date.accessioned: 2021-04-15T08:46:35Z
dc.date.available: 2021-04-15T08:46:35Z
dc.date.issued: 2021-01-19
dc.identifier.uri: http://hdl.handle.net/10754/668789
dc.description.abstract: We present a novel large-scale dataset and accompanying machine learning models aimed at providing a detailed understanding of the interplay between visual content, its emotional effect, and explanations for the latter in language. In contrast to most existing annotation datasets in computer vision, we focus on the affective experience triggered by visual artworks and ask the annotators to indicate the dominant emotion they feel for a given image and, crucially, to also provide a grounded verbal explanation for their emotion choice. As we demonstrate below, this leads to a rich set of signals for both the objective content and the affective impact of an image, creating associations with abstract concepts (e.g., "freedom" or "love"), or references that go beyond what is directly visible, including visual similes and metaphors, or subjective references to personal experiences. We focus on visual art (e.g., paintings, artistic photographs) as it is a prime example of imagery created to elicit emotional responses from its viewers. Our dataset, termed ArtEmis, contains 439K emotion attributions and explanations from humans, on 81K artworks from WikiArt. Building on this data, we train and demonstrate a series of captioning systems capable of expressing and explaining emotions from visual stimuli. Remarkably, the captions produced by these systems often succeed in reflecting the semantic and abstract content of the image, going well beyond systems trained on existing datasets. The collected dataset and developed methods are available at https://artemisdataset.org.
dc.description.sponsorship: This work is funded by a Vannevar Bush Faculty Fellowship, a KAUST BAS/1/1685-01-01, a CRG-2017-3426, the ERC Starting Grant No. 758800 (EXPROTEA) and the ANR AI Chair AIGRETTE, and gifts from the Adobe, Amazon AWS, Autodesk, and Snap corporations. The authors wish to thank Fei Xia and Jan Dombrowski for their help with the AMT instruction design and Nikos Gkanatsios for several fruitful discussions. The authors also want to emphasize their gratitude to all the hard working Amazon Mechanical Turkers without whom this work would not be possible.
dc.publisher: arXiv
dc.relation.url: https://arxiv.org/pdf/2101.07396.pdf
dc.rights: Archived with thanks to arXiv
dc.title: ArtEmis: Affective Language for Visual Art
dc.type: Preprint
dc.contributor.department: Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
dc.eprint.version: Pre-print
dc.contributor.institution: Stanford University.
dc.contributor.institution: LIX, Ecole Polytechnique, IP Paris.
dc.identifier.arxivid: 2101.07396
kaust.person: Haydarov, Kilichbek
kaust.person: Elhoseiny, Mohamed
kaust.grant.number: CRG-2017-3426
kaust.grant.number: BAS/1/1685-01-01
refterms.dateFOA: 2021-04-15T08:47:28Z
kaust.acknowledged.supportUnit: BAS
kaust.acknowledged.supportUnit: CRG


Files in this item

Name: Preprintfile1.pdf
Size: 9.742Mb
Format: PDF
Description: Pre-print

