dc.contributor.author: Pardo, Alejandro
dc.contributor.author: Heilbron, Fabian Caba
dc.contributor.author: Alcázar, Juan León
dc.contributor.author: Thabet, Ali Kassem
dc.contributor.author: Ghanem, Bernard
dc.date.accessioned: 2021-09-15T06:30:59Z
dc.date.available: 2021-09-15T06:30:59Z
dc.date.issued: 2021-09-12
dc.identifier.uri: http://hdl.handle.net/10754/671220
dc.description.abstract: Understanding movies and their structural patterns is a crucial task to decode the craft of video editing. While previous works have developed tools for general analysis such as detecting characters or recognizing cinematography properties at the shot level, less effort has been devoted to understanding the most basic video edit, the Cut. This paper introduces the cut type recognition task, which requires modeling of multi-modal information. To ignite research in the new task, we construct a large-scale dataset called MovieCuts, which contains more than 170K videoclips labeled among ten cut types. We benchmark a series of audio-visual approaches, including some that deal with the problem's multi-modal and multi-label nature. Our best model achieves 45.7% mAP, which suggests that the task is challenging and that attaining highly accurate cut type recognition is an open research problem.
dc.description.sponsorship: This work was supported by the King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research through the Visual Computing Center (VCC) funding.
dc.publisher: arXiv
dc.relation.url: https://arxiv.org/pdf/2109.05569.pdf
dc.rights: Archived with thanks to arXiv
dc.title: MovieCuts: A New Dataset and Benchmark for Cut Type Recognition
dc.type: Preprint
dc.contributor.department: King Abdullah University of Science and Technology (KAUST)
dc.contributor.department: Visual Computing Center (VCC)
dc.contributor.department: Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division
dc.contributor.department: Electrical and Computer Engineering Program
dc.eprint.version: Pre-print
dc.contributor.institution: Adobe Research
dc.identifier.arxivid: 2109.05569
kaust.person: Pardo, Alejandro
kaust.person: Alcázar, Juan León
kaust.person: Thabet, Ali Kassem
kaust.person: Ghanem, Bernard
refterms.dateFOA: 2021-09-15T06:32:34Z
kaust.acknowledged.supportUnit: Office of Sponsored Research
kaust.acknowledged.supportUnit: Visual Computing Center (VCC)


Files in this item

Name: Preprintfile1.pdf
Size: 13.33Mb
Format: PDF
Description: Pre-print
