ActivityNet: A Large-Scale Video Benchmark for Human Activity Understanding

Handle URI:
http://hdl.handle.net/10754/556141
Title:
ActivityNet: A Large-Scale Video Benchmark for Human Activity Understanding
Authors:
Heilbron, Fabian Caba; Castillo, Victor; Ghanem, Bernard (0000-0002-5534-587X); Niebles, Juan Carlos
Abstract:
In spite of many dataset efforts for human action recognition, current computer vision algorithms are still severely limited in terms of the variability and complexity of the actions that they can recognize. This is in part due to the simplicity of current benchmarks, which mostly focus on simple actions and movements occurring on manually trimmed videos. In this paper we introduce ActivityNet, a new large-scale video benchmark for human activity understanding. Our benchmark aims at covering a wide range of complex human activities that are of interest to people in their daily living. In its current version, ActivityNet provides samples from 203 activity classes with an average of 137 untrimmed videos per class and 1.41 activity instances per video, for a total of 849 video hours. We illustrate three scenarios in which ActivityNet can be used to compare algorithms for human activity understanding: untrimmed video classification, trimmed activity classification and activity detection.
KAUST Department:
Image and Video Understanding Lab
Publisher:
IEEE
Journal:
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
Conference/Event name:
2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Issue Date:
2-Jun-2015
Type:
Conference Paper
Sponsors:
IEEE Computer Society, Computer Vision Foundation - CVF
Additional Links:
https://dl.dropboxusercontent.com/u/18955644/website_files/ActivityNet/ActivityNet_CVPR2015.pdf; https://dl.dropboxusercontent.com/u/18955644/website_files/ActivityNet/ActivityNet_CVPR2015_supp_material.zip
Appears in Collections:
Conference Papers

Full metadata record

DC Field | Value | Language
dc.contributor.author | Heilbron, Fabian Caba | en
dc.contributor.author | Castillo, Victor | en
dc.contributor.author | Ghanem, Bernard | en
dc.contributor.author | Niebles, Juan Carlos | en
dc.date.accessioned | 2015-06-02T13:52:50Z | en
dc.date.available | 2015-06-02T13:52:50Z | en
dc.date.issued | 2015-06-02 | en
dc.identifier.uri | http://hdl.handle.net/10754/556141 | en
dc.description.abstract | In spite of many dataset efforts for human action recognition, current computer vision algorithms are still severely limited in terms of the variability and complexity of the actions that they can recognize. This is in part due to the simplicity of current benchmarks, which mostly focus on simple actions and movements occurring on manually trimmed videos. In this paper we introduce ActivityNet, a new large-scale video benchmark for human activity understanding. Our benchmark aims at covering a wide range of complex human activities that are of interest to people in their daily living. In its current version, ActivityNet provides samples from 203 activity classes with an average of 137 untrimmed videos per class and 1.41 activity instances per video, for a total of 849 video hours. We illustrate three scenarios in which ActivityNet can be used to compare algorithms for human activity understanding: untrimmed video classification, trimmed activity classification and activity detection. | en
dc.description.sponsorship | IEEE Computer Society, Computer Vision Foundation - CVF | en
dc.publisher | IEEE | en
dc.relation.url | https://dl.dropboxusercontent.com/u/18955644/website_files/ActivityNet/ActivityNet_CVPR2015.pdf | en
dc.relation.url | https://dl.dropboxusercontent.com/u/18955644/website_files/ActivityNet/ActivityNet_CVPR2015_supp_material.zip | en
dc.rights | (c) 2015 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works. | en
dc.subject | Action Recognition | en
dc.subject | Large Scale | en
dc.title | ActivityNet: A Large-Scale Video Benchmark for Human Activity Understanding | en
dc.type | Conference Paper | en
dc.contributor.department | Image and Video Understanding Lab | en
dc.identifier.journal | Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition | en
dc.conference.date | 07 Jun - 12 Jun 2015 | en
dc.conference.name | 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) | en
dc.conference.location | Hynes Convention Center 900 Boylston St Boston, MA, USA | en
dc.eprint.version | Post-print | en
dc.contributor.institution | Universidad del Norte, Colombia | en