dc.contributor.author: Buch, Shyamal
dc.contributor.author: Escorcia, Victor
dc.contributor.author: Ghanem, Bernard
dc.contributor.author: Fei-Fei, Li
dc.contributor.author: Niebles, Juan Carlos
dc.date.accessioned: 2020-06-09T13:34:00Z
dc.date.available: 2020-06-09T13:34:00Z
dc.date.issued: 2019-05-01
dc.identifier.citation: Buch, S., Escorcia, V., Ghanem, B., Fei-Fei, L., & Niebles, J. C. (2017). End-to-End, Single-Stream Temporal Action Detection in Untrimmed Videos. Proceedings of the British Machine Vision Conference 2017. doi:10.5244/c.31.93
dc.identifier.isbn: 190172560X
dc.identifier.isbn: 9781901725605
dc.identifier.doi: 10.5244/c.31.93
dc.identifier.uri: http://hdl.handle.net/10754/663479
dc.description.abstract: In this work, we present a new intuitive, end-to-end approach for temporal action detection in untrimmed videos. We introduce our new architecture for Single-Stream Temporal Action Detection (SS-TAD), which effectively integrates joint action detection with its semantic sub-tasks in a single unifying end-to-end framework. We develop a method for training our deep recurrent architecture based on enforcing semantic constraints on intermediate modules that are gradually relaxed as learning progresses. We find that such a dynamic learning scheme enables SS-TAD to achieve higher overall detection performance, with fewer training epochs. By design, our single-pass network is very efficient and can operate at 701 frames per second, while simultaneously outperforming the state-of-the-art methods for temporal action detection on THUMOS’14.
dc.publisher: British Machine Vision Association and Society for Pattern Recognition
dc.relation.url: http://www.bmva.org/bmvc/2017/papers/paper093/index.html
dc.rights: Archived with thanks to British Machine Vision Association
dc.title: End-to-end, single-stream temporal action detection in untrimmed videos
dc.type: Conference Paper
dc.contributor.department: Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
dc.contributor.department: Electrical Engineering Program
dc.contributor.department: VCC Analytics Research Group
dc.conference.date: 2017-09-04 to 2017-09-07
dc.conference.name: 28th British Machine Vision Conference, BMVC 2017
dc.conference.location: London, GBR
dc.eprint.version: Post-print
dc.contributor.institution: Stanford Vision and Learning Lab., Dept. of Computer Science, Stanford University, United States
kaust.person: Escorcia, Victor
kaust.person: Ghanem, Bernard
dc.identifier.eid: 2-s2.0-85084013937

