KAUST Department: Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
Electrical Engineering Program
Visual Computing Center (VCC)
Online Publication Date: 2017-11-09
Print Publication Date: 2017-07
Permanent link to this record: http://hdl.handle.net/10754/626797
Abstract: Our paper presents a new approach for temporal detection of human actions in long, untrimmed video sequences. We introduce Single-Stream Temporal Action Proposals (SST), a new effective and efficient deep architecture for the generation of temporal action proposals. Our network can run continuously in a single stream over very long input video sequences, without the need to divide input into short overlapping clips or temporal windows for batch processing. We demonstrate empirically that our model outperforms the state-of-the-art on the task of temporal action proposal generation, while achieving some of the fastest processing speeds in the literature. Finally, we demonstrate that using SST proposals in conjunction with existing action classifiers results in improved state-of-the-art temporal action detection performance.
Citation: Buch S, Escorcia V, Shen C, Ghanem B, Niebles JC (2017) SST: Single-Stream Temporal Action Proposals. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Available: http://dx.doi.org/10.1109/cvpr.2017.675.
Sponsors: This research was sponsored, in part, by the Stanford AI Lab-Toyota Center for Artificial Intelligence Research, Toyota Research Institute (TRI), and by the King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research. This article reflects the opinions and conclusions of its authors and not TRI or any other Toyota entity. We thank our anonymous reviewers, De-An Huang, Oliver Groth, Fabian Caba, Joseph Lim, Jingwei Ji, and Fei-Fei Li for helpful comments and discussion.