Type
Conference Paper
KAUST Department
Electrical Engineering Program
Electrical Engineering
Physical Science and Engineering (PSE) Division
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
Visual Computing Center (VCC)
King Abdullah University of Science and Technology (KAUST), Saudi Arabia
KAUST Grant Number
OSR-CRG2017-3405
Date
2020
Permanent link to this record
http://hdl.handle.net/10754/660742
Abstract
Temporal action detection is a fundamental yet challenging task in video understanding. Video context is a critical cue to effectively detect actions, but current works mainly focus on temporal context, while neglecting semantic context as well as other important context properties. In this work, we propose a graph convolutional network (GCN) model to adaptively incorporate multi-level semantic context into video features and cast temporal action detection as a sub-graph localization problem. Specifically, we formulate video snippets as graph nodes, snippet-snippet correlations as edges, and actions associated with context as target sub-graphs. With graph convolution as the basic operation, we design a GCN block called GCNeXt, which learns the features of each node by aggregating its context and dynamically updates the edges in the graph. To localize each sub-graph, we also design an SGAlign layer to embed each sub-graph into the Euclidean space. Extensive experiments show that G-TAD is capable of finding effective video context without extra supervision and achieves state-of-the-art performance on two detection benchmarks. On ActivityNet-1.3 it obtains an average mAP of 34.09%; on THUMOS14 it reaches 51.6% at IoU@0.5 when combined with a proposal processing method. The code has been made available at https://github.com/frostinassiky/gtad.
Citation
Xu, M., Zhao, C., Rojas, D. S., Thabet, A., & Ghanem, B. (2020). G-TAD: Sub-Graph Localization for Temporal Action Detection. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). doi:10.1109/cvpr42600.2020.01017
Sponsors
This work was supported by the King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research (OSR) under Award No. OSR-CRG2017-3405.
Conference/Event name
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
ISBN
978-1-7281-7169-2
arXiv
1911.11462
Additional Links
https://ieeexplore.ieee.org/document/9157091/
https://openaccess.thecvf.com/content_CVPR_2020/html/Xu_G-TAD_Sub-Graph_Localization_for_Temporal_Action_Detection_CVPR_2020_paper.html
https://arxiv.org/abs/1911.11462
Relations
Is Supplemented By:
- [Software] Title: frostinassiky/gtad: The official implementation of G-TAD: Sub-Graph Localization for Temporal Action Detection. Publication Date: 2019-11-28. GitHub: frostinassiky/gtad. Handle: 10754/668117
DOI
10.1109/CVPR42600.2020.01017