KAUST Department: Electrical Engineering Program
Physical Science and Engineering (PSE) Division
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
Visual Computing Center (VCC)
King Abdullah University of Science and Technology (KAUST), Saudi Arabia
KAUST Grant Number: OSR-CRG2017-3405
Permanent link to this record: http://hdl.handle.net/10754/660742
Abstract: Temporal action detection is a fundamental yet challenging task in video understanding. Video context is a critical cue to effectively detect actions, but current works mainly focus on temporal context, while neglecting semantic context as well as other important context properties. In this work, we propose a graph convolutional network (GCN) model to adaptively incorporate multi-level semantic context into video features and cast temporal action detection as a sub-graph localization problem. Specifically, we formulate video snippets as graph nodes, snippet-snippet correlations as edges, and actions associated with context as target sub-graphs. With graph convolution as the basic operation, we design a GCN block called GCNeXt, which learns the features of each node by aggregating its context and dynamically updates the edges in the graph. To localize each sub-graph, we also design an SGAlign layer to embed each sub-graph into the Euclidean space. Extensive experiments show that G-TAD is capable of finding effective video context without extra supervision and achieves state-of-the-art performance on two detection benchmarks. On ActivityNet-1.3 it obtains an average mAP of 34.09%; on THUMOS14 it reaches 51.6% at IoU@0.5 when combined with a proposal processing method. The code has been made available at https://github.com/frostinassiky/gtad.
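As a rough illustration of the graph formulation described in the abstract, the sketch below treats each snippet feature as a node, links neighboring snippets in time, links snippets that are close in feature space, and averages each node with its neighbors. It is a toy sketch and not the released G-TAD code. The function names, the k-nearest-neighbor rule for semantic edges, and the plain mean aggregation (no learned weights) are all assumptions made for illustration; the abstract does not specify them.

```python
import math


def temporal_edges(n):
    """Link each snippet to its immediate predecessor and successor in time."""
    return [[j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)]


def knn_edges(features, k):
    """Assumed semantic edges: the k nearest other nodes in feature space."""
    edges = []
    for i, fi in enumerate(features):
        dists = sorted(
            (math.dist(fi, fj), j) for j, fj in enumerate(features) if j != i
        )
        edges.append([j for _, j in dists[:k]])
    return edges


def aggregate(features, edges):
    """Replace each node feature with the mean of itself and its neighbors.

    A real graph convolution would apply learned weights; this is only
    the aggregation step, kept weight-free for clarity.
    """
    out = []
    for i, nbrs in enumerate(edges):
        group = [features[i]] + [features[j] for j in nbrs]
        dim = len(features[i])
        out.append([sum(v[d] for v in group) / len(group) for d in range(dim)])
    return out


# Toy input: five 2-D snippet features forming two loose clusters.
snippets = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [0.0, 0.2], [5.1, 5.0]]

temporal = temporal_edges(len(snippets))
semantic = knn_edges(snippets, k=1)

# One round of aggregation over the semantic edges, then rebuild the
# semantic edges from the updated features (the "dynamic" edge update).
updated = aggregate(snippets, semantic)
semantic_after = knn_edges(updated, k=1)
```

Note that snippets 2 and 4 are not adjacent in time but are joined by a semantic edge, which is the kind of non-temporal context the abstract refers to.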
Citation: Xu, M., Zhao, C., Rojas, D. S., Thabet, A., & Ghanem, B. (2020). G-TAD: Sub-Graph Localization for Temporal Action Detection. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). doi:10.1109/cvpr42600.2020.01017
Sponsors: This work was supported by the King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research (OSR) under Award No. OSR-CRG2017-3405.
Conference/Event name: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)