Robust visual tracking via structured multi-task sparse learning

Handle URI:
http://hdl.handle.net/10754/562405
Title:
Robust visual tracking via structured multi-task sparse learning
Authors:
Zhang, Tianzhu; Ghanem, Bernard ( 0000-0002-5534-587X ) ; Liu, Si; Ahuja, Narendra
Abstract:
In this paper, we formulate object tracking in a particle filter framework as a structured multi-task sparse learning problem, which we denote as Structured Multi-Task Tracking (S-MTT). Since we model particles as linear combinations of dictionary templates that are updated dynamically, learning the representation of each particle is considered a single task in Multi-Task Tracking (MTT). By employing popular sparsity-inducing ℓp,q mixed norms (specifically p ∈ {2, ∞} and q = 1), we regularize the representation problem to enforce joint sparsity and learn the particle representations together. Compared to previous methods that handle particles independently, our results demonstrate that mining the interdependencies between particles improves tracking performance and reduces overall computational complexity. Interestingly, we show that the popular L1 tracker (Mei and Ling, IEEE Trans Pattern Anal Mach Intell 33(11):2259-2272, 2011) is a special case of our MTT formulation (denoted as the L11 tracker) when p = q = 1. Under the MTT framework, some of the tasks (particle representations) are often more closely related and more likely to share common relevant covariates than other tasks. Therefore, we extend the MTT framework to take into account pairwise structural correlations between particles (e.g., spatial smoothness of representation) and denote the novel framework as S-MTT. The problem of learning the regularized sparse representation in MTT and S-MTT can be solved efficiently using an Accelerated Proximal Gradient (APG) method that yields a sequence of closed-form updates. As such, S-MTT and MTT are computationally attractive. We test our proposed approach on challenging sequences involving heavy occlusion, drastic illumination changes, and large pose variations. Experimental results show that S-MTT is much better than MTT, and both methods consistently outperform state-of-the-art trackers. © 2012 Springer Science+Business Media New York.
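The joint-sparsity formulation described in the abstract (for the ℓ2,1 case, i.e. p = 2, q = 1) can be sketched as an APG loop whose proximal step has a closed form: row-wise shrinkage that zeroes entire rows of the coefficient matrix, so all particle tasks share the same active dictionary templates. The sketch below is illustrative only, not the authors' implementation; the names `prox_l21` and `mtt_l21_apg`, the problem dimensions, and the regularization weight `lam` are assumptions for the demo.

```python
import numpy as np

def prox_l21(V, tau):
    """Row-wise soft-thresholding: proximal operator of tau * ||X||_{2,1}."""
    norms = np.linalg.norm(V, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))
    return scale * V  # rows with small norm are set exactly to zero

def mtt_l21_apg(D, Y, lam=0.1, n_iter=200):
    """Solve min_X 0.5*||D X - Y||_F^2 + lam*||X||_{2,1} via APG.

    D: (d, m) dictionary of templates; Y: (d, n) particle observations;
    X: (m, n) joint representation, one column per particle (task).
    """
    m, n = D.shape[1], Y.shape[1]
    L = np.linalg.norm(D, 2) ** 2              # Lipschitz constant of the gradient
    X = np.zeros((m, n))
    Z = X.copy()
    t = 1.0
    for _ in range(n_iter):
        G = D.T @ (D @ Z - Y)                  # gradient of the smooth term at Z
        X_new = prox_l21(Z - G / L, lam / L)   # closed-form proximal update
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        Z = X_new + ((t - 1.0) / t_new) * (X_new - X)  # Nesterov extrapolation
        X, t = X_new, t_new
    return X
```

Because the ℓ2,1 penalty couples all columns of X through shared row norms, the particles are represented jointly rather than independently, which is the interdependency-mining the abstract refers to.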
KAUST Department:
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division; Electrical Engineering Program; Visual Computing Center (VCC); VCC Analytics Research Group
Publisher:
Springer Nature
Journal:
International Journal of Computer Vision
Issue Date:
9-Nov-2012
DOI:
10.1007/s11263-012-0582-z
Type:
Article
ISSN:
0920-5691
Sponsors:
This study is supported by the research grant for the Human Sixth Sense Programme at the Advanced Digital Sciences Center from Singapore's Agency for Science, Technology and Research (A*STAR).
Appears in Collections:
Articles; Electrical Engineering Program; Visual Computing Center (VCC); Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division

Full metadata record

DC Field | Value | Language
dc.contributor.author | Zhang, Tianzhu | en
dc.contributor.author | Ghanem, Bernard | en
dc.contributor.author | Liu, Si | en
dc.contributor.author | Ahuja, Narendra | en
dc.date.accessioned | 2015-08-03T10:37:01Z | en
dc.date.available | 2015-08-03T10:37:01Z | en
dc.date.issued | 2012-11-09 | en
dc.identifier.issn | 0920-5691 | en
dc.identifier.doi | 10.1007/s11263-012-0582-z | en
dc.identifier.uri | http://hdl.handle.net/10754/562405 | en
dc.description.abstract | In this paper, we formulate object tracking in a particle filter framework as a structured multi-task sparse learning problem, which we denote as Structured Multi-Task Tracking (S-MTT). Since we model particles as linear combinations of dictionary templates that are updated dynamically, learning the representation of each particle is considered a single task in Multi-Task Tracking (MTT). By employing popular sparsity-inducing ℓp,q mixed norms (specifically p ∈ {2, ∞} and q = 1), we regularize the representation problem to enforce joint sparsity and learn the particle representations together. Compared to previous methods that handle particles independently, our results demonstrate that mining the interdependencies between particles improves tracking performance and reduces overall computational complexity. Interestingly, we show that the popular L1 tracker (Mei and Ling, IEEE Trans Pattern Anal Mach Intell 33(11):2259-2272, 2011) is a special case of our MTT formulation (denoted as the L11 tracker) when p = q = 1. Under the MTT framework, some of the tasks (particle representations) are often more closely related and more likely to share common relevant covariates than other tasks. Therefore, we extend the MTT framework to take into account pairwise structural correlations between particles (e.g., spatial smoothness of representation) and denote the novel framework as S-MTT. The problem of learning the regularized sparse representation in MTT and S-MTT can be solved efficiently using an Accelerated Proximal Gradient (APG) method that yields a sequence of closed-form updates. As such, S-MTT and MTT are computationally attractive. We test our proposed approach on challenging sequences involving heavy occlusion, drastic illumination changes, and large pose variations. Experimental results show that S-MTT is much better than MTT, and both methods consistently outperform state-of-the-art trackers. © 2012 Springer Science+Business Media New York. | en
dc.description.sponsorship | This study is supported by the research grant for the Human Sixth Sense Programme at the Advanced Digital Sciences Center from Singapore's Agency for Science, Technology and Research (A*STAR). | en
dc.publisher | Springer Nature | en
dc.subject | Graph | en
dc.subject | Multi-task learning | en
dc.subject | Particle filter | en
dc.subject | Sparse representation | en
dc.subject | Structure | en
dc.subject | Visual tracking | en
dc.title | Robust visual tracking via structured multi-task sparse learning | en
dc.type | Article | en
dc.contributor.department | Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division | en
dc.contributor.department | Electrical Engineering Program | en
dc.contributor.department | Visual Computing Center (VCC) | en
dc.contributor.department | VCC Analytics Research Group | en
dc.identifier.journal | International Journal of Computer Vision | en
dc.contributor.institution | Advanced Digital Sciences Center (ADSC), 08-10 Connexis North Tower, 1 Fusionopolis Way, Singapore 138632, Singapore | en
dc.contributor.institution | Department of Electrical and Computer Engineering, National University of Singapore, 4 Engineering Drive 3, Singapore 117576, Singapore | en
dc.contributor.institution | Department of Electrical and Computer Engineering, Beckman Institute, University of Illinois at Urbana-Champaign, 405 N. Mathews Ave., Urbana, IL 61801, United States | en
kaust.author | Ghanem, Bernard | en
All Items in KAUST are protected by copyright, with all rights reserved, unless otherwise indicated.