    Action Search: Spotting Actions in Videos and Its Application to Temporal Action Localization

    Files
    • action-search.pdf (PDF, 2.296 MB): Accepted Manuscript
    • action-search-supplementary.pdf (PDF, 482.7 KB): Supplemental files
    Type: Conference Paper
    Authors: Alwassel, Humam; Caba Heilbron, Fabian; Ghanem, Bernard
    KAUST Department: Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division; Computer Science Program; Electrical Engineering Program; Visual Computing Center (VCC)
    KAUST Grant Number: OSR-CRG2017-3405
    Date: 2018-10-05
    Online Publication Date: 2018-10-05
    Print Publication Date: 2018
    Permanent link to this record: http://hdl.handle.net/10754/630233
    
    Abstract
    State-of-the-art temporal action detectors inefficiently search the entire video for specific actions. Despite the encouraging progress these methods achieve, it is crucial to design automated approaches that only explore parts of the video which are the most relevant to the actions being searched for. To address this need, we propose the new problem of action spotting in video, which we define as finding a specific action in a video while observing a small portion of that video. Inspired by the observation that humans are extremely efficient and accurate in spotting and finding action instances in video, we propose Action Search, a novel Recurrent Neural Network approach that mimics the way humans spot actions. Moreover, to address the absence of data recording the behavior of human annotators, we put forward the Human Searches dataset, which compiles the search sequences employed by human annotators spotting actions in the AVA and THUMOS14 datasets. We consider temporal action localization as an application of the action spotting problem. Experiments on the THUMOS14 dataset reveal that our model is not only able to explore the video efficiently (observing on average 17.3% of the video) but it also accurately finds human activities with 30.8% mAP.
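    The record does not include code, but the search procedure summarized in the abstract lends itself to a brief illustration: a recurrent model observes one frame at a time and predicts where in the video to look next, so only a small fraction of the frames is ever examined. The sketch below is a hypothetical PyTorch rendering of that loop, not the authors' released implementation; the feature extractor, dimensions, starting position, and fixed step budget are all assumptions.

    import torch
    import torch.nn as nn

    class ActionSearchSketch(nn.Module):
        """Hypothetical search-by-observation loop; an illustration, not the paper's code."""
        def __init__(self, feat_dim=512, hidden_dim=1024):
            super().__init__()
            # One recurrent step per observed frame: input is the frame feature
            # concatenated with the normalized temporal position just observed.
            self.cell = nn.LSTMCell(feat_dim + 1, hidden_dim)
            self.next_pos = nn.Linear(hidden_dim, 1)  # where to look next, in [0, 1]

        def forward(self, get_frame_feature, max_steps=10):
            # get_frame_feature(pos) -> (1, feat_dim) feature of the frame at relative time pos.
            h = torch.zeros(1, self.cell.hidden_size)
            c = torch.zeros(1, self.cell.hidden_size)
            pos = torch.full((1, 1), 0.5)      # begin the search in the middle of the video
            search_sequence = [pos]
            for _ in range(max_steps):         # only max_steps frames are ever observed
                feat = get_frame_feature(pos)
                h, c = self.cell(torch.cat([feat, pos], dim=1), (h, c))
                pos = torch.sigmoid(self.next_pos(h))
                search_sequence.append(pos)
            return search_sequence             # the last position is the spotting guess

    Training such a loop needs supervision on where to look next; the Human Searches dataset described in the abstract records exactly this kind of human search sequence, so one plausible objective is to regress each predicted position onto the annotator's next observed position.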
    Citation
    Alwassel H, Caba Heilbron F, Ghanem B (2018) Action Search: Spotting Actions in Videos and Its Application to Temporal Action Localization. Lecture Notes in Computer Science: 253–269. Available: http://dx.doi.org/10.1007/978-3-030-01240-3_16.
    Sponsors
    This publication is based upon work supported by the King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research (OSR) under Award No. OSR-CRG2017-3405.
    Publisher: Springer Nature
    Journal: Lecture Notes in Computer Science
    Conference/Event name: 15th European Conference on Computer Vision, ECCV 2018
    DOI: 10.1007/978-3-030-01240-3_16
    Additional Links: https://link.springer.com/chapter/10.1007%2F978-3-030-01240-3_16
    Collections
    Conference Papers; Computer Science Program; Electrical Engineering Program; Visual Computing Center (VCC); Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division

    Related items

    Showing items related by title, author, creator and subject.

    • DAPs: Deep Action Proposals for Action Understanding

      Escorcia, Victor; Caba Heilbron, Fabian; Niebles, Juan Carlos; Ghanem, Bernard (Lecture Notes in Computer Science, Springer Nature, 2016-09-17) [Conference Paper]
      Object proposals have contributed significantly to recent advances in object understanding in images. Inspired by the success of this approach, we introduce Deep Action Proposals (DAPs), an effective and efficient algorithm for generating temporal action proposals from long videos. We show how to take advantage of the vast capacity of deep learning models and memory cells to retrieve from untrimmed videos temporal segments, which are likely to contain actions. A comprehensive evaluation indicates that our approach outperforms previous work on a large scale action benchmark, runs at 134 FPS making it practical for large-scale scenarios, and exhibits an appealing ability to generalize, i.e. to retrieve good quality temporal proposals of actions unseen in training.
    • Trajectory-based Fisher kernel representation for action recognition in videos

      Atmosukarto, Indriyati; Ghanem, Bernard; Ahuja, Narendra (Institute of Electrical and Electronics Engineers (IEEE), 2012) [Conference Paper]
      Action recognition is an important computer vision problem that has many applications including video indexing and retrieval, event detection, and video summarization. In this paper, we propose to apply the Fisher kernel paradigm to action recognition. The Fisher kernel framework combines the strengths of generative and discriminative models. In this approach, given the trajectories extracted from a video and a generative Gaussian Mixture Model (GMM), we use the Fisher Kernel method to describe how much the GMM parameters are modified to best fit the video trajectories (a generic sketch of this encoding appears after this list). We experiment with using the Fisher Kernel vector to create the video representation and to train an SVM classifier. We further extend our framework to select the most discriminative trajectories using a novel MIL-KNN framework. We compare the performance of our approach to the current state-of-the-art bag-of-features (BOF) approach on two benchmark datasets. Experimental results show that our proposed approach outperforms the state-of-the-art method [8] and that the selected discriminative trajectories are descriptive of the action class.
    • Learning a strong detector for action localization in videos

      Zhang, Yongqiang; Ding, Mingli; Bai, Yancheng; Liu, Dandan; Ghanem, Bernard (Pattern Recognition Letters, Elsevier BV, 2019-10-09) [Article]
      We address the problem of spatio-temporal action localization in videos in this paper. Current state-of-the-art methods for this challenging task first rely on an object detector to localize actors at the frame level, and then link or track the detections across time. Most of these methods pay more attention to leveraging the temporal context of videos for action detection while ignoring the importance of the object detector itself. In this paper, we demonstrate the importance of the object detector in the action localization pipeline, and propose a strong object detector for better action localization in videos, based on the single shot multibox detector (SSD) framework. Different from SSD, we introduce an anchor refinement branch at the end of the backbone network to refine the input anchors, and add a batch normalization layer before concatenating the intermediate feature maps at the frame level and after stacking feature maps at the clip level. The proposed strong detector has two contributions: (1) it reduces missed target objects at the frame level; (2) it generates deformable anchor cuboids for modeling temporally dynamic actions. Extensive experiments on UCF-Sports, J-HMDB and UCF-101 validate our claims, and we outperform the previous state-of-the-art methods by a large margin in terms of frame-mAP and video-mAP, especially at higher overlap thresholds.
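    As a generic illustration of the Fisher-vector encoding mentioned in the trajectory-based item above, the sketch below encodes a set of trajectory descriptors against a diagonal-covariance GMM. It follows the standard Fisher-kernel formulation rather than that paper's implementation; the descriptor dimensionality and component count are placeholders.

    import numpy as np
    from sklearn.mixture import GaussianMixture

    def fisher_vector(descriptors, gmm):
        """Standard Fisher-vector encoding of (N, D) descriptors under a fitted diagonal GMM."""
        n, _ = descriptors.shape
        gamma = gmm.predict_proba(descriptors)                     # (N, K) soft assignments
        w, mu, var = gmm.weights_, gmm.means_, gmm.covariances_    # diagonal covariances: (K, D)
        diff = (descriptors[:, None, :] - mu[None]) / np.sqrt(var)[None]   # normalized residuals (N, K, D)
        g_mu = (gamma[..., None] * diff).sum(axis=0) / (n * np.sqrt(w)[:, None])
        g_sigma = (gamma[..., None] * (diff ** 2 - 1)).sum(axis=0) / (n * np.sqrt(2 * w)[:, None])
        return np.concatenate([g_mu.ravel(), g_sigma.ravel()])     # 2*K*D-dimensional video representation

    # Hypothetical usage: fit the GMM on training trajectories, encode each video, train an SVM.
    # gmm = GaussianMixture(n_components=64, covariance_type="diag").fit(train_descriptors)
    # fv = fisher_vector(video_descriptors, gmm)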
