Recent Submissions

  • Design and Performance Assessment of an Experimental Rig to Conduct Material Nitridation Studies at Extreme Ammonia Conditions

    Laws, Nicole; Campuzano, Felipe; Dawood, Ziad; Roberts, William L. (American Society of Mechanical Engineers, 2023-09-28) [Conference Paper]
    This work presents the design and assessment of an experimental rig intended to carry out material sustainability studies in ammonia (NH3) environments. The designed facility aims at furthering the understanding of the nitridation potential of NH3 under extreme conditions, particularly at both high pressure and high temperature, when interacting with varying materials of interest. Several key factors were considered during the design process, including a system capable of operating at high pressures (up to 30 bar) and temperatures (up to 800 °C), the possibility of maintaining a continuous NH3 flow to increase the nitridation potential of the atmosphere, the flexibility of accommodating between 60 and 70 metal samples, and the design of a suitable system to operate continuously for hundreds of hours (up to 1000 hours in total). Achieving those features involved the development and optimization of three main subsystems: a liquefied NH3 feeding system, a high-pressure/temperature rig, and an abatement system. Altogether these guarantee safe and continuous operation for long periods of time at the desired experimental conditions. Finally, several tests were conducted to assess the reproducibility and stability of the designed facility, including temperature and pressure profiles, and NH3 concentration gradients in the rig.
  • An Experimental Study of the Stability and Nearfield Structure of Oxyfuel Jet Flames at Elevated Pressures

    Bukar, Muhammad; Basnet, Suman; Wu, Bin; Magnotti, Gaetano (American Society of Mechanical Engineers, 2023-09-28) [Conference Paper]
    This study presents experimental results of the stability limit and the nearfield structure of oxy-methane jet flames at pressures ranging from 2–5 bar. The oxidizer used in this campaign consists of 40% O2 and 60% CO2. Two sets of cases were studied: one where pressure increase was achieved by keeping the fuel Reynolds number constant and the other where the velocity was kept constant while increasing pressure. Stability limits (lift-off velocity) are reported for various coflow velocities at different operating pressures. Natural flame luminosity imaging with a DSLR camera and combined CH* chemiluminescence using an ICCD (PIMAX 4) are used to characterize the nearfield structure of the flame. The CH* results were then processed to extract the attachment location, defined in terms of attachment height and radius. The study mainly investigates the effect of pressure on the flame attachment height and radius. The DSLR images complement the study with qualitative information on the flame appearance and sooting propensity. Results show that at constant Reynolds number, the attachment height decreases with pressure for all cases considered while the attachment radius increased with pressure increase. At constant velocity, however, both the attachment height and radius were observed to decrease with increased pressure.
  • Mobile Sub-diffusion Molecular Communication Channel

    Briantceva, Nadezhda; Chouhan, Lokendra; Parsani, Matteo; Alouini, Mohamed-Slim (ACM, 2023-09-12) [Conference Paper]
    This work considers the sub-diffusion motion of information-carrying molecules (IM) inside a molecular communication (MC) channel. We also explore the impacts of transmitter (Tx) and receiver (Rx) mobility. To do so, we obtain the closed-form expressions of the absorption probability (AP) and the first-passage-time density (FPTD).
  • Scaling the “Memory Wall” for Multi-Dimensional Seismic Processing with Algebraic Compression on Cerebras CS-2 Systems

    Ltaief, Hatem; Hong, Yuxi; Wilson, Leighton; Jacquelin, Mathias; Ravasi, Matteo; Keyes, David E. (ACM/IEEE, 2023-09-11) [Conference Paper]
    We exploit the high memory bandwidth of AI-customized Cerebras CS-2 systems for seismic processing. By leveraging low-rank matrix approximation, we fit memory-hungry seismic applications onto memory-austere SRAM wafer-scale hardware, thus addressing a challenge arising in many wave-equation-based algorithms that rely on Multi-Dimensional Convolution (MDC) operators. Exploiting sparsity inherent in seismic data in the frequency domain, we implement embarrassingly parallel tile low-rank matrix-vector multiplications (TLR-MVM), which account for most of the elapsed time in MDC operations, to successfully solve the Multi-Dimensional Deconvolution (MDD) inverse problem. By reducing memory footprint along with arithmetic complexity, we fit a standard seismic benchmark dataset into the small local memories of Cerebras processing elements. Deploying TLR-MVM execution onto 48 CS-2 systems in support of MDD gives a sustained memory bandwidth of 92.58 PB/s on 35,784,000 processing elements, a significant milestone that highlights the capabilities of AI-customized architectures to enable a new generation of seismic algorithms that will empower multiple technologies of our low-carbon future.
  • UBIC: A Blockchain-Less Cryptocurrency

    Caprolu, Maurantonio; Bentafat, Elmahdi; Bakiras, Spiridon; Di Pietro, Roberto (IEEE, 2023-09-06) [Conference Paper]
    In this paper we propose UBIC, a novel blockchain-less architecture that preserves the main advantages of classic cryptocurrencies while avoiding their pitfalls. The proposed construction is general (that is, UBIC can be adopted on par with any other cryptocurrency), though UBIC also satisfies the requirements to support state-sponsored financial services, like Universal Basic Income and Central Bank Digital Currency. Indeed, UBIC stands for Universal Basic Income Coin, to highlight one of its most straightforward use-cases. One of the key features of UBIC is that every user participating in the protocol gets fair and equal access to the rewards, regardless of the available resources, e.g., computational power or financial stake. Moreover, by leveraging standard cryptographic techniques, such as homomorphic encryption and verifiable random functions, UBIC ensures full user privacy and trust in the network, while enjoying a highly scalable architecture. Our experimental results confirm the feasibility of the proposed architecture and demonstrate that UBIC is very efficient in terms of transaction verification time. To the best of our knowledge, this is the first blockchain-less cryptocurrency proposal. Other than being interesting on its own, and being particularly fit to support UBI and Central Bank Digital Currency, the architectural solutions and the technical choices discussed in this contribution have the potential to generate high impact and further research in the field.
  • Observer-based Power System Stabilizer for Multi-terminal DC Networks

    Bertozzi, Otavio; Chamorro, Harold R.; Rouzbehi, Kumars; Guel-Cortez, Adrian-Josue; Ahmed, Shehab (IEEE, 2023-09-04) [Conference Paper]
    Multi-terminal DC (MTDC) networks interconnect isolated systems, asynchronous areas, and renewable energy resources. However, ensuring stability and proper dynamic behavior of MTDC networks can be challenging. Power electronic converters offer increased control capabilities to achieve predictable and stable dynamics in the face of disturbances, faults, and sustained oscillations. The DC-side power system stabilizer (DC-PSS) is a supplementary controller that relies on optimal placement for maximized effectiveness. Observers can estimate the state of relevant system variables for the feedback controller without system-wide communication. This paper proposes an optimally-placed observer-based augmented DC-PSS for enhanced voltage stability in MTDC networks. The design procedure assumes no knowledge of the structure and parameters of the inner controllers of the terminal converters. We present a mathematical model and discuss the observer and control design procedures, and simulation results show the benefits and potential drawbacks of the proposed approach. Our observer-based DC-PSS relies on system-wide DC-bus voltage estimates from a single measurement, making it less vulnerable to communication delays and cyberattacks. Our work contributes to the field of supplementary controllers in MTDC networks and provides insights into future research directions toward stable and reliable DC power systems.
  • TENSOR: Lightweight BGP Non-Stop Routing

    Miao, Congcong; Xiao, Yunming; Canini, Marco; Dai, Ruiqiang; Zheng, Shengli; Wang, Jilong; Bu, Jiwu; Kuzmanovic, Aleksandar; Wang, Yachen (ACM, 2023-09) [Conference Paper]
    As the solitary inter-domain protocol, BGP plays an important role in today's Internet. Its failures threaten network stability and will usually result in large-scale packet losses. Thus, the non-stop routing (NSR) capability that protects inter-domain connectivity from being disrupted by various failures, is critical to any Autonomous System (AS) operator. Replicating the BGP and underlying TCP connection status is key to realizing NSR. But existing NSR solutions, which heavily rely on OS kernel modifications, have become impractical due to providers' adoption of virtualized network gateways for better scalability and manageability. In this paper, we tackle this problem by proposing TENSOR, which incorporates a novel kernel-modification-free replication design and lightweight architecture. More concretely, the kernel-modification-free replication design mitigates the reliance on OS kernel modification and hence allows the virtualization of the network gateway. Meanwhile, lightweight virtualization provides strong performance guarantees and improves system reliability. Moreover, TENSOR provides a solution to the split-brain problem that affects NSR solutions. Through extensive experiments, we show that TENSOR realizes NSR while bearing little overhead compared to open-source BGP implementations. Further, our two-year operational experience on a fleet of 400 servers controlling over 31,000 BGP peering connections demonstrates that TENSOR reduces the development, deployment, and maintenance costs significantly - at least by factors of 20, 5, and 10, respectively, while retaining the same SLA with the NSR-enabled routers.
  • Effects of Engine Speed on Prechamber-Assisted Combustion

    Palombi, Lucia; Sharma, Priybrat; Cenker, Emre; Magnotti, Gaetano (SAE International, 2023-08-28) [Conference Paper]
    Lean combustion technologies show promise for improving engine efficiency and reducing emissions. Among these technologies, prechamber-assisted combustion (PCC) is established as a reliable option for achieving lean or ultra-lean combustion. In this study, the effect of engine speed on PCC was investigated in a naturally aspirated heavy-duty optical engine: a comparison has been made between analytical performances and optical flame behavior. Bottom view natural flame luminosity (NFL) imaging was used to observe the combustion process. The prechamber was fueled with methane, while the main chamber was fueled with methanol. The engine speed was varied at 1000, 1100, and 1200 revolutions per minute (rpm). The combustion in the prechamber is not affected by changes in engine speed. However, the heat release rate (HRR) in the main chamber changed from two distinct stages with a faster first stage to more gradual and merged stages as the engine speed increased. NFL imaging revealed that lower mean piston speed allowed for longer free jet propagation inside the combustion chamber resulting in faster and stronger HRR stages. At higher speeds, the jet-piston interaction started earlier and was dispersed in radial directions, resulting in a relatively prolonged HRR. Finally, the study emphasizes the importance of prechamber jet and piston interaction in shaping HRR.
  • Smart Diaper Embedded with Fully Printed Sensors for Wireless Healthcare and Monitoring

    Ali, Shawkat; Khan, Arshad; Bermak, Amine (IEEE, 2023-08-28) [Conference Paper]
    In this research, a real-time wireless healthcare and monitoring system is developed to monitor a wearer's vital signs (temperature, respiration rate, and wetting volume) on a smartphone. The smart diaper consists of two main units: a detachable electronic module and a disposable printed biosensor patch deployed on a commercial diaper. A custom sensor array was designed and fabricated in a single patch through printed electronics technology by using biocompatible materials on a thermal transfer substrate. The sensor patch was designed to accurately collect the vital signs data by placing each sensor at an appropriate site. Individual sensors were characterized for their electrical and mechanical properties. The overall system was tested, and data was wirelessly transmitted to a smartphone via Bluetooth. The main purpose of the proposed smart diaper is to minimize caregiver effort and avoid infection, rashes, and discomfort. We believe this work will open a new window for printed electronics systems for special needs, elderly, and infant healthcare.
  • Electromagnetic Exposure Aware RSMA for Multiuser Massive MIMO Uplink

    Jiang, Hanyu; You, Li; Elzanaty, Ahmed; Wang, Jue; Wang, Wenjin; Gao, Xiqi; Alouini, Mohamed-Slim (IEEE, 2023-08-25) [Conference Paper]
    This paper investigates the rate-splitting multiple access (RSMA) transmission design for multiuser multiple-input multiple-output (MIMO) uplink with electromagnetic (EM) exposure constraints. Specifically, the transmit covariance matrices and decoding order are optimized at the users and the base station (BS), respectively, by utilizing statistical channel state information (CSI) to maximize the energy efficiency (EE). The problem is formulated as a non-convex mixed-integer program, which is divided into two equivalent subproblems. We first handle the inner problem by adopting the minorization-maximization (MM) and Dinkelbach's methods. Then, a modified water-filling scheme is proposed to obtain the transmit covariance matrices with a fixed decoding permutation. For the outer problem, a greedy approach is proposed to obtain the decoding permutation. Numerical results verify the effectiveness of the proposed EM exposure aware EE maximization scheme for uplink RSMA.
  • Unsupervised Volumetric Animation

    Siarohin, Aliaksandr; Menapace, Willi; Skorokhodov, Ivan; Olszewski, Kyle; Ren, Jian; Lee, Hsin-Ying; Chai, Menglei; Tulyakov, Sergey (IEEE, 2023-08-22) [Conference Paper]
    We propose a novel approach for unsupervised 3D animation of non-rigid deformable objects. Our method learns the 3D structure and dynamics of objects solely from single-view RGB videos, and can decompose them into semantically meaningful parts that can be tracked and animated. Using a 3D autodecoder framework, paired with a keypoint estimator via a differentiable PnP algorithm, our model learns the underlying object geometry and parts decomposition in an entirely unsupervised manner. This allows it to perform 3D segmentation, 3D keypoint estimation, novel view synthesis, and animation. We primarily evaluate the framework on two video datasets: VoxCeleb 256² and TEDXPeople 256². In addition, on the Cats 256² image dataset, we show it even learns compelling 3D geometry from still images. Finally, we show our model can obtain animatable 3D objects from a single or few images.
  • Adaptive Capacitive Power Transfer System utilizing Switch-Controlled Capacitor and DC-DC Converter

    Mostafa, Tarek Mahmoud Atia; Khater, Moutazbellah Abdelkhaleq; Ahmed, Shehab (IEEE, 2023-08-22) [Conference Paper]
    Capacitive power transfer has received a lot of attention as one of the new wireless power transfer technologies due to the flexibility of the coupling structure, low standing losses, and low electromagnetic interference. However, the ability to deliver power degrades significantly with load variations and coupling misalignment. This degradation represents a serious issue for the emerging technology. In this paper, a switch-controlled capacitor (SCC) and a buck-boost DC-DC converter have been added on the transmitter side and the receiving side of the conventional system, respectively. The SCC provides a means of compensation for the coupling variations while a variable duty cycle DC-DC converter is employed to mitigate the load variation effect. A 26 W system is modeled and simulated, utilizing actual component models for verification. Compared to a typical system, the proposed system could achieve constant output power against variations of ±75% in the coupling interface value and a wide range of load variations from 5 Ω to 1 kΩ.
  • AdaptiveMix: Improving GAN Training via Feature Space Shrinkage

    Liu, Haozhe; Zhang, Wentian; Li, Bing; Wu, Haoqian; He, Nanjun; Huang, Yawen; Li, Yuexiang; Ghanem, Bernard; Zheng, Yefeng (IEEE, 2023-08-22) [Conference Paper]
    Due to the outstanding capability for data generation, Generative Adversarial Networks (GANs) have attracted considerable attention in unsupervised learning. However, training GANs is difficult, since the training distribution is dynamic for the discriminator, leading to unstable image representation. In this paper, we address the problem of training GANs from a novel perspective, i.e., robust image classification. Motivated by studies on robust image representation, we propose a simple yet effective module, namely AdaptiveMix, for GANs, which shrinks the regions of training data in the image representation space of the discriminator. Considering it is intractable to directly bound feature space, we propose to construct hard samples and narrow down the feature distance between hard and easy samples. The hard samples are constructed by mixing a pair of training images. We evaluate the effectiveness of our AdaptiveMix with widely-used and state-of-the-art GAN architectures. The evaluation results demonstrate that our AdaptiveMix can facilitate the training of GANs and effectively improve the image quality of generated samples. We also show that our AdaptiveMix can be further applied to image classification and Out-Of-Distribution (OOD) detection tasks, by equipping it with state-of-the-art methods. Extensive experiments on seven publicly available datasets show that our method effectively boosts the performance of baselines.
  • PIVOT: Prompting for Video Continual Learning

    Villa, Andrés; Alcázar, Juan León; Alfarra, Motasem; Alhamoud, Kumail; Hurtado, Julio; Heilbron, Fabian Caba; Soto, Alvaro; Ghanem, Bernard (IEEE, 2023-08-22) [Conference Paper]
    Modern machine learning pipelines are limited due to data availability, storage quotas, privacy regulations, and expensive annotation processes. These constraints make it difficult or impossible to train and update large-scale models on such dynamic annotated sets. Continual learning directly approaches this problem, with the ultimate goal of devising methods where a deep neural network effectively learns relevant patterns for new (unseen) classes, without significantly altering its performance on previously learned ones. In this paper, we address the problem of continual learning for video data. We introduce PIVOT, a novel method that leverages extensive knowledge in pre-trained models from the image domain, thereby reducing the number of trainable parameters and the associated forgetting. Unlike previous methods, ours is the first approach that effectively uses prompting mechanisms for continual learning without any in-domain pre-training. Our experiments show that PIVOT improves state-of-the-art methods by a significant 27% on the 20-task ActivityNet setup.
  • NewsNet: A Novel Dataset for Hierarchical Temporal Segmentation

    Wu, Haoqian; Chen, Keyu; Liu, Haozhe; Zhuge, Mingchen; Li, Bing; Qiao, Ruizhi; Shu, Xiujun; Gan, Bei; Xu, Liangsheng; Ren, Bo; Xu, Mengmeng; Zhang, Wentian; Ramachandra, Raghavendra; Lin, Chia-Wen; Ghanem, Bernard (IEEE, 2023-08-22) [Conference Paper]
    Temporal video segmentation is the get-to-go automatic video analysis, which decomposes a long-form video into smaller components for follow-up understanding tasks. Recent works have studied several levels of granularity to segment a video, such as shot, event, and scene. Those segmentations can help compare the semantics in the corresponding scales, but lack a wider view of larger temporal spans, especially when the video is complex and structured. Therefore, we present two abstractive levels of temporal segmentations and study their hierarchy to the existing fine-grained levels. Accordingly, we collect NewsNet, the largest news video dataset consisting of 1,000 videos in over 900 hours, associated with several tasks for hierarchical temporal video segmentation. Each news video is a collection of stories on different topics, represented as aligned audio, visual, and textual data, along with extensive frame-wise annotations in four granularities. We assert that the study on NewsNet can advance the understanding of complex structured video and benefit more areas such as short-video creation, personalized advertisement, digital instruction, and education.
  • Re2TAL: Rewiring Pretrained Video Backbones for Reversible Temporal Action Localization

    Zhao, Chen; Liu, Shuming; Mangalam, Karttikeya; Ghanem, Bernard (IEEE, 2023-08-22) [Conference Paper]
    Temporal action localization (TAL) requires long-form reasoning to predict actions of various durations and complex content. Given limited GPU memory, training TAL end to end (i.e., from videos to predictions) on long videos is a significant challenge. Most methods can only train on pre-extracted features without optimizing them for the localization problem, consequently limiting localization performance. In this work, to extend the potential in TAL networks, we propose a novel end-to-end method Re2TAL, which rewires pretrained video backbones for reversible TAL. Re2TAL builds a backbone with reversible modules, where the input can be recovered from the output such that the bulky intermediate activations can be cleared from memory during training. Instead of designing one single type of reversible module, we propose a network rewiring mechanism, to transform any module with a residual connection to a reversible module without changing any parameters. This provides two benefits: (1) a large variety of reversible networks are easily obtained from existing and even future model designs, and (2) the reversible models require much less training effort as they reuse the pre-trained parameters of their original non-reversible versions. Re2TAL, only using the RGB modality, reaches 37.01% average mAP on ActivityNet-v1.3, a new state-of-the-art record, and mAP 64.9% at tIoU=0.5 on THUMOS-14, outperforming all other RGB-only methods.
  • PMR: Prototypical Modal Rebalance for Multimodal Learning

    Fan, Yunfeng; Xu, Wenchao; Wang, Haozhao; Wang, Junxiao; Guo, Song (IEEE, 2023-08-22) [Conference Paper]
    Multimodal learning (MML) aims to jointly exploit the common priors of different modalities to compensate for their inherent limitations. However, existing MML methods often optimize a uniform objective for different modalities, leading to the notorious “modality imbalance” problem and counterproductive MML performance. To address the problem, some existing methods modulate the learning pace based on the fused modality, which is dominated by the better modality and eventually results in a limited improvement on the worse modality. To better exploit multimodal features, we propose Prototypical Modality Rebalance (PMR) to perform stimulation on the particular slow-learning modality without interference from other modalities. Specifically, we introduce prototypes that represent general features for each class, to build non-parametric classifiers for uni-modal performance evaluation. Then, we try to accelerate the slow-learning modality by enhancing its clustering toward prototypes. Furthermore, to alleviate the suppression from the dominant modality, we introduce a prototype-based entropy regularization term during the early training stage to prevent premature convergence. Moreover, our method relies only on the representations of each modality, without restrictions from model structures or fusion methods, giving it great application potential in various scenarios.
  • DisCoScene: Spatially Disentangled Generative Radiance Fields for Controllable 3D-aware Scene Synthesis

    Xu, Yinghao; Chai, Menglei; Shi, Zifan; Peng, Sida; Skorokhodov, Ivan; Siarohin, Aliaksandr; Yang, Ceyuan; Shen, Yujun; Lee, Hsin-Ying; Zhou, Bolei; Tulyakov, Sergey (IEEE, 2023-08-22) [Conference Paper]
    Existing 3D-aware image synthesis approaches mainly focus on generating a single canonical object and show limited capacity in composing a complex scene containing a variety of objects. This work presents DisCoScene: a 3D-aware generative model for high-quality and controllable scene synthesis. The key ingredient of our method is a very abstract object-level representation (i.e., 3D bounding boxes without semantic annotation) as the scene layout prior, which is simple to obtain, general to describe various scene contents, and yet informative to disentangle objects and background. Moreover, it serves as an intuitive user control for scene editing. Based on such a prior, the proposed model spatially disentangles the whole scene into object-centric generative radiance fields by learning on only 2D images with the global-local discrimination. Our model obtains the generation fidelity and editing flexibility of individual objects while being able to efficiently compose objects and the background into a complete scene. We demonstrate state-of-the-art performance on many scene datasets, including the challenging Waymo outdoor dataset.
  • Numerical assessment of the performance and emissions of a compact Wankel rotary engine applied as a range extender on the BMW i3 model

    Vorraro, Giovanni; Turner, James W. G. (JSAE/SAE, 2023-08-22) [Conference Paper]
    Amongst all the hybrid-vehicle technologies and layouts, range-extended electric vehicles are the ones with the real prospect to reduce the emissions produced by the thermal machine when in driving conditions, while assuring an adequate range for the common user. The BMW i3 represents one of the most successful series hybrid electric vehicles, having been on the market since 2013. Given the complexities of a hybrid layout employing both thermal and electrical machines, the range extender must have compactness and lightweight characteristics in addition to a suitable power output for the vehicle. Usually, standard 4-stroke small-displacement engines are employed for this application, with the BMW i3 employing a 2-cylinder range extender. More interestingly, a Wankel rotary engine can provide the same amount of mechanical power by reducing the weight and the volume of nearly a third to the equivalent 4-stroke engine. In this study, a numerical assessment of the Advanced Innovative Engineering UK (AIE UK) 225CS Wankel rotary engine as a range extender for the BMW i3 was carried out. A full vehicle model of the BMW i3 was built in Siemens Simcenter Amesim 2021.2 to evaluate the behaviour of the aforementioned engine as a range extender. The engine sub-model used was a Mean Value Engine Model (MVEM) set up by implementing the experimental data collected during a previous experimental campaign, while the BMW i3 chassis sub-model was characterised by using the publicly available data from an Argonne National Laboratory benchmarking project (vehicle weight, front surface, drag coefficient, tire dimensions, etc.). Finally, the model was tested over the standard Worldwide harmonized Light vehicles Test Procedure (WLTP) driving cycle in both Charge Depleting and Charge Sustaining modes.
  • Where is my Wallet? Modeling Object Proposal Sets for Egocentric Visual Query Localization

    Xu, Mengmeng; Li, Yanghao; Fu, Cheng-Yang; Ghanem, Bernard; Xiang, Tao; Pérez-Rúa, Juan-Manuel (IEEE, 2023-08-22) [Conference Paper]
    This paper deals with the problem of localizing objects in image and video datasets from visual exemplars. In particular, we focus on the challenging problem of egocentric visual query localization. We first identify grave implicit biases in current query-conditioned model design and visual query datasets. Then, we directly tackle such biases at both frame and object set levels. Concretely, our method solves these issues by expanding limited annotations and dynamically dropping object proposals during training. Additionally, we propose a novel transformer-based module that allows for object-proposal set context to be considered while incorporating query information. We name our module Conditioned Contextual Transformer or CocoFormer. Our experiments show the proposed adaptations improve egocentric query detection, leading to a better visual query localization system in both 2D and 3D configurations. Thus, we can improve frame-level detection performance from 26.28% to 31.26% in AP, which correspondingly improves the VQ2D and VQ3D localization scores by significant margins. Our improved context-aware query object detector ranked first and second respectively in the VQ2D and VQ3D tasks in the 2nd Ego4D challenge. In addition to this, we showcase the relevance of our proposed model in the Few-Shot Detection (FSD) task, where we also achieve SOTA results.
