### Recent Submissions

• #### Camera Calibration and Player Localization in SoccerNet-v2 and Investigation of their Representations for Action Spotting

(arXiv, 2021-04-19) [Preprint]
Soccer broadcast video understanding has been drawing a lot of attention in recent years among data scientists and industrial companies. This is mainly due to the lucrative potential unlocked by effective deep learning techniques developed in the field of computer vision. In this work, we focus on the topic of camera calibration and on its current limitations for the scientific community. More precisely, we tackle the absence of a large-scale calibration dataset and of a public calibration network trained on such a dataset. Specifically, we distill a powerful commercial calibration tool in a recent neural network architecture on the large-scale SoccerNet dataset, composed of untrimmed broadcast videos of 500 soccer games. We further release our distilled network, and leverage it to provide 3 ways of representing the calibration results along with player localization. Finally, we exploit those representations within the current best architecture for the action spotting task of SoccerNet-v2, and achieve new state-of-the-art performance.
• #### Random Reshuffling with Variance Reduction: New Analysis and Better Rates

(arXiv, 2021-04-19) [Preprint]
Virtually all state-of-the-art methods for training supervised machine learning models are variants of SGD enhanced with a number of additional tricks, such as minibatching, momentum, and adaptive stepsizes. One of the tricks that works so well in practice that it is used as default in virtually all widely used machine learning software is random reshuffling (RR). However, the practical benefits of RR have until very recently been eluding attempts at being satisfactorily explained using theory. Motivated by recent development due to Mishchenko, Khaled and Richtárik (2020), in this work we provide the first analysis of SVRG under Random Reshuffling (RR-SVRG) for general finite-sum problems. First, we show that RR-SVRG converges linearly with the rate $\mathcal{O}(\kappa^{3/2})$ in the strongly-convex case, and can be improved further to $\mathcal{O}(\kappa)$ in the big data regime (when $n > \mathcal{O}(\kappa)$), where $\kappa$ is the condition number. This improves upon the previous best rate $\mathcal{O}(\kappa^2)$ known for a variance reduced RR method in the strongly-convex case due to Ying, Yuan and Sayed (2020). Second, we obtain the first sublinear rate for general convex problems. Third, we establish similar fast rates for Cyclic-SVRG and Shuffle-Once-SVRG. Finally, we develop and analyze a more general variance reduction scheme for RR, which allows for less frequent updates of the control variate. We corroborate our theoretical results with suitably chosen experiments on synthetic and real datasets.
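To make the idea in the abstract above concrete, here is a minimal sketch of an SVRG-style update driven by random reshuffling on a small least-squares problem. It is an illustration under stated assumptions, not the authors' algorithm: the function names (`grad_i`, `full_grad`, `rr_svrg`), the choice to refresh the reference point once per epoch, the step size, and the toy data are all hypothetical.

```python
import random

def grad_i(x, a, b):
    # Gradient of the single-sample loss 0.5 * (a.x - b)^2 with respect to x.
    r = sum(ai * xi for ai, xi in zip(a, x)) - b
    return [r * ai for ai in a]

def full_grad(x, A, B):
    # Average of the per-sample gradients over the whole dataset.
    n = len(A)
    g = [0.0] * len(x)
    for a, b in zip(A, B):
        gi = grad_i(x, a, b)
        g = [gj + gij / n for gj, gij in zip(g, gi)]
    return g

def rr_svrg(A, B, x0, step, epochs, seed=0):
    # Assumption: the reference point w is refreshed at the start of every epoch.
    rng = random.Random(seed)
    x = list(x0)
    n = len(A)
    for _ in range(epochs):
        w = list(x)                # reference point for the control variate
        mu = full_grad(w, A, B)    # full gradient at the reference point
        perm = list(range(n))
        rng.shuffle(perm)          # random reshuffling: one pass without replacement
        for i in perm:
            gx = grad_i(x, A[i], B[i])
            gw = grad_i(w, A[i], B[i])
            # Variance-reduced direction: g_i(x) - g_i(w) + full gradient at w.
            x = [xj - step * (gxj - gwj + mj)
                 for xj, gxj, gwj, mj in zip(x, gx, gw, mu)]
    return x

# Toy consistent system whose exact solution is [2, -1] (hypothetical data).
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
B = [2.0, -1.0, 1.0, 3.0]
x_hat = rr_svrg(A, B, x0=[0.0, 0.0], step=0.1, epochs=200)
```

The only difference from with-replacement sampling is the shuffled permutation: every sample is visited exactly once per epoch. Replacing `rng.shuffle(perm)` with a fixed order, or shuffling once before the epoch loop, gives cyclic and shuffle-once variants of the same sketch.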
• #### Transverse Electric Lasing at a Record Short Wavelength 244.63 nm from GaN Quantum Wells with Weak Exciton Localization

(ACS Photonics, American Chemical Society (ACS), 2021-04-16) [Article]
We have demonstrated record short-wavelength lasing at 244.63 nm with TE-dominant polarization from GaN quantum wells (QWs) at room temperature (RT). The optical threshold of 310 kW/cm² is comparable to state-of-the-art AlGaN QW lasers at similar wavelengths. The sample was grown pseudomorphically on the AlN/sapphire template. X-ray diffraction (XRD) shows unambiguous higher-order satellite peaks indicating a sharp interface amid the active region. The excitonic localization was revealed and studied by photoluminescence (PL) and time-resolved PL (TRPL) spectroscopy at temperatures ranging from 15 K to RT. At 15 K, the multiple-component PL decay curves with the decay time varying from 62.6 to 2.77 ns at different energies confirmed the localized excitons. The peak energy of the temperature-dependent PL spectra exhibited the “S-shape” behavior, and weak exciton localization with a small localization energy of 14.3 meV was observed. Therefore, even in the low temperature region, the escape probability of excitons increased as the temperature rose. As a result, the FWHM of the emission spectra changed significantly when the temperature was below 150 K. Above 150 K, the PL decay shape changed from the two-component exponential decay to the single exponential decay, indicating complete delocalization of excitons. The work demonstrates the weak localization and thus smooth interface in the GaN/AlN active region, which are desirable for DUV lasers operating at RT.
• #### SeedQuant: A deep learning-based tool for assessing stimulant and inhibitor activity on root parasitic seeds.

(Plant physiology, Oxford University Press (OUP), 2021-04-15) [Article]
Witchweeds (Striga spp.) and broomrapes (Orobanchaceae and Phelipanche spp.) are root parasitic plants that infest many crops in warm and temperate zones, causing enormous yield losses and endangering global food security. Seeds of these obligate parasites require rhizospheric, host-released stimulants to germinate, which opens up possibilities for controlling them by applying specific germination inhibitors or synthetic stimulants that induce lethal germination in the host's absence. To determine their effect on germination, root exudates or synthetic stimulants/inhibitors are usually applied to parasitic seeds in in vitro bioassays, followed by assessment of germination ratios. Although these protocols are very sensitive, the germination recording process is laborious, representing a challenge for researchers and impeding high-throughput screens. Here, we developed an automatic seed census tool to count and discriminate germinated from non-germinated seeds. We combined deep learning, a powerful data-driven framework that can accelerate the procedure and increase its accuracy, for object detection with the latest developments in computer vision, based on the Faster R-CNN algorithm. Our method showed an accuracy of 94% in counting seeds of Striga hermonthica and reduced the required time from ~5 minutes to 5 seconds per image. Our proposed software, SeedQuant, will be of great help for seed germination bioassays and enable high-throughput screening for germination stimulants/inhibitors. SeedQuant is open-source software that can be further trained to count different types of seeds for research purposes.
• #### Multi-index ensemble Kalman filtering

(arXiv, 2021-04-15) [Preprint]
In this work we marry multi-index Monte Carlo with ensemble Kalman filtering (EnKF) to produce the multi-index EnKF method (MIEnKF). The MIEnKF method is based on independent samples of four-coupled EnKF estimators on a multi-index hierarchy of resolution levels, and it may be viewed as an extension of the multilevel EnKF (MLEnKF) method developed by the same authors in 2020. Multi-index here refers to a two-index method, consisting of a hierarchy of EnKF estimators that are coupled in two degrees of freedom: time discretization and ensemble size. Under certain assumptions, the MIEnKF method is proven to be more tractable than EnKF and MLEnKF, and this is also verified in numerical examples.
• #### Optimized Runge-Kutta Methods with Automatic Step Size Control for Compressible Computational Fluid Dynamics

(arXiv, 2021-04-14) [Preprint]
We develop error-control based time integration algorithms for compressible fluid dynamics (CFD) applications and show that they are efficient and robust in both the accuracy-limited and stability-limited regime. Focusing on discontinuous spectral element semidiscretizations, we design new controllers for existing methods and for some new embedded Runge-Kutta pairs. We demonstrate the importance of choosing adequate controller parameters and provide a means to obtain these in practice. We compare a wide range of error-control-based methods, along with the common approach in which step size control is based on the Courant-Friedrichs-Lewy (CFL) number. The optimized methods give improved performance and naturally adopt a step size close to the maximum stable CFL number at loose tolerances, while additionally providing control of the temporal error at tighter tolerances. The numerical examples include challenging industrial CFD applications.
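As a rough illustration of what error-based step size control means in the abstract above, the sketch below integrates a scalar ODE with an embedded Heun/Euler pair and a basic integral (I) controller. This is an assumption-laden toy, not one of the optimized Runge-Kutta pairs or controllers from the paper: the function name `integrate_adaptive`, the safety factor 0.9, the growth limits 0.2 and 2.0, and the test problem are all hypothetical choices.

```python
import math

def integrate_adaptive(f, t0, y0, t_end, tol=1e-6, dt=0.1):
    """Integrate y' = f(t, y) from t0 to t_end with error-based step control."""
    t, y = t0, y0
    accepted = 0
    rejected = 0
    while t < t_end:
        last = False
        if t_end - t <= dt:
            dt = t_end - t          # shorten the final step to land on t_end
            last = True
        k1 = f(t, y)
        k2 = f(t + dt, y + dt * k1)
        y_low = y + dt * k1                  # first-order (Euler) solution
        y_high = y + 0.5 * dt * (k1 + k2)    # second-order (Heun) solution
        err = abs(y_high - y_low)            # embedded local error estimate
        if err <= tol:
            t = t_end if last else t + dt
            y = y_high
            accepted += 1
        else:
            rejected += 1
        # I controller: the estimate scales like dt^2, hence the square root.
        factor = 0.9 * math.sqrt(tol / max(err, 1e-16))
        dt *= min(2.0, max(0.2, factor))
    return y, accepted, rejected

# Hypothetical test problem: y' = -y, y(0) = 1, integrated to t = 1.
y_end, n_acc, n_rej = integrate_adaptive(lambda t, y: -y, 0.0, 1.0, 1.0)
```

The step size here is chosen from the estimated local error rather than from a CFL number, so it shrinks when the tolerance is tight and grows when the solution is smooth. A stability-limited problem would additionally need the controller to settle near the largest stable step, which this toy does not attempt to show.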
• #### Aligning Latent and Image Spaces to Connect the Unconnectable

(arXiv, 2021-04-14) [Preprint]
In this work, we develop a method to generate infinite high-resolution images with diverse and complex content. It is based on a perfectly equivariant generator with synchronous interpolations in the image and latent spaces. Latent codes, when sampled, are positioned on the coordinate grid, and each pixel is computed from an interpolation of the nearby style codes. We modify the AdaIN mechanism to work in such a setup and train the generator in an adversarial setting to produce images positioned between any two latent vectors. At test time, this allows for generating complex and diverse infinite images and connecting any two unrelated scenes into a single arbitrarily large panorama. Apart from that, we introduce LHQ: a new dataset of high-resolution nature landscapes. We test the approach on LHQ, LSUN Tower and LSUN Bridge and outperform the baselines by at least 4 times in terms of quality and diversity of the produced infinite images. The project page is located at https://universome.github.io/alis.
• #### Temporally-Aware Feature Pooling for Action Spotting in Soccer Broadcasts

(arXiv, 2021-04-14) [Preprint]
• #### Toward Improving the Internet of Things: Quality of Service and Fault Tolerance Perspectives

(2021-04-13) [Dissertation]
Committee members: Alouini, Mohamed-Slim; Zhang, Xiangliang; Bessani, Alysson
The Internet of Things (IoT) is a technology aimed at developing a global network of machines and devices that can interact and communicate with each other. Supporting IoT, therefore, requires revisiting the Internet's best-effort service model and reviewing its complex communication patterns. In this dissertation, we explore the unique characteristics of IoT traffic and examine IoT systems. Our work is motivated by the new capabilities offered by modern Software Defined Networks (SDN) and blockchain technology. We evaluate IoT Quality of Service (QoS) in traditional networking. We obtain mathematical expressions to calculate end-to-end delay and dropping. Our results provide insight into the advantages of an intelligent edge serving as a detection mechanism. Subsequently, we propose SADIQ, SDN-based Application-aware Dynamic Internet of things QoS. SADIQ provides context-driven QoS for IoT applications by allowing applications to express their requirements using a high-level SQL-like policy language. Our results show that SADIQ improves the percentage of regions with an error in their reported temperature for the Weather Signal application up to 45 times; and it improves the percentage of incorrect parking statuses for regions with high occupancy for the Smart Parking application up to 30 times under the same network conditions and drop rates. Despite centralization and the control of data, IoT systems are not safe from cyber-crime, privacy issues, and security breaches. Therefore, we explore blockchain technology. In the context of IoT, Byzantine fault tolerance-based consensus protocols are used. However, the blockchain consensus layer contributes to the most remarkable performance overhead especially for IoT applications subject to maximum delay constraints. In order to capture the unique requirements of the IoT, consensus mechanisms and block formation need to be redesigned. To this end, we propose Synopsis, a novel hierarchical blockchain system. Synopsis introduces a wireless-optimized Byzantine chain replication protocol and a new probabilistic data structure. The results show that Synopsis successfully reduces the memory footprint from Megabytes to a few Kilobytes with an improvement of 1000 times. Synopsis also enables reductions in message complexity and commitment delay of 85% and 99.4%, respectively.
• #### Significant variants of type 2 diabetes in the Arabian Region through an Integration of exome databases.

(PloS one, Public Library of Science (PLoS), 2021-04-13) [Article]
Type 2 diabetes (T2D) is a major global health issue, and it has also become one of the major diseases in Arab countries. In addition to the exome databases that have already been established, whole exome sequencing data for the Greater Middle East are now available. To elucidate the genetic features of T2D in the Arabian Peninsula, we integrated two exome databases (gnomAD exome and the Greater Middle East Variome Project) with clinical information from ClinVar. After the integration, we obtained 18 single nucleotide polymorphisms and found two statistically and clinically significant variants in two genes, SLC30A8 rs13266634 and KCNJ11 rs5219. Interestingly, the two genes are linked to the uptake of the metals Zn and K, respectively, which indicates the regional features of the genetic variants. The frequency of the risk allele of rs13266634 among individuals in the Arabian Peninsula was higher than among individuals in other regions. On the other hand, the frequency of the risk allele of rs5219 in the Arabian Peninsula was lower than that in other regions. We identified and characterized T2D-related variants that show unique tendencies in the Arabian Peninsula. Our analyses contribute to and provide guidance for the clinical research of T2D in the Arabian Peninsula.
• #### Improved characterisation of clinical text through ontology-based vocabulary expansion.

(Journal of biomedical semantics, Springer Nature, 2021-04-13) [Article]
Background: Biomedical ontologies contain a wealth of metadata that constitutes a fundamental infrastructural resource for text mining. For several reasons, redundancies exist in the ontology ecosystem, which lead to the same entities being described by several concepts in the same or similar contexts across several ontologies. While these concepts describe the same entities, they contain different sets of complementary metadata. Linking these definitions to make use of their combined metadata could lead to improved performance in ontology-based information retrieval, extraction, and analysis tasks. Results: We develop and present an algorithm that expands the set of labels associated with an ontology class using a combination of strict lexical matching and cross-ontology reasoner-enabled equivalency queries. Across all disease terms in the Disease Ontology, the approach found 51,362 additional labels, more than tripling the number defined by the ontology itself. Manual validation by a clinical expert on a random sampling of expanded synonyms over the Human Phenotype Ontology yielded a precision of 0.912. Furthermore, we found that annotating patient visits in MIMIC-III with an extended set of Disease Ontology labels led to the semantic similarity score derived from those labels being a significantly better predictor of matching first diagnosis, with a mean average precision of 0.88 for the unexpanded set of annotations, and 0.913 for the expanded set. Conclusions: Inter-ontology synonym expansion can lead to a vast increase in the scale of vocabulary available for text mining applications. While the accuracy of the extended vocabulary is not perfect, it nevertheless led to a significantly improved ontology-based characterisation of patients from text in one setting. Furthermore, where run-on error is not acceptable, the technique can be used to provide candidate synonyms which can be checked by a domain expert.
• #### Big Communications: Connect the Unconnected

(arXiv, 2021-04-13) [Preprint]
In this article, we present the analysis of the digital divide to illustrate the unfair access to the benefits brought by information and communications technology (ICT) over the globe and provide our solution termed big communications (BigCom) to close the digital divide and democratize the benefits of ICT. To facilitate the implementation of BigCom, we give a complete framework of BigCom considering both technological and non-technological factors as well as a set of suggestions for content providers, mobile network operators, and governments. By implementing BigCom, we aim to connect the last four billion unconnected people living in far-flung and underdeveloped areas and achieve the goal of global and ubiquitous connectivity for everyone in the 6G era.
• #### Investigation of InGaN-based red/green micro-light-emitting diodes

(Optics Letters, The Optical Society, 2021-04-12) [Article]
We investigated the performance of InGaN-based red/green micro-light-emitting diodes (µLEDs) ranging from 98 × 98 µm² to 17 × 17 µm². The average forward voltage at 10 A/cm² was independent of the dimension of µLEDs. Red µLEDs exhibited a larger blueshift of the peak wavelength (∼35 nm) and broader full-width at half maximum (≥50 nm) at 2−50 A/cm² compared to green µLEDs. We demonstrated that 47 × 47 µm² red µLEDs had an on-wafer external quantum efficiency of 0.36% at the peak wavelength of 626 nm, close to the red primary color defined in the Recommendation 2020 standard.
• #### Wide-field-of-view optical detectors using fused fiber-optic tapers

(Optics Letters, The Optical Society, 2021-04-12) [Article]
Photodetectors used in wireless applications suffer from a trade-off between their response speeds and their active areas, which limits the received signal-to-noise ratio (SNR). Conventional light-focusing elements used to improve the SNR narrow the field of view (FOV). Herein, we demonstrate a versatile imaging light-focusing element featuring a wide FOV and high optical gain using fused fiber-optic tapers. To verify the practicality of the proposed design, we demonstrated and tested a wide-FOV optical detector for optical wireless communication that can be used for wavelengths ranging from the visible-light band to the near infrared. The proposed detector offers improvements over luminescent wide-FOV detectors, including higher efficiency, a broader modulation bandwidth, and indefinite stability.
• #### Snapshot space-time holographic three-dimensional particle tracking velocimetry

(Laser & Photonics Reviews, Wiley-VCH, 2021-04-12) [Article]
Digital inline holography is an amazingly simple and effective approach for three-dimensional imaging, for which particle tracking velocimetry is an application of particular interest. Conventional digital holographic particle tracking velocimetry techniques treat particle and flow reconstruction as separate computations, and are computationally expensive. Usually, the particle volumes are recovered first, from which fluid flows are computed. Without iterative reconstructions, this sequential space-time process lacks accuracy. This paper presents a joint optimization framework for digital holographic particle tracking velocimetry: particle volumes and fluid flows are reconstructed jointly in a higher space-time dimension, enabling faster convergence and better reconstruction quality of both fluid flow and particle volumes within a few minutes on modern GPUs. Synthetic and experimental results are presented to show the efficiency of the proposed technique.
• #### Combined effects of hydrometeorological hazards and urbanisation on dengue risk in Brazil: a spatiotemporal modelling study.

(The Lancet. Planetary health, Elsevier BV, 2021-04-11) [Article]
Temperature and rainfall patterns are known to influence seasonal patterns of dengue transmission. However, the effect of severe drought and extremely wet conditions on the timing and intensity of dengue epidemics is poorly understood. In this study, we aimed to quantify the non-linear and delayed effects of extreme hydrometeorological hazards on dengue risk by level of urbanisation in Brazil using a spatiotemporal model. We combined distributed lag non-linear models with a spatiotemporal Bayesian hierarchical model framework to determine the exposure-lag-response association between the relative risk (RR) of dengue and a drought severity index. We fit the model to monthly dengue case data for the 558 microregions of Brazil between January, 2001, and January, 2019, accounting for unobserved confounding factors, spatial autocorrelation, seasonality, and interannual variability. We assessed the variation in RR by level of urbanisation through an interaction between the drought severity index and urbanisation. We also assessed the effect of hydrometeorological hazards on dengue risk in areas with a high frequency of water supply shortages. The dataset included 12 895 293 dengue cases reported between 2001 and 2019 in Brazil. Overall, the risk of dengue increased between 0-3 months after extremely wet conditions (maximum RR at 1 month lag 1·56 [95% CI 1·41-1·73]) and 3-5 months after drought conditions (maximum RR at 4 months lag 1·43 [1·22-1·67]). Including a linear interaction between the drought severity index and level of urbanisation improved the model fit and showed the risk of dengue was higher in more rural areas than highly urbanised areas during extremely wet conditions (maximum RR 1·77 [1·32-2·37] at 0 months lag vs maximum RR 1·58 [1·39-1·81] at 2 months lag), but higher in highly urbanised areas than rural areas after extreme drought (maximum RR 1·60 [1·33-1·92] vs 1·15 [1·08-1·22], both at 4 months lag). 
We also found the dengue risk following extreme drought was higher in areas that had a higher frequency of water supply shortages. Wet conditions and extreme drought can increase the risk of dengue with different delays. The risk associated with extremely wet conditions was higher in more rural areas and the risk associated with extreme drought was exacerbated in highly urbanised areas, which have water shortages and intermittent water supply during droughts. These findings have implications for targeting mosquito control activities in poorly serviced urban areas, not only during the wet and warm season, but also during drought periods. Funding: Royal Society, Medical Research Council, Wellcome Trust, National Institutes of Health, Fundação Carlos Chagas Filho de Amparo à Pesquisa do Estado do Rio de Janeiro, and Conselho Nacional de Desenvolvimento Científico e Tecnológico. For the Portuguese translation of the abstract see Supplementary Materials section.
• #### A class of high-order weighted compact central schemes for solving hyperbolic conservation laws

(arXiv, 2021-04-09) [Preprint]
We propose a class of weighted compact central (WCC) schemes for solving hyperbolic conservation laws. The linear version can be considered as a high-order extension of the central Lax-Friedrichs (LxF) scheme and the central conservation element and solution element (CESE) scheme. On every cell, the solution is approximated by a Pth order polynomial of which all the DOFs are stored and updated separately. The cell average is updated by a classical finite volume scheme which is constructed based on space-time staggered meshes such that the fluxes are continuous across the interfaces of the adjacent control volumes and, therefore, the local Riemann problem is bypassed. The kth order spatial derivatives are updated by a central difference of (k-1)th order spatial derivatives at cell vertices. All the space-time information is calculated by the Cauchy-Kovalewski procedure. By doing so, the schemes are able to achieve arbitrarily uniform spacetime high order on a super-compact stencil with only one explicit time step. In order to capture discontinuities without spurious oscillations, a weighted essentially non-oscillatory (WENO) type limiter is tailor-made for the schemes. The limiter preserves the compactness and high order accuracy of the schemes. The accuracy, robustness, and efficiency of the schemes are verified by several numerical examples of scalar conservation laws and the compressible Euler equations.
• #### Massive Access in Media Modulation Based Massive Machine-Type Communications

(arXiv, 2021-04-08) [Preprint]
The massive machine-type communications (mMTC) paradigm based on media modulation in conjunction with massive MIMO base stations (BSs) is emerging as a viable solution to support the massive connectivity for the future Internet-of-Things, in which the inherent massive access at the BSs poses significant challenges for device activity and data detection (DADD). This paper considers the DADD problem for both uncoded and coded media modulation based mMTC with a slotted access frame structure, where the device activity remains unchanged within one frame. Specifically, due to the slotted access frame structure and the adopted media modulated symbols, the access signals exhibit a doubly structured sparsity in both the time domain and the modulation domain. Inspired by this, a doubly structured approximate message passing (DS-AMP) algorithm is proposed for reliable DADD in the uncoded case. Also, we derive the state evolution of the DS-AMP algorithm to theoretically characterize its performance. As for the coded case, we develop a bit-interleaved coded media modulation scheme and propose an iterative DS-AMP (IDS-AMP) algorithm based on successive interference cancellation (SIC), where the signal components associated with the detected active devices are successively subtracted to improve the data decoding performance. In addition, the channel estimation problem for media modulation based mMTC is discussed and an efficient data-aided channel state information (CSI) update strategy is developed to reduce the training overhead in block fading channels. Finally, simulation results and computational complexity analysis verify the superiority of the proposed algorithms in both uncoded and coded cases. Also, our results verify the validity of the proposed data-aided CSI update strategy.
• #### Distributed Resource Management in Downlink Cache-enabled Multi-cloud Radio Access Networks

(arXiv, 2021-04-08) [Preprint]
In the scope of beyond fifth generation (B5G) networks and the massive increase of data-hungry systems, the need to extend conventional single-cloud radio access networks (C-RAN) arises. A compound of several clouds, jointly managing inter-cloud and intra-cloud interference, constitutes a practical solution to cope with requirements of B5G networks. This paper considers a multi-cloud radio access network model (MC-RAN) where each cloud is connected to a distinct set of base stations (BSs) via limited capacity fronthaul links. The BSs are equipped with local cache storage and base-band processing capabilities, as a means to alleviate the fronthaul congestion problem. The paper then investigates the problem of jointly assigning users to clouds and determining their beamforming vectors so as to maximize the network-wide energy efficiency (EE) subject to fronthaul capacity and transmit power constraints. This paper solves such a mixed discrete-continuous, non-convex optimization problem using fractional programming (FP) and successive inner-convex approximation (SICA) techniques to deal with the non-convexity of the continuous part of the problem, and $l_0$-norm approximation to account for the binary association part. A highlight of the proposed algorithm is its capability of being implemented in a distributed fashion across the network's multiple clouds through a reasonable amount of information exchange. The numerical simulations illustrate the pronounced role the proposed algorithm plays in alleviating the interference of large-scale MC-RANs, especially in dense networks.
• #### On Telecommunication Service Imbalance and Infrastructure Resource Deployment

(arXiv, 2021-04-08) [Preprint]
The digital divide restricting the access of people living in developing areas to the benefits of modern information and communications technologies has become a major challenge and research focus. To properly understand and ultimately bridge the digital divide, we first need to discover a proper measure to characterize and quantify the telecommunication service imbalance. In this regard, we propose a fine-grained and easy-to-compute imbalance index, aiming to quantitatively link the relation among telecommunication service imbalance, telecommunication infrastructure, and demographic distribution. The mathematically elegant and generic form of the imbalance index allows consistent analyses for heterogeneous scenarios and can be easily tailored to incorporate different telecommunication policies and application scenarios. Based on this index, we also propose an infrastructure resource deployment strategy by minimizing the average imbalance index of any geographical segment. Experimental results verify the effectiveness of the proposed imbalance index by showing a high degree of correlation to existing congeneric but coarse-grained measures and the superiority of the infrastructure resource deployment strategy.