### Recent Submissions

• #### Camera Calibration and Player Localization in SoccerNet-v2 and Investigation of their Representations for Action Spotting

(arXiv, 2021-04-19) [Preprint]
Soccer broadcast video understanding has been drawing a lot of attention in recent years among data scientists and industrial companies. This is mainly due to the lucrative potential unlocked by effective deep learning techniques developed in the field of computer vision. In this work, we focus on the topic of camera calibration and on its current limitations for the scientific community. More precisely, we tackle the absence of a large-scale calibration dataset and of a public calibration network trained on such a dataset. Specifically, we distill a powerful commercial calibration tool in a recent neural network architecture on the large-scale SoccerNet dataset, composed of untrimmed broadcast videos of 500 soccer games. We further release our distilled network, and leverage it to provide 3 ways of representing the calibration results along with player localization. Finally, we exploit those representations within the current best architecture for the action spotting task of SoccerNet-v2, and achieve new state-of-the-art performance.
• #### Random Reshuffling with Variance Reduction: New Analysis and Better Rates

(arXiv, 2021-04-19) [Preprint]
Virtually all state-of-the-art methods for training supervised machine learning models are variants of SGD enhanced with a number of additional tricks, such as minibatching, momentum, and adaptive stepsizes. One of the tricks that works so well in practice that it is used as default in virtually all widely used machine learning software is *random reshuffling (RR)*. However, the practical benefits of RR have until very recently been eluding attempts at being satisfactorily explained using theory. Motivated by recent development due to Mishchenko, Khaled and Richtárik (2020), in this work we provide the first analysis of SVRG under Random Reshuffling (RR-SVRG) for general finite-sum problems. First, we show that RR-SVRG converges linearly with the rate $\mathcal{O}(\kappa^{3/2})$ in the strongly-convex case, and can be improved further to $\mathcal{O}(\kappa)$ in the big data regime (when $n > \mathcal{O}(\kappa)$), where $\kappa$ is the condition number. This improves upon the previous best rate $\mathcal{O}(\kappa^2)$ known for a variance reduced RR method in the strongly-convex case due to Ying, Yuan and Sayed (2020). Second, we obtain the first sublinear rate for general convex problems. Third, we establish similar fast rates for Cyclic-SVRG and Shuffle-Once-SVRG. Finally, we develop and analyze a more general variance reduction scheme for RR, which allows for less frequent updates of the control variate. We corroborate our theoretical results with suitably chosen experiments on synthetic and real datasets.
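
As a rough illustration of the idea in the abstract above, the following sketch combines an SVRG-style control variate with a fresh random permutation in every epoch. It is not taken from the paper: the scalar least-squares objective, the function name `rr_svrg`, and the choice to refresh the reference point once per epoch are all assumptions made for the example.

```python
import random


def rr_svrg(a, b, x0, stepsize, epochs, seed=0):
    """Sketch of SVRG with random reshuffling (hypothetical helper).

    Minimizes f(x) = (1/n) * sum_i 0.5 * (a[i] * x - b[i])**2 over a scalar x.
    Assumption: the control variate (reference point and its full gradient)
    is refreshed at the start of every epoch.
    """
    rng = random.Random(seed)
    n = len(a)

    def grad_i(i, x):
        # Gradient of the i-th summand 0.5 * (a[i] * x - b[i])**2.
        return a[i] * (a[i] * x - b[i])

    x = x0
    for _ in range(epochs):
        ref = x
        full_grad = sum(grad_i(i, ref) for i in range(n)) / n
        order = list(range(n))
        rng.shuffle(order)  # new permutation each epoch: every index used exactly once
        for i in order:
            # Variance-reduced gradient estimate.
            g = grad_i(i, x) - grad_i(i, ref) + full_grad
            x -= stepsize * g
    return x


a = [1.0, 2.0, 3.0]
b = [1.0, 5.0, 5.0]
x_hat = rr_svrg(a, b, x0=0.0, stepsize=0.05, epochs=100)
```

Replacing `rng.shuffle(order)` with a permutation drawn once before the loop, or with the fixed order `range(n)`, would give sketches of the Shuffle-Once and Cyclic variants mentioned in the abstract.
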
• #### Multi-index ensemble Kalman filtering

(arXiv, 2021-04-15) [Preprint]
In this work we marry multi-index Monte Carlo with ensemble Kalman filtering (EnKF) to produce the multi-index EnKF method (MIEnKF). The MIEnKF method is based on independent samples of four-coupled EnKF estimators on a multi-index hierarchy of resolution levels, and it may be viewed as an extension of the multilevel EnKF (MLEnKF) method developed by the same authors in 2020. Multi-index here refers to a two-index method, consisting of a hierarchy of EnKF estimators that are coupled in two degrees of freedom: time discretization and ensemble size. Under certain assumptions, the MIEnKF method is proven to be more tractable than EnKF and MLEnKF, and this is also verified in numerical examples.
• #### Optimized Runge-Kutta Methods with Automatic Step Size Control for Compressible Computational Fluid Dynamics

(arXiv, 2021-04-14) [Preprint]
We develop error-control based time integration algorithms for compressible fluid dynamics (CFD) applications and show that they are efficient and robust in both the accuracy-limited and stability-limited regime. Focusing on discontinuous spectral element semidiscretizations, we design new controllers for existing methods and for some new embedded Runge-Kutta pairs. We demonstrate the importance of choosing adequate controller parameters and provide a means to obtain these in practice. We compare a wide range of error-control-based methods, along with the common approach in which step size control is based on the Courant-Friedrichs-Lewy (CFL) number. The optimized methods give improved performance and naturally adopt a step size close to the maximum stable CFL number at loose tolerances, while additionally providing control of the temporal error at tighter tolerances. The numerical examples include challenging industrial CFD applications.
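
To make the contrast with CFL-based stepping concrete, here is a minimal sketch of error-based step size control with an embedded Runge-Kutta pair. It uses the simple Heun-Euler 2(1) pair and a basic integral controller, which are stand-ins chosen for brevity and not the optimized pairs or controllers of the paper; the function name, safety factor, and step-change limits are assumptions.

```python
import math


def integrate_adaptive(f, t0, y0, t_end, tol=1e-6, dt=0.1, safety=0.9):
    """Integrate the scalar ODE y' = f(t, y) with an embedded Heun-Euler pair.

    Assumed (illustrative) controller: dt_new = dt * safety * sqrt(tol / err),
    clipped to [0.2, 5.0] times the old step. Returns (y, accepted, rejected).
    """
    t, y = t0, y0
    accepted = rejected = 0
    while t < t_end:
        dt = min(dt, t_end - t)  # do not step past the final time
        k1 = f(t, y)
        k2 = f(t + dt, y + dt * k1)
        y_high = y + 0.5 * dt * (k1 + k2)  # Heun, second order
        y_low = y + dt * k1                # explicit Euler, first order
        err = abs(y_high - y_low)          # local error estimate
        if err <= tol:
            t, y = t + dt, y_high
            accepted += 1
        else:
            rejected += 1
        # Exponent 1/2 because the error estimate scales like dt**2.
        factor = safety * (tol / max(err, 1e-16)) ** 0.5
        dt *= min(5.0, max(0.2, factor))
    return y, accepted, rejected


y_end, n_acc, n_rej = integrate_adaptive(lambda t, y: -y, 0.0, 1.0, 1.0)
```

The step size here is set only by the requested tolerance `tol`, with no CFL number supplied by the user.
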
• #### Aligning Latent and Image Spaces to Connect the Unconnectable

(arXiv, 2021-04-14) [Preprint]
In this work, we develop a method to generate infinite high-resolution images with diverse and complex content. It is based on a perfectly equivariant generator with synchronous interpolations in the image and latent spaces. Latent codes, when sampled, are positioned on the coordinate grid, and each pixel is computed from an interpolation of the nearby style codes. We modify the AdaIN mechanism to work in such a setup and train the generator in an adversarial setting to produce images positioned between any two latent vectors. At test time, this allows for generating complex and diverse infinite images and connecting any two unrelated scenes into a single arbitrarily large panorama. Apart from that, we introduce LHQ: a new dataset of \lhqsize high-resolution nature landscapes. We test the approach on LHQ, LSUN Tower and LSUN Bridge and outperform the baselines by at least 4 times in terms of quality and diversity of the produced infinite images. The project page is located at https://universome.github.io/alis.
• #### Temporally-Aware Feature Pooling for Action Spotting in Soccer Broadcasts

(arXiv, 2021-04-14) [Preprint]
• #### Wavefield reconstruction inversion via physics-informed neural networks

(arXiv, 2021-04-14) [Preprint]
Wavefield reconstruction inversion (WRI) formulates a PDE-constrained optimization problem to reduce cycle skipping in full-waveform inversion (FWI). WRI often requires expensive matrix inversions to reconstruct frequency-domain wavefields. Physics-informed neural network (PINN) uses the underlying physical laws as loss functions to train the neural network (NN), and it has shown its effectiveness in solving the Helmholtz equation and generating Green's functions, specifically for the scattered wavefield. By including a data-constrained term in the loss function, the trained NN can reconstruct a wavefield that simultaneously fits the recorded data and satisfies the Helmholtz equation for a given initial velocity model. Using the predicted wavefields, we rely on a small-size NN to predict the velocity using the reconstructed wavefield. In this velocity prediction NN, spatial coordinates are used as input data to the network and the scattered Helmholtz equation is used to define the loss function. After we train this network, we are able to predict the velocity in the domain of interest. We develop this PINN-based WRI method and demonstrate its potential using a part of the Sigsbee2A model and a modified Marmousi model. The results show that the PINN-based WRI is able to invert for a reasonable velocity with very limited iterations and frequencies, which can be used in a subsequent FWI application.
• #### Big Communications: Connect the Unconnected

(arXiv, 2021-04-13) [Preprint]
In this article, we present the analysis of the digital divide to illustrate the unfair access to the benefits brought by information and communications technology (ICT) over the globe and provide our solution termed big communications (BigCom) to close the digital divide and democratize the benefits of ICT. To facilitate the implementation of BigCom, we give a complete framework of BigCom considering both technological and non-technological factors as well as a set of suggestions for content providers, mobile network operators, and governments. By implementing BigCom, we aim to connect the last four billion unconnected people living in far-flung and underdeveloped areas and achieve the goal of global and ubiquitous connectivity for everyone in the 6G era.
• #### A class of high-order weighted compact central schemes for solving hyperbolic conservation laws

(arXiv, 2021-04-09) [Preprint]
We propose a class of weighted compact central (WCC) schemes for solving hyperbolic conservation laws. The linear version can be considered as a high-order extension of the central Lax-Friedrichs (LxF) scheme and the central conservation element and solution element (CESE) scheme. On every cell, the solution is approximated by a Pth order polynomial of which all the DOFs are stored and updated separately. The cell average is updated by a classical finite volume scheme which is constructed based on space-time staggered meshes such that the fluxes are continuous across the interfaces of the adjacent control volumes and, therefore, the local Riemann problem is bypassed. The kth order spatial derivatives are updated by a central difference of (k-1)th order spatial derivatives at cell vertices. All the space-time information is calculated by the Cauchy-Kovalewski procedure. By doing so, the schemes are able to achieve arbitrarily uniform spacetime high order on a super-compact stencil with only one explicit time step. In order to capture discontinuities without spurious oscillations, a weighted essentially non-oscillatory (WENO) type limiter is tailor-made for the schemes. The limiter preserves the compactness and high order accuracy of the schemes. The accuracy, robustness, and efficiency of the schemes are verified by several numerical examples of scalar conservation laws and the compressible Euler equations.
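
For readers unfamiliar with the baseline that the abstract extends, the sketch below shows one step of the classical first-order Lax-Friedrichs scheme for linear advection on a periodic grid. It is the textbook non-staggered form, not the WCC scheme itself; the function name and the `cfl` parameter are assumptions for illustration.

```python
def lax_friedrichs_step(u, cfl):
    """One classical Lax-Friedrichs step for u_t + a * u_x = 0, periodic grid.

    cfl = a * dt / dx; the scheme is stable for |cfl| <= 1.
    """
    n = len(u)
    new = []
    for i in range(n):
        left = u[i - 1]         # index -1 wraps around to the last cell
        right = u[(i + 1) % n]
        # Average of neighbours minus a central difference of the flux a * u.
        new.append(0.5 * (left + right) - 0.5 * cfl * (right - left))
    return new


u0 = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]
u1 = lax_friedrichs_step(u0, 0.5)
```

No Riemann solver appears in the update, which is the property of central schemes that the abstract refers to; the price in this first-order form is strong numerical diffusion.
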
• #### Massive Access in Media Modulation Based Massive Machine-Type Communications

(arXiv, 2021-04-08) [Preprint]
The massive machine-type communications (mMTC) paradigm based on media modulation in conjunction with massive MIMO base stations (BSs) is emerging as a viable solution to support the massive connectivity for the future Internet-of-Things, in which the inherent massive access at the BSs poses significant challenges for device activity and data detection (DADD). This paper considers the DADD problem for both uncoded and coded media modulation based mMTC with a slotted access frame structure, where the device activity remains unchanged within one frame. Specifically, due to the slotted access frame structure and the adopted media modulated symbols, the access signals exhibit a doubly structured sparsity in both the time domain and the modulation domain. Inspired by this, a doubly structured approximate message passing (DS-AMP) algorithm is proposed for reliable DADD in the uncoded case. Also, we derive the state evolution of the DS-AMP algorithm to theoretically characterize its performance. As for the coded case, we develop a bit-interleaved coded media modulation scheme and propose an iterative DS-AMP (IDS-AMP) algorithm based on successive interference cancellation (SIC), where the signal components associated with the detected active devices are successively subtracted to improve the data decoding performance. In addition, the channel estimation problem for media modulation based mMTC is discussed and an efficient data-aided channel state information (CSI) update strategy is developed to reduce the training overhead in block fading channels. Finally, simulation results and computational complexity analysis verify the superiority of the proposed algorithms in both uncoded and coded cases. Also, our results verify the validity of the proposed data-aided CSI update strategy.
• #### Distributed Resource Management in Downlink Cache-enabled Multi-cloud Radio Access Networks

(arXiv, 2021-04-08) [Preprint]
In the scope of beyond fifth generation (B5G) networks and the massive increase of data-hungry systems, the need to extend conventional single-cloud radio access networks (C-RAN) arises. A compound of several clouds, jointly managing inter-cloud and intra-cloud interference, constitutes a practical solution to cope with the requirements of B5G networks. This paper considers a multi-cloud radio access network model (MC-RAN) where each cloud is connected to a distinct set of base stations (BSs) via limited capacity fronthaul links. The BSs are equipped with local cache storage and base-band processing capabilities, as a means to alleviate the fronthaul congestion problem. The paper then investigates the problem of jointly assigning users to clouds and determining their beamforming vectors so as to maximize the network-wide energy efficiency (EE) subject to fronthaul capacity and transmit power constraints. This paper solves such a mixed discrete-continuous, non-convex optimization problem using fractional programming (FP) and successive inner-convex approximation (SICA) techniques to deal with the non-convexity of the continuous part of the problem, and $l_0$-norm approximation to account for the binary association part. A highlight of the proposed algorithm is its capability of being implemented in a distributed fashion across the network's multiple clouds through a reasonable amount of information exchange. The numerical simulations illustrate the pronounced role the proposed algorithm plays in alleviating the interference of large-scale MC-RANs, especially in dense networks.
• #### On Telecommunication Service Imbalance and Infrastructure Resource Deployment

(arXiv, 2021-04-08) [Preprint]
The digital divide restricting the access of people living in developing areas to the benefits of modern information and communications technologies has become a major challenge and research focus. To better understand and ultimately bridge the digital divide, we first need to discover a proper measure to characterize and quantify the telecommunication service imbalance. In this regard, we propose a fine-grained and easy-to-compute imbalance index, aiming to quantitatively link the relation among telecommunication service imbalance, telecommunication infrastructure, and demographic distribution. The mathematically elegant and generic form of the imbalance index allows consistent analyses for heterogeneous scenarios and can be easily tailored to incorporate different telecommunication policies and application scenarios. Based on this index, we also propose an infrastructure resource deployment strategy by minimizing the average imbalance index of any geographical segment. Experimental results verify the effectiveness of the proposed imbalance index by showing a high degree of correlation to existing congeneric but coarse-grained measures and the superiority of the infrastructure resource deployment strategy.
• #### Quaternion Factorization Machines: A Lightweight Solution to Intricate Feature Interaction Modelling

(arXiv, 2021-04-05) [Preprint]
As a well-established approach, factorization machine (FM) is capable of automatically learning high-order interactions among features to make predictions without the need for manual feature engineering. With the prominent development of deep neural networks (DNNs), there is a recent and ongoing trend of enhancing the expressiveness of FM-based models with DNNs. However, though better results are obtained with DNN-based FM variants, such performance gains come at the cost of an enormous number (usually millions) of additional model parameters on top of the plain FM. Consequently, the heavy parameterization impedes the real-life practicality of those deep models, especially efficient deployment on resource-constrained IoT and edge devices. In this paper, we move beyond the traditional real space where most deep FM-based models are defined, and seek solutions from quaternion representations within the hypercomplex space. Specifically, we propose the quaternion factorization machine (QFM) and quaternion neural factorization machine (QNFM), which are two novel lightweight and memory-efficient quaternion-valued models for sparse predictive analytics. By introducing a brand new take on FM-based models with the notion of quaternion algebra, our models not only enable expressive inter-component feature interactions, but also significantly reduce the parameter size due to lower degrees of freedom in the hypercomplex Hamilton product compared with real-valued matrix multiplication. Extensive experimental results on three large-scale datasets demonstrate that QFM achieves 4.36% performance improvement over the plain FM without introducing any extra parameters, while QNFM outperforms all baselines with up to two orders of magnitude reduction in parameter size in comparison to state-of-the-art peer methods.
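
The Hamilton product mentioned in the abstract above is the standard quaternion multiplication rule. The sketch below implements it for quaternions stored as `(real, i, j, k)` tuples; the tuple layout and function name are assumptions for illustration and say nothing about how QFM or QNFM are actually implemented.

```python
def hamilton_product(p, q):
    """Hamilton product of two quaternions given as (real, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k component
    )


i_unit = (0.0, 1.0, 0.0, 0.0)
j_unit = (0.0, 0.0, 1.0, 0.0)
ij = hamilton_product(i_unit, j_unit)  # i * j = k
ji = hamilton_product(j_unit, i_unit)  # j * i = -k (non-commutative)
```

A single quaternion weight has 4 real degrees of freedom yet mixes all four input components into all four output components, whereas an unconstrained real 4x4 matrix doing the same job has 16. This is the kind of parameter saving the abstract alludes to.
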
• #### Finding Nano-Ötzi: Semi-Supervised Volume Visualization for Cryo-Electron Tomography

(arXiv, 2021-04-04) [Preprint]
Cryo-Electron Tomography (cryo-ET) is a new 3D imaging technique with unprecedented potential for resolving submicron structural detail. Existing volume visualization methods, however, cannot cope with its very low signal-to-noise ratio. In order to design more powerful transfer functions, we propose to leverage soft segmentation as an explicit component of visualization for noisy volumes. Our technical realization is based on semi-supervised learning where we combine the advantages of two segmentation algorithms. A first weak segmentation algorithm provides good results for propagating sparse user provided labels to other voxels in the same volume. This weak segmentation algorithm is used to generate dense pseudo labels. A second powerful deep-learning based segmentation algorithm can learn from these pseudo labels to generalize the segmentation to other unseen volumes, a task that the weak segmentation algorithm fails at completely. The proposed volume visualization uses the deep-learning based segmentation as a component for segmentation-aware transfer function design. Appropriate ramp parameters can be suggested automatically through histogram analysis. Finally, our visualization uses gradient-free ambient occlusion shading to further suppress visual presence of noise, and to give structural detail desired prominence. The cryo-ET data studied throughout our technical experiments is based on the highest-quality tilted series of intact SARS-CoV-2 virions. Our technique shows the high impact in target sciences for visual data analysis of very noisy volumes that cannot be visualized with existing techniques.
• #### Uniting Heterogeneity, Inductiveness, and Efficiency for Graph Representation Learning

(arXiv, 2021-04-04) [Preprint]
With the ubiquitous graph-structured data in various applications, models that can learn compact but expressive vector representations of nodes have become highly desirable. Recently, bearing the message passing paradigm, graph neural networks (GNNs) have greatly advanced the performance of node representation learning on graphs. However, the majority of GNNs are designed only for homogeneous graphs, leading to inferior adaptivity to the more informative heterogeneous graphs with various types of nodes and edges. Also, despite the necessity of inductively producing representations for completely new nodes (e.g., in streaming scenarios), few heterogeneous GNNs can bypass the transductive learning scheme where all nodes must be known during training. Furthermore, the training efficiency of most heterogeneous GNNs has been hindered by their sophisticated designs for extracting the semantics associated with each meta path or relation. In this paper, we propose WIde and DEep message passing Network (WIDEN) to cope with the aforementioned problems about heterogeneity, inductiveness, and efficiency that are rarely investigated together in graph representation learning. In WIDEN, we propose a novel inductive, meta path-free message passing scheme that packs up heterogeneous node features with their associated edges from both low- and high-order neighbor nodes. To further improve the training efficiency, we innovatively present an active downsampling strategy that drops unimportant neighbor nodes to facilitate faster information propagation. Experiments on three real-world heterogeneous graphs have further validated the efficacy of WIDEN on both transductive and inductive node representation learning, as well as the superior training efficiency against state-of-the-art baselines.
• #### PINNtomo: Seismic tomography using physics-informed neural networks

(arXiv, 2021-04-04) [Preprint]
Seismic traveltime tomography using transmission data is widely used to image the Earth's interior from global to local scales. In seismic imaging, it is used to obtain velocity models for subsequent depth-migration or full-waveform inversion. In addition, cross-hole tomography has been successfully applied for a variety of applications, including mineral exploration, reservoir monitoring, and CO2 injection and sequestration. Conventional tomography techniques suffer from a number of limitations, including the use of a smoothing regularizer that is agnostic to the physics of wave propagation. Here, we propose a novel tomography method to address these challenges using developments in the field of scientific machine learning. Using seismic traveltimes observed at seismic stations covering part of the computational model, we train neural networks to approximate the traveltime factor and the velocity fields, subject to the physics-informed regularizer formed by the factored eikonal equation. This allows us to better compensate for the ill-posedness of the tomography problem compared to conventional methods and results in a number of other attractive features, including computational efficiency. We show the efficacy of the proposed method and its capabilities through synthetic tests for surface seismic and cross-hole geometries. Contrary to conventional techniques, we find the performance of the proposed method to be agnostic to the choice of the initial velocity model.
• #### Sparse Code Multiple Access for 6G Wireless Communication Networks: Recent Advances and Future Directions

(arXiv, 2021-04-03) [Preprint]
As 5G networks are rolling out in many different countries, the time has come to investigate how to upgrade and expand them towards 6G, where the latter is expected to realize the interconnection of everything as well as the development of a ubiquitous intelligent mobile world for intelligent life. To enable this epic leap in communications, this article provides an overview and outlook on the application of sparse code multiple access (SCMA) for 6G wireless communication systems, which is an emerging disruptive non-orthogonal multiple access (NOMA) scheme for the enabling of massive connectivity. We propose to apply SCMA to a massively distributed access system (MDAS), whose architecture is based on fiber-based visible light communication (FVLC), ultra-dense network (UDN), and NOMA. Under this framework, we consider the interactions between optical front-hauls and wireless access links. In order to stimulate more upcoming research in this area, we outline a number of promising directions associated with SCMA for faster, more reliable, and more efficient multiple access in future 6G communication networks.
• #### On the Influence of Charging Stations Spatial Distribution on Aerial Wireless Networks

(arXiv, 2021-04-03) [Preprint]
Using drones for cellular coverage enhancement is a recent technology that has shown great potential in various practical scenarios. However, one of the main challenges that limits the performance of drone-enabled wireless networks is the limited flight time. In particular, due to the limited on-board battery size, the drone needs to frequently interrupt its operation and fly back to a charging station to recharge/replace its battery. In addition, the charging station might be responsible for recharging multiple drones. Given that the charging station has limited capacity, it can only serve a finite number of drones simultaneously. Hence, in order to accurately capture the influence of the battery limitation on the performance, it is required to analyze the dynamics of the time spent by the drones at the charging stations. In this paper, we use tools from queuing theory and stochastic geometry to study the influence of the charging stations' limited capacity and spatial density on the performance of a drone-enabled wireless network.
• #### Fast-adapting and Privacy-preserving Federated Recommender System

(arXiv, 2021-04-02) [Preprint]
In the mobile Internet era, recommender systems have become an irreplaceable tool to help users discover useful items, thus alleviating the information overload problem. Recent research on deep neural network (DNN)-based recommender systems has made significant progress in improving prediction accuracy, largely attributed to the widely accessible large-scale user data. Such data is commonly collected from users' personal devices, and then centrally stored in the cloud server to facilitate model training. However, with the rising public concerns on user privacy leakage in online platforms, online users are becoming increasingly anxious over abuses of user privacy. Therefore, it is urgent and beneficial to develop a recommender system that can achieve both high prediction accuracy and strong privacy protection. To this end, we propose a DNN-based recommendation model called PrivRec running on the decentralized federated learning (FL) environment, which ensures that a user's data is fully retained on her/his personal device while contributing to training an accurate model. On the other hand, to better embrace the data heterogeneity (e.g., users' data vary in scale and quality significantly) in FL, we innovatively introduce a first-order meta-learning method that enables fast on-device personalization with only a few data points. Furthermore, to defend against potential malicious participants that pose a serious security threat to other users, we develop a user-level differentially private model, namely DP-PrivRec, so attackers are unable to identify any arbitrary user from the trained model. Finally, we conduct extensive experiments on two large-scale datasets in a simulated FL environment, and the results validate the superiority of both PrivRec and DP-PrivRec.
• #### Spectral Dependence

(arXiv, 2021-03-31) [Preprint]
This paper presents a general framework for modeling dependence in multivariate time series. Its fundamental approach relies on decomposing each signal in a system into various frequency components and then studying the dependence properties through these oscillatory activities. The unifying theme across the paper is to explore the strength of dependence and possible lead-lag dynamics through filtering. The proposed framework is capable of representing both linear and non-linear dependencies that could occur instantaneously or after some delay (lagged dependence). Examples for studying dependence between oscillations are illustrated through multichannel electroencephalograms. These examples emphasized that some of the most prominent frequency domain measures such as coherence, partial coherence, and dual-frequency coherence can be derived as special cases under this general framework. This paper also introduces related approaches for modeling dependence through phase-amplitude coupling and causality of (one-sided) filtered signals.