• A Batch-Incremental Video Background Estimation Model using Weighted Low-Rank Approximation of Matrices

      Dutta, Aritra; Li, Xin; Richtarik, Peter (arXiv, 2017-07-02)
      Principal component pursuit (PCP) is a state-of-the-art approach for background estimation problems. Due to their higher computational cost, PCP algorithms, such as robust principal component analysis (RPCA) and its variants, are not feasible in processing high definition videos. To avoid the curse of dimensionality in those algorithms, several methods have been proposed to solve the background estimation problem in an incremental manner. We propose a batch-incremental background estimation model using a special weighted low-rank approximation of matrices. Through experiments with real and synthetic video sequences, we demonstrate that our method is superior to the state-of-the-art background estimation algorithms such as GRASTA, ReProCS, incPCP, and GFL.
    • Bayesian Modeling of Air Pollution Extremes Using Nested Multivariate Max-Stable Processes

      Vettori, Sabrina; Huser, Raphaël; Genton, Marc G. (arXiv, 2018-03-18)
      Capturing the potentially strong dependence among the peak concentrations of multiple air pollutants across a spatial region is crucial for assessing the related public health risks. In order to investigate the multivariate spatial dependence properties of air pollution extremes, we introduce a new class of multivariate max-stable processes. Our proposed model admits a hierarchical tree-based formulation, in which the data are conditionally independent given some latent nested $\alpha$-stable random factors. The hierarchical structure facilitates Bayesian inference and offers a convenient and interpretable characterization. We fit this nested multivariate max-stable model to the maxima of air pollution concentrations and temperatures recorded at a number of sites in the Los Angeles area, showing that the proposed model succeeds in capturing their complex tail dependence structure.
    • Bayesian Parameter Estimation via Filtering and Functional Approximations

      Matthies, Hermann G.; Litvinenko, Alexander; Rosic, Bojana V.; Zander, Elmar (arXiv, 2016-11-25)
      The inverse problem of determining parameters in a model by comparing some output of the model with observations is addressed. This is a description of what has to be done to use the Gauss-Markov-Kalman filter for the Bayesian estimation and updating of parameters in a computational model. This is a filter acting on random variables, and while its Monte Carlo variant, the Ensemble Kalman Filter (EnKF), is fairly straightforward, we subsequently only sketch its implementation with the help of functional representations.
    • Blind Measurement Selection: A Random Matrix Theory Approach

      Elkhalil, Khalil; Kammoun, Abla; Al-Naffouri, Tareq Y.; Alouini, Mohamed-Slim (arXiv, 2016-12-14)
      This paper considers the problem of selecting a set of $k$ measurements from $n$ available sensor observations. The selected measurements should minimize a certain error function assessing the error in estimating a certain $m$ dimensional parameter vector. The exhaustive search inspecting each of the $n\choose k$ possible choices would require a very high computational complexity and as such is not practical for large $n$ and $k$. Alternative methods with low complexity have recently been investigated but their main drawbacks are that 1) they require perfect knowledge of the measurement matrix and 2) they need to be applied at the pace of change of the measurement matrix. To overcome these issues, we consider the asymptotic regime in which $k$, $n$ and $m$ grow large at the same pace. Tools from random matrix theory are then used to approximate in closed-form the most important error measures that are commonly used. The asymptotic approximations are then leveraged to select properly $k$ measurements exhibiting low values for the asymptotic error measures. Two heuristic algorithms are proposed: the first one merely consists in applying the convex optimization artifice to the asymptotic error measure. The second algorithm is a low-complexity greedy algorithm that attempts to look for a sufficiently good solution for the original minimization problem. The greedy algorithm can be applied to both the exact and the asymptotic error measures and can be thus implemented in blind and channel-aware fashions. We present two potential applications where the proposed algorithms can be used, namely antenna selection for uplink transmissions in large scale multi-user systems and sensor selection for wireless sensor networks. Numerical results are also presented and sustain the efficiency of the proposed blind methods in reaching the performances of channel-aware algorithms.
    • Butterflyfishes as a System for Investigating Pair Bonding

      Nowicki, Jessica; O'Connell, Lauren; Cowman, Peter F; Walker, Stefan; Coker, Darren James; Pratchett, Morgan (Cold Spring Harbor Laboratory, 2017-11-14)
      For many animals, affiliative relationships such as pair bonds form the foundation of society, and are highly adaptive. Animal systems amenable for comparatively studying pair bonding are important for identifying underlying biological mechanisms, but mostly exist in mammals. Better establishing fish systems will enable comparison of pair bonding mechanisms across taxonomically distant lineages that may reveal general underlying principles. We examined the utility of wild butterflyfishes (f: Chaetodontidae; g: Chaetodon) for comparatively studying pair bonding. Stochastic character mapping inferred that within the family, pairing is ancestral, with at least seven independent transitions to group formation and seven transitions to solitary behavior from the late Miocene to recent. In six sympatric and wide-spread species representing a clade with one ancestrally reconstructed transition from paired to solitary grouping, we then verified social systems at Lizard Island, Australia. In situ observations confirmed that Chaetodon baronessa, C. lunulatus, and C. vagabundus are predominantly pair bonding, whereas C. rainfordi, C. plebeius, and C. trifascialis are predominantly solitary. Even in the predominantly pair bonding species, C. lunulatus, a proportion of adults (15%) are solitary. Importantly, inter- and intra-specific differences in social systems do not co-vary with other previously established attributes (geographic occurrence, parental care, diet, or territoriality). Hence, the proposed butterflyfish populations are promising for comparative analyses of pair bonding and its mechanistic underpinnings. Avenues for further developing the system are proposed, including determining whether the utility of these species applies across their geographic disruptions.
    • Characterization of generalized Orlicz spaces

      Ferreira, Rita; Hästö, Peter; Ribeiro, Ana Margarida (arXiv, 2016-12-14)
      The norm in classical Sobolev spaces can be expressed as a difference quotient. This expression can be used to generalize the space to the fractional smoothness case. Because the difference quotient is based on shifting the function, it cannot be used in generalized Orlicz spaces. In its place, we introduce a smoothed difference quotient and show that it can be used to characterize the generalized Orlicz-Sobolev space. Our results are new even in Orlicz spaces and variable exponent spaces.
    • Chromosome-scale comparative sequence analysis unravels molecular mechanisms of genome evolution between two wheat cultivars

      Thind, Anupriya Kaur; Wicker, Thomas; Mueller, Thomas; Ackermann, Patrick M; Steuernagel, Burkhard; Wulff, Brande B.H; Spannagl, Manuel; Twardziok, Sven O; Felder, Marius; Lux, Thomas; Mayer, Klaus F.X; Keller, Beat; Krattinger, Simon G.; International Wheat Genome Sequencing Consortium (Cold Spring Harbor Laboratory, 2018-02-08)
      Background: Recent improvements in DNA sequencing and genome scaffolding have paved the way to generate high-quality de novo assemblies of pseudomolecules representing complete chromosomes of wheat and its wild relatives. These assemblies form the basis to compare the evolutionary dynamics of wheat genomes on a megabase-scale. Results: Here, we provide a comparative sequence analysis of the 700-megabase chromosome 2D between two bread wheat genotypes, the old landrace Chinese Spring and the elite Swiss spring wheat line CH Campala Lr22a. There was a high degree of sequence conservation between the two chromosomes. Analysis of large structural variations revealed four large insertions/deletions (InDels) of >100 kb. Based on the molecular signatures at the breakpoints, unequal crossing over and double-strand break repair were identified as the evolutionary mechanisms that caused these InDels. Three of the large InDels affected copy number of NLRs, a gene family involved in plant immunity. Analysis of single nucleotide polymorphism (SNP) density revealed three haploblocks of 8 Mb, 9 Mb and 48 Mb with a 35-fold increased SNP density compared to the rest of the chromosome. Conclusions: This comparative analysis of two high-quality chromosome assemblies enabled a comprehensive assessment of large structural variations. The insight obtained from this analysis will form the basis of future wheat pan-genome studies.
    • Clustering in Hilbert simplex geometry

      Nielsen, Frank; Sun, Ke (arXiv, 2017-04-03)
      Clustering categorical distributions in the probability simplex is a fundamental primitive often met in applications dealing with histograms or mixtures of multinomials. Traditionally, the differential-geometric structure of the probability simplex has been used either by (i) setting the Riemannian metric tensor to the Fisher information matrix of the categorical distributions, or (ii) defining the information-geometric structure induced by a smooth dissimilarity measure, called a divergence. In this paper, we introduce a novel computationally-friendly non-Riemannian framework for modeling the probability simplex: Hilbert simplex geometry. We discuss the pros and cons of those three statistical modelings, and compare them experimentally for clustering tasks.
    • Coarse Grained Exponential Variational Autoencoders

      Sun, Ke; Zhang, Xiangliang (arXiv, 2017-02-25)
      Variational autoencoders (VAE) often use a Gaussian or categorical distribution to model the inference process. This puts a limit on variational learning because this simplified assumption does not match the true posterior distribution, which is usually much more sophisticated. To break this limitation and apply arbitrary parametric distributions during inference, this paper derives a semi-continuous latent representation, which approximates a continuous density up to a prescribed precision, and is much easier to analyze than its continuous counterpart because it is fundamentally discrete. We showcase the proposition by applying polynomial exponential family distributions as the posterior, which are universal probability density function generators. Our experimental results show consistent improvements over commonly used VAE models.
    • Coherence-based Time Series Clustering for Brain Connectivity Visualization

      Euan, Carolina; Sun, Ying; Ombao, Hernando (arXiv, 2017-11-19)
      We develop the hierarchical cluster coherence (HCC) method for brain signals, a procedure for characterizing connectivity in a network by clustering nodes or groups of channels that display high level of coordination as measured by
    • Combining finite element and finite difference methods for isotropic elastic wave simulations in an energy-conserving manner

      Gao, Longfei; Keyes, David E. (arXiv, 2018-02-22)
      We consider numerical simulation of the isotropic elastic wave equations arising from seismic applications with non-trivial land topography. The more flexible finite element method is applied to the shallow region of the simulation domain to account for the topography, and combined with the more efficient finite difference method that is applied to the deep region of the simulation domain. We demonstrate that these two discretization methods, albeit starting from different formulations of the elastic wave equation, can be joined together smoothly via weakly imposed interface conditions. Discrete energy analysis is employed to derive the proper interface treatment, leading to an overall discretization that is energy-conserving. Numerical examples are presented to demonstrate the efficacy of the proposed interface treatment.
    • Comparative study of electronic and magnetic properties of $M$Pc ($M$ = Fe, Co) molecules physisorbed on 2D MoS$_2$ and graphene

      Haldar, Soumyajyoti; Bhandary, Sumanta; Vovusha, Hakkim; Sanyal, Biplab (arXiv, 2017-09-13)
      In this paper, we present a comparative study of the electronic and magnetic properties of iron phthalocyanine (FePc) and cobalt phthalocyanine (CoPc) molecules physisorbed on monolayers of MoS$_2$ and graphene, using density functional theory. Various types of physisorption sites have been considered for both surfaces. Our calculations reveal that the $M$Pc molecules prefer the S-top position on MoS$_2$. However, on graphene, the FePc molecule prefers the bridge position while the CoPc molecule prefers the top position. The $M$Pc molecules are physisorbed more strongly on the MoS$_2$ surface than on graphene ($\sim$ 2.5 eV higher physisorption energy). Analysis of magnetic properties indicates the presence of a strong spin dipole moment opposite to the spin moment, and hence a huge reduction of the effective spin moment can be observed. Our calculations of magnetic anisotropy energies using both a variational approach and a $2^{nd}$ order perturbation approach indicate no significant changes after physisorption. An out-of-plane easy axis is seen for FePc and an in-plane easy axis for CoPc. Calculations of the work function indicate a reduction of the MoS$_2$ work function by $\sim$ 1 eV due to physisorption of $M$Pc molecules, while it does not change significantly in the case of graphene.
    • Conservation laws arising in the study of forward-forward Mean-Field Games

      Gomes, Diogo A.; Nurbekyan, Levon; Sedjro, Marc (arXiv, 2017-04-24)
      We consider forward-forward Mean Field Game (MFG) models that arise in numerical approximations of stationary MFGs. First, we establish a link between these models and a class of hyperbolic conservation laws as well as certain nonlinear wave equations. Second, we investigate existence and long-time behavior of solutions for such models.
    • Conservation of energy for the Euler-Korteweg equations

      Dębiec, Tomasz; Gwiazda, Piotr; Świerczewska-Gwiazda, Agnieszka; Tzavaras, Athanasios (arXiv, 2017-12-30)
      In this article we study the principle of energy conservation for the Euler-Korteweg system. We formulate an Onsager-type sufficient regularity condition for weak solutions of the Euler-Korteweg system to conserve the total energy. The result applies to the system of Quantum Hydrodynamics.
    • Contextual Multi-Scale Region Convolutional 3D Network for Activity Detection

      Bai, Yancheng; Xu, Huijuan; Saenko, Kate; Ghanem, Bernard (arXiv, 2018-01-28)
      Activity detection is a fundamental problem in computer vision. Detecting activities of different temporal scales is particularly challenging. In this paper, we propose the contextual multi-scale region convolutional 3D network (CMS-RC3D) for activity detection. To deal with the inherent temporal scale variability of activity instances, the temporal feature pyramid is used to represent activities of different temporal scales. On each level of the temporal feature pyramid, an activity proposal detector and an activity classifier are learned to detect activities of specific temporal scales. Temporal contextual information is fused into activity classifiers for better recognition. More importantly, the entire model at all levels can be trained end-to-end. Our CMS-RC3D detector can deal with activities at all temporal scale ranges with only a single pass through the backbone network. We test our detector on two public activity detection benchmarks, THUMOS14 and ActivityNet. Extensive experiments show that the proposed CMS-RC3D detector outperforms state-of-the-art methods on THUMOS14 by a substantial margin and achieves comparable results on ActivityNet despite using a shallow feature extractor.
    • Continuous gene flow contributes to low global species abundance and distribution of a marine model diatom

      Rastogi, Achal; Deton-Cabanillas, Anne-Flore; Rocha Jimenez Vieira, Fabio; Veluchamy, Alaguraj; Cantrel, Catherine; Wang, Gaohong; Vanormelingen, Pieter; Bowler, Chris; Piganeau, Gwenael; Tirichine, Leila; Hu, Hanhua (Cold Spring Harbor Laboratory, 2017-08-15)
      Unlike terrestrial ecosystems where geographical isolation often leads to a restricted gene flow between species, genetic admixing in aquatic micro-eukaryotes is likely to be frequent. Diatoms have inhabited marine ecosystems since the Mesozoic period and presently constitute one of the major primary producers in the world ocean. They are a highly diversified group of eukaryotic phytoplankton with estimates of up to 200,000 species. For decades, Phaeodactylum tricornutum has been used as a model diatom species to characterize the functional pathways, physiology and evolution of diatoms in general. In the current study, using whole genome sequencing of ten P. tricornutum strains, sampled at broad geospatial and temporal scales, we show a continuous dispersal and genetic admixing between geographically isolated strains. We also describe a very high level of heterozygosity and propose it to be a consequence of frequent ancestral admixture. Our finding that P. tricornutum sequences are plausibly detectable at low but broadly distributed levels in the world ocean further suggests that high admixing between geographically isolated strains may create a significant bottleneck, thus influencing their global abundance and distribution in nature. Finally, in an attempt to understand the functional implications of genetic diversity between different P. tricornutum ecotypes, we show the effects of domestication in inducing changes in the selection pressure on many genes and metabolic pathways. We propose these findings to have significant implications for understanding the genetic structure of diatom populations in nature and provide a framework to assess the genomic underpinnings of their ecological success.
    • Cooperative Charge Pumping and Enhanced Skyrmion Mobility

      Abbout, Adel; Weston, Joseph; Waintal, Xavier; Manchon, Aurelien (arXiv, 2018-04-06)
      The electronic pumping arising from the steady motion of ferromagnetic skyrmions is investigated by solving the time evolution of the Schrödinger equation implemented on a tight-binding model with the statistical physics of the many-body problem. It is shown that the ability of steadily moving skyrmions to pump large charge currents arises from their non-trivial magnetic topology, i.e. the coexistence between spin-motive force and topological Hall effect. Based on an adiabatic scattering theory, we compute the pumped current and demonstrate that it scales with the reflection coefficient of the conduction electrons against the skyrmion. Finally, we propose that such a phenomenon can be exploited in the context of racetrack devices, where the electronic pumping enhances the collective motion of the train of skyrmions.
    • Cooperative HARQ Assisted NOMA Scheme in Large-scale D2D Networks

      Shi, Zheng; Ma, Shaodan; Elsawy, Hesham; Yang, Guanghua; Alouini, Mohamed-Slim (arXiv, 2017-07-13)
      This paper develops an interference-aware design for a cooperative hybrid automatic repeat request (HARQ) assisted non-orthogonal multiple access (NOMA) scheme for large-scale device-to-device (D2D) networks. Specifically, interference-aware rate selection and power allocation are considered to maximize long term average throughput (LTAT) and area spectral efficiency (ASE). The design framework is based on stochastic geometry that jointly accounts for the spatial interference correlation at the NOMA receivers as well as the temporal interference correlation across HARQ transmissions. It is found that ignoring the effect of the aggregate interference, or overlooking the spatial and temporal correlation in interference, highly overestimates the NOMA performance and produces misleading design insights. An interference-oblivious selection of the power and/or transmission rates leads to violating the network outage constraints. To this end, the results demonstrate the effectiveness of NOMA transmission and manifest the importance of the cooperative HARQ to combat the negative effect of the network aggregate interference. For instance, compared to the non-cooperative HARQ assisted NOMA, the proposed scheme can yield an outage probability reduction of 32%. Furthermore, an interference-aware optimal design that maximizes the LTAT given outage constraints leads to a 47% throughput improvement over the HARQ-assisted orthogonal multiple access (OMA) scheme.
    • Coral reef carbonate budgets and ecological drivers in the naturally high temperature and high alkalinity environment of the Red Sea

      Roik, Anna Krystyna; Röthig, Till; Pogoreutz, Claudia; Voolstra, Christian R. (Cold Spring Harbor Laboratory, 2017-10-17)
      The coral structural framework is crucial for maintaining reef ecosystem function and services. In the central Red Sea, a naturally high alkalinity is beneficial to reef growth, but rising water temperatures impair the calcification capacity of reef-building organisms. However, it is currently unknown how beneficial and detrimental factors affect the balance between calcification and erosion, and thereby the overall growth of the reef framework. To provide insight into present-day carbonate budgets and reef growth dynamics in the central Red Sea, we measured in situ net-accretion and net-erosion rates (Gnet) by deployment of limestone blocks and estimated census-based carbonate budgets (Gbudget) in four reef sites along a cross-shelf gradient (25 km). We assessed abiotic variables (i.e., temperature, inorganic nutrients, and carbonate system variables) and biotic drivers (i.e., calcifier and bioeroder abundances). On average, total alkalinity AT (2346-2431 μmol kg$^{-1}$), aragonite saturation state (4.5-5.2 Ωa), and pCO2 (283-315 μatm) were close to estimates of pre-industrial global ocean surface waters. Despite these calcification-favorable carbonate system conditions, Gnet and Gbudget encompassed positive (offshore) and negative net-production (midshore-lagoon and exposed nearshore site) estimates. Notably, Gbudget maxima were lower compared to reef growth from pristine Indian Ocean sites. Yet, a comparison with historical data from the northern Red Sea suggests that overall reef growth in the Red Sea has likely remained similar since 1995. When assessing sites across the shelf gradient, AT correlated well with reef growth rates (ρ = 0.89), while temperature was a weaker, negative correlate (ρ = -0.71). Further, AT explained about 65% of Gbudget in a best fitting distance-based linear model. Interestingly, parrotfish abundances added up to 82% of explained variation, further substantiating recent studies highlighting the importance of parrotfish to reef ecosystem function. Our study provides a baseline that will be particularly useful in assessing future trajectories of reef growth capacities in the Red Sea under continuous ocean warming and acidification.
    • CpG traffic lights are markers of regulatory regions in humans

      Khamis, Abdullah M.; Lioznova, Anna V.; Artemov, Artem V.; Ramensky, Vasily; Bajic, Vladimir B.; Medvedeva, Yulia A. (Cold Spring Harbor Laboratory Press, 2016-12-29)
      DNA methylation is involved in regulation of gene expression. Although modern methods profile DNA methylation at single CpG sites, methylation levels are usually averaged over genomic regions in the downstream analyses. In this study we demonstrate that single CpG methylation can serve as a more accurate predictor of gene expression compared to average promoter / gene body methylation. CpG positions with significant correlation between methylation and expression of a gene nearby (named CpG traffic lights) are evolutionarily conserved and enriched for exact TSS positions and active enhancers. Among all promoter types, CpG traffic lights are especially enriched in poised promoters. Genes that harbor CpG traffic lights are associated with development and signal transduction. Methylation levels of individual CpG traffic lights vary dramatically between cell types, with an increased frequency of intermediate methylation levels, indicating cell population heterogeneity in CpG methylation levels. In line with the concept of inherited stochastic epigenetic variation, methylation of such CpG positions might contribute to transcriptional regulation. Alternatively, one can hypothesize that traffic lights are markers of absent gene expression resulting from inactivation of their regulatory elements. The CpG traffic lights provide a promising insight into mechanisms of enhancer activity and gene regulation linking methylation of single CpG to expression.