• GZMA and RASGRP1 are novel tumor suppressors that counter dissemination of Theileria annulata-transformed macrophages

      Rchiad, Zineb; Haidar, Malak; Ansari, Hifzur Rahman; Tajeri, Shahin; Ben Rached, Fathia; Langsley, Gordon; Pain, Arnab (Cold Spring Harbor Laboratory, 2018-06-05)
      Theileria annulata is a tick-transmitted apicomplexan parasite that infects and transforms bovine leukocytes into disseminating tumors that cause a disease called tropical theileriosis. Using RNA sequencing we identified bovine genes, whose transcription is perturbed during Theileria-induced transformation to define the transcriptional atlas of transformed virulent versus attenuated (dampened dissemination) macrophages and transformed B cells. Dataset comparisons highlighted a small set of novel genes associated with Theileria-transformed leukocyte dissemination and the roles of Granzyme A (GZMA) and RAS guanyl-releasing protein 1 (RASGRP1) confirmed by CRISPR/Cas9-mediated down-regulation of their expression. Knockdown of both GZMA and RASGRP1 in attenuated macrophages led to a regain in their dissemination in Rag2/γC mice confirming in vivo both GZMA and RASGRP1 as novel dissemination suppressors.
    • Physical and transcriptional organisation of the bread wheat intracellular immune receptor repertoire

      Steuernagel, Burkhard; Witek, Kamil; Krattinger, Simon G.; Ramirez-Gonzalez, Ricardo H.; Schoonbeek, Henk-jan; Yu, Guotai; Baggs, Erin; Witek, Agnieszka; Yadav, Inderjit; Krasileva, Ksenia V.; Jones, Jonathan D. G.; Uauy, Cristobal; Keller, Beat; Ridout, Christopher J.; Wulff, Brande; The International Wheat Genome Sequencing Consortium (Cold Spring Harbor Laboratory, 2018-06-05)
      Disease resistance genes encoding intracellular immune receptors of the nucleotide-binding and leucine-rich repeat (NLR) class of proteins detect pathogens by the presence of pathogen effectors. Plant genomes typically contain hundreds of NLR encoding genes. The availability of the hexaploid wheat cultivar Chinese Spring reference genome now allows a detailed study of its NLR complement. However, low NLR expression as well as high intra-family sequence homology hinders their accurate gene annotation. Here we developed NLR-Annotator for in silico NLR identification independent of transcript support. Although developed for wheat, we demonstrate the universal applicability of NLR-Annotator across diverse plant taxa. Applying our tool to wheat and combining it with a transcript-validated subset of genes from the reference gene annotation, we characterized the structure, phylogeny and expression profile of the NLR gene family. We detected 3,400 full-length NLR loci of which 1,540 were confirmed as complete genes. NLRs with integrated domains mostly group in specific sub-clades. Members of another subclade predominantly locate in close physical proximity to NLRs carrying integrated domains suggesting a paired helper-function. Most NLRs (88%) display low basal expression (in the lower 10 percentile of transcripts), which may be tissue-specific and/or induced by biotic stress. As a case study for applying our tool to the positional cloning of resistance genes, we estimated the number of NLR genes within the intervals of mapped rust resistance genes. Our study will support the identification of functional resistance genes in wheat to accelerate the breeding and engineering of disease resistant varieties.
    • Drug repurposing through joint learning on knowledge graphs and literature

      AlShahrani, Mona; Hoehndorf, Robert (Cold Spring Harbor Laboratory, 2018-08-06)
      Drug repurposing is the problem of finding new uses for known drugs, and may either involve finding a new protein target or a new indication for a known mechanism. Several computational methods for drug repurposing exist, and many of these methods rely on combinations of different sources of information, extract hand-crafted features and use a computational model to predict targets or indications for a drug. One of the distinguishing features between different drug repurposing systems is the selection of features. Recently, a set of novel machine learning methods have become available that can efficiently learn features from datasets, and these methods can be applied, among others, to text and structured data in knowledge graphs. We developed a novel method that combines information in literature and structured databases, and applies feature learning to generate vector space embeddings. We apply our method to the identification of drug targets and indications for known drugs based on heterogeneous information about drugs, target proteins, and diseases. We demonstrate that our method is able to combine complementary information from both structured databases and from literature, and we show that our method can compete with well-established methods for drug repurposing. Our approach is generic and can be applied to other areas in which multi-modal information is used to build predictive models.
    • A fast and cost-effective microsampling protocol incorporating reduced animal usage for time-series transcriptomics in rodent malaria parasites

      Ramaprasad, Abhinay; Subudhi, Amit Kumar; Culleton, Richard; Pain, Arnab (Cold Spring Harbor Laboratory, 2018-06-21)
      The transcriptional regulation occurring in malaria parasites during the clinically important life stages within host erythrocytes can be studied in vivo with rodent malaria parasites propagated in mice. Time-series transcriptome profiling commonly involves the euthanasia of groups of mice at specific time points followed by the extraction of parasite RNA from whole blood samples. Current methodologies for parasite RNA extraction involve several steps and when multiple time points are profiled, these protocols are laborious, time consuming, and require the euthanisation of large cohorts of mice. We designed a simplified protocol for parasite RNA extraction from blood volumes as low as 20 microliters (microsamples), serially bled from mice via tail snips and directly lysed with TRIzol reagent. Gene expression data derived from microsampling using RNA-seq were closely matched to those derived from larger volumes of leucocyte-depleted and saponin-treated blood obtained from euthanized mice and also tightly correlated between biological replicates. Transcriptome profiling of microsamples taken at different time points during the intra-erythrocytic developmental cycle of the rodent malaria parasite Plasmodium vinckei revealed the transcriptional cascade commonly observed in malaria parasites. Microsampling is a quick, robust and cost-efficient approach to sample collection for in vivo time-series transcriptomic studies in rodent malaria parasites.
    • Communicating Using Spatial Mode Multiplexing: Potentials, Challenges and Perspectives

      Trichili, Abderrahmen; Park, Ki-Hong; Zghal, Mouard; Ooi, Boon S.; Alouini, Mohamed-Slim (2018-08)
      Time, polarization, and wavelength multiplexing schemes have been used to satisfy the growing need for transmission capacity. Using space as a new dimension for communication systems has recently been suggested as a versatile technique to address future bandwidth issues. We review the potential of harnessing space as an additional degree of freedom for communication applications including free space optics, optical fiber installation, underwater wireless optical links, on-chip interconnects, data center indoor connections, radio frequency and acoustic communications. We focus on the orbital angular momentum (OAM) modes and identify the challenges related to each of the applications of spatial modes, and of OAM modes in particular, in communication. Finally, we discuss the perspectives of this emerging technology.
    • Modeling of Viral Aerosol Transmission and Detection

      Khalid, Maryam; Amin, Osama; Ahmed, Sajid; Alouini, Mohamed-Slim (2018)
      The objective of this work is to investigate the spread mechanism of diseases in the atmosphere as an engineering problem. Among the viral transmission mechanisms that do not include physical contact, aerosol transmission is the most significant mode of transmission where virus-laden droplets are carried over long distances by wind. In this work, we focus on aerosol transmission of viruses and introduce the idea of viewing virus transmission through aerosols and their transport as a molecular communication problem, where one has no control over the transmission source but a robust receiver can be designed using nano-biosensors. To investigate this idea, a complete system is presented and an end-to-end mathematical model for the aerosol transmission channel is derived under certain constraints and boundary conditions. In addition to the transmitter and channel, a receiver architecture composed of an air sampler and a Silicon Nanowire field effect transistor is also discussed. Furthermore, a detection problem is formulated for which the maximum likelihood decision rule and the corresponding missed detection probability are discussed. At the end, simulation results are presented to investigate the parameters that affect the performance and justify the feasibility of the proposed setup in related applications.
    • Spatial Poisson Processes for Fatigue Crack Initiation

      Babuska, Ivo; Sawlan, Zaid A; Scavino, Marco; Szabó, Barna; Tempone, Raul (2018-05-09)
    • OligoPVP: Phenotype-driven analysis of individual genomic information to prioritize oligogenic disease variants

      Boudellioua, Imene; Kulmanov, Maxat; Schofield, Paul N; Gkoutos, Georgios V; Hoehndorf, Robert (Cold Spring Harbor Laboratory, 2018-05-02)
      Purpose: An increasing number of Mendelian disorders have been identified for which two or more variants in one or more genes are required to cause the disease, or significantly modify its severity or phenotype. It is difficult to discover such interactions using existing approaches. The purpose of our work is to develop and evaluate a system that can identify combinations of variants underlying oligogenic diseases in individual whole exome or whole genome sequences. Methods: Information that links patient phenotypes to databases of gene-phenotype associations observed in clinical research can provide useful information and improve variant prioritization for Mendelian diseases. Additionally, background knowledge about interactions between genes can be utilized to guide and restrict the selection of candidate disease modules. Results: We developed OligoPVP, an algorithm that can be used to identify variants in oligogenic diseases and their interactions, using whole exome or whole genome sequences together with patient phenotypes as input. We demonstrate that OligoPVP has significantly improved performance when compared to state of the art pathogenicity detection methods. Conclusions: Our results show that OligoPVP can efficiently detect oligogenic interactions using a phenotype-driven approach and identify etiologically important variants in whole genomes.
    • DeepPVP: phenotype-based prioritization of causative variants using deep learning

      Boudellioua, Imene; Kulmanov, Maxat; Schofield, Paul N; Gkoutos, Georgios V; Hoehndorf, Robert (Cold Spring Harbor Laboratory, 2018-05-02)
      Background: Prioritization of variants in personal genomic data is a major challenge. Recently, computational methods that rely on comparing phenotype similarity have been shown to be useful to identify causative variants. In these methods, pathogenicity prediction is combined with a semantic similarity measure to prioritize not only variants that are likely to be dysfunctional but those that are likely involved in the pathogenesis of a patient's phenotype. Results: We have developed DeepPVP, a variant prioritization method that combines automated inference with deep neural networks to identify the likely causative variants in whole exome or whole genome sequence data. We demonstrate that DeepPVP performs significantly better than existing methods, including phenotype-based methods that use similar features. DeepPVP is freely available at https://github.com/bio-ontology-research-group/phenomenet-vp Conclusions: DeepPVP further improves on existing variant prioritization methods in terms of both speed and accuracy.
    • Semantic Disease Gene Embeddings (SmuDGE): phenotype-based disease gene prioritization without phenotypes

      AlShahrani, Mona; Hoehndorf, Robert (Cold Spring Harbor Laboratory, 2018-04-30)
      In the past years, several methods have been developed to incorporate information about phenotypes into computational disease gene prioritization methods. These methods commonly compute the similarity between a disease's (or patient's) phenotypes and a database of gene-to-phenotype associations to find the phenotypically most similar match. A key limitation of these methods is their reliance on knowledge about phenotypes associated with particular genes, which is highly incomplete in humans as well as in many model organisms such as the mouse. Results: We developed SmuDGE, a method that uses feature learning to generate vector-based representations of phenotypes associated with an entity. SmuDGE can be used as a trainable semantic similarity measure to compare two sets of phenotypes (such as between a disease and gene, or a disease and patient). More importantly, SmuDGE can generate phenotype representations for entities that are only indirectly associated with phenotypes through an interaction network; for this purpose, SmuDGE exploits background knowledge in interaction networks comprising multiple types of interactions. We demonstrate that SmuDGE can match or outperform semantic similarity in phenotype-based disease gene prioritization, and furthermore significantly extends the coverage of phenotype-based methods to all genes in a connected interaction network.
    • A Matrix Splitting Method for Composite Function Minimization

      Yuan, Ganzhao; Zheng, Wei-Shi; Ghanem, Bernard (arXiv, 2016-12-07)
      Composite function minimization captures a wide spectrum of applications in both computer vision and machine learning. It includes bound constrained optimization and cardinality regularized optimization as special cases. This paper proposes and analyzes a new Matrix Splitting Method (MSM) for minimizing composite functions. It can be viewed as a generalization of the classical Gauss-Seidel method and the Successive Over-Relaxation method for solving linear systems in the literature. Incorporating a new Gaussian elimination procedure, the matrix splitting method achieves state-of-the-art performance. For convex problems, we establish the global convergence, convergence rate, and iteration complexity of MSM, while for non-convex problems, we prove its global convergence. Finally, we validate the performance of our matrix splitting method on two particular applications: nonnegative matrix factorization and cardinality regularized sparse coding. Extensive experiments show that our method outperforms existing composite function minimization techniques in terms of both efficiency and efficacy.
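      For orientation, the classical Gauss-Seidel and Successive Over-Relaxation iterations that the abstract names as special cases can be sketched as follows. This is only the textbook linear-system solver, not the paper's MSM; the function name `sor_solve`, the relaxation parameter `omega`, and the 2x2 test system are illustrative assumptions.

```python
def sor_solve(A, b, omega=1.0, iters=200):
    """Classical SOR sweep for A x = b; omega = 1.0 gives Gauss-Seidel.

    Textbook method only (assumed illustration, not the paper's MSM).
    A is a list of rows with nonzero diagonal entries.
    """
    n = len(b)
    x = [0.0] * n
    for _ in range(iters):
        for i in range(n):
            # Use the newest available values of the other coordinates.
            sigma = sum(A[i][j] * x[j] for j in range(n) if j != i)
            gauss_seidel_value = (b[i] - sigma) / A[i][i]
            # Blend the old value and the Gauss-Seidel update.
            x[i] = (1.0 - omega) * x[i] + omega * gauss_seidel_value
    return x


# Small symmetric positive definite system (made-up example data).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x_gs = sor_solve(A, b)              # Gauss-Seidel
x_sor = sor_solve(A, b, omega=1.2)  # over-relaxed variant
```

      For a symmetric positive definite matrix, both variants converge for any `omega` strictly between 0 and 2, so the two results above should agree up to rounding.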
    • Modeling soil organic carbon with Quantile Regression: Dissecting predictors' effects on carbon stocks

      Lombardo, Luigi; Saia, Sergio; Schillaci, Calogero; Mai, Paul Martin; Huser, Raphaël (arXiv, 2017-08-13)
      Soil Organic Carbon (SOC) estimation is crucial to manage both natural and anthropic ecosystems and has recently been put under the magnifying glass after the Paris agreement 2016 due to its relationship with greenhouse gas. Statistical applications have dominated the SOC stock mapping at regional scale so far. However, the community has hardly ever attempted to implement Quantile Regression (QR) to spatially predict the SOC distribution. In this contribution, we test QR to estimate SOC stock (0-30 $cm$ depth) in the agricultural areas of a highly variable semi-arid region (Sicily, Italy, around 25,000 $km^2$) by using topographic and remotely sensed predictors. We also compare the results with those from available SOC stock measurements. The QR models produced robust performances and allowed the dominant effects among the predictors to be recognized with respect to the considered quantile. This information, currently lacking, suggests that QR can discern predictor influences on SOC stock at specific sub-domains of each predictor. In this work, the predictive map generated at the median shows lower errors than those of the Joint Research Centre and International Soil Reference and Information Centre benchmarks. The results suggest the use of QR as a comprehensive and effective method to map SOC using legacy data in agro-ecosystems. The R code scripted in this study for QR is included.
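      As background on why a quantile model can separate effects at different parts of the distribution, the sketch below shows the standard pinball (check) loss that quantile regression minimizes. It is written in Python rather than the study's R code, which is not reproduced here; the function names and the five data values are made up for illustration.

```python
def pinball_loss(y_true, y_pred, tau):
    """Mean pinball (check) loss for quantile level tau in (0, 1)."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        r = y - q
        # Under-predictions cost tau per unit, over-predictions 1 - tau.
        total += tau * r if r >= 0 else (tau - 1.0) * r
    return total / len(y_true)


def best_constant(y, tau):
    """Pick the observed value that minimizes the pinball loss."""
    return min(y, key=lambda q: pinball_loss(y, [q] * len(y), tau))


# Hypothetical values with one large outlier (not data from the study).
stocks = [1.0, 2.0, 3.0, 4.0, 100.0]
median_fit = best_constant(stocks, 0.5)  # the outlier does not move it
upper_fit = best_constant(stocks, 0.9)   # a high quantile tracks the tail
```

      With `tau = 0.5` the loss reduces to half the mean absolute error, so the best constant is the sample median; raising `tau` shifts the fit toward the upper tail.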
    • Parameters and Fractional Differentiation Orders Estimation for Linear Continuous-Time Non-Commensurate Fractional Order Systems

      Belkhatir, Zehor; Laleg-Kirati, Taous-Meriem (Submitted to Elsevier, 2017-05-31)
      This paper proposes a two-stage estimation algorithm to solve the problem of joint estimation of the parameters and the fractional differentiation orders of a linear continuous-time fractional system with non-commensurate orders. The proposed algorithm combines the modulating functions and the first-order Newton methods. Sufficient conditions ensuring the convergence of the method are provided. An error analysis in the discrete case is performed. Moreover, the method is extended to the joint estimation of smooth unknown input and fractional differentiation orders. The performance of the proposed approach is illustrated with different numerical examples. Furthermore, a potential application of the algorithm is proposed which consists in the estimation of the differentiation orders of a fractional neurovascular model along with the neural activity considered as input for this model.
    • Error Probability Analysis of Hardware Impaired Systems with Asymmetric Transmission

      Javed, Sidrah; Amin, Osama; Ikki, Salama S.; Alouini, Mohamed-Slim (2018-04-26)
      Error probability study of the hardware impaired (HWI) systems highly depends on the adopted model. Recent models have proved that the aggregate noise is equivalent to improper Gaussian signals. Therefore, considering the distinct noise nature and self-interfering (SI) signals, an optimal maximum likelihood (ML) receiver is derived. This renders the conventional minimum Euclidean distance (MED) receiver sub-optimal because it is based on the assumptions of ideal hardware transceivers and proper Gaussian noise in communication systems. Next, the average error probability performance of the proposed optimal ML receiver is analyzed and tight bounds and approximations are derived for various adopted systems including transmitter and receiver I/Q imbalanced systems with or without transmitter distortions as well as transmitter or receiver only impaired systems. Motivated by recent studies that shed light on the benefit of improper Gaussian signaling in mitigating the HWIs, asymmetric quadrature amplitude modulation or phase shift keying is optimized and adapted for transmission. Finally, different numerical and simulation results are presented to support the superiority of the proposed ML receiver over the MED receiver, the tightness of the derived bounds and the effectiveness of asymmetric transmission in dampening HWIs and improving overall system performance.
    • Weighted Low-Rank Approximation of Matrices and Background Modeling

      Dutta, Aritra; Li, Xin; Richtarik, Peter (arXiv, 2018-04-15)
      We primarily study a special weighted low-rank approximation of matrices and then apply it to solve the background modeling problem. We propose two algorithms for this purpose: one operates in the batch mode on the entire data and the other one operates in the batch-incremental mode on the data, naturally captures more background variations, and is computationally more effective. Moreover, we propose a robust technique that learns the background frame indices from the data and does not require any training frames. We demonstrate through extensive experiments that by inserting a simple weight in the Frobenius norm, it can be made robust to the outliers similar to the $\ell_1$ norm. Our methods match or outperform several state-of-the-art online and batch background modeling methods in virtually all quantitative and qualitative measures.
    • Using Multi-Spectral UAV Imagery to Extract Tree Crop Structural Properties and Assess Pruning Effects

      Johansen, Kasper; Raharjo, Tri; McCabe, Matthew (MDPI AG, 2018-04-18)
      Unmanned aerial vehicles (UAV) provide an unprecedented capacity to monitor the development and dynamics of tree growth and structure through time. It is generally thought that the pruning of tree crops encourages new growth, has a positive effect on fruiting, makes fruit-picking easier, and may increase yield, as it increases light interception and tree crown surface area. To establish the response of pruning in an orchard of lychee trees, an assessment of changes in tree structure, i.e. tree crown perimeter, width, height, area and Plant Projective Cover (PPC), was undertaken using multi-spectral UAV imagery collected before and after a pruning event. While tree crown perimeter, width and area could be derived directly from the delineated tree crowns, height was estimated from a produced canopy height model and PPC was most accurately predicted based on the NIR band. Pre- and post-pruning results showed significant differences in all measured tree structural parameters, including an average decrease in tree crown perimeter of 1.94 m, tree crown width of 0.57 m, tree crown height of 0.62 m, tree crown area of 3.5 m², and PPC of 14.8%. In order to provide guidance on data collection protocols for orchard management, the impact of flying height variations was also examined, offering some insight into the influence of scale and the scalability of this UAV based approach for larger orchards. The different flying heights (i.e. 30, 50 and 70 m) produced similar measurements of tree crown width and PPC, while tree crown perimeter, area and height measurements decreased with increasing flying height. Overall, these results illustrate that routine collection of multi-spectral UAV imagery can provide a means of assessing pruning effects on changes in tree structure in commercial orchards, and highlight the importance of collecting imagery with consistent flight configurations, as varying flying heights may cause changes to tree structural measurements.
    • Numerical approximation of a binary fluid-surfactant phase field model of two-phase incompressible flow

      Zhu, Guangpu; Kou, Jisheng; Sun, Shuyu; Yao, Jun; Li, Aifen (arXiv, 2018-04-17)
      In this paper, we consider the numerical approximation of a binary fluid-surfactant phase field model of two-phase incompressible flow. The nonlinearly coupled model consists of two Cahn-Hilliard type equations and incompressible Navier-Stokes equations. Using the Invariant Energy Quadratization (IEQ) approach, the governing system is transformed into an equivalent form, which allows the nonlinear potentials to be treated efficiently and semi-explicitly. We construct first- and second-order time-marching schemes, which are extremely efficient and easy to implement, for the transformed governing system. At each time step, the schemes involve solving a sequence of linear elliptic equations, and computations of phase variables, velocity and pressure are totally decoupled. We further establish a rigorous proof of unconditional energy stability for the semi-implicit schemes. Numerical results in both two and three dimensions are obtained, which demonstrate that the proposed schemes are accurate, efficient and unconditionally energy stable. Using our schemes, we investigate the effect of surfactants on droplet deformation and collision under a shear flow. The increase of surfactant concentration can enhance droplet deformation and inhibit droplet coalescence.
    • Current Controlled Magnetization Switching in Cylindrical Nanowires for High-Density 3D Memory Applications

      Mohammed, Hanan; Corte-León, Hector; Ivanov, Yurii P.; Lopatin, Sergei; Moreno, Julian A.; Chuvilin, Andrey; Salimath, Akshaykumar; Manchon, Aurelien; Kazakova, Olga; Kosel, Jürgen (arXiv, 2018-04-18)
      A next-generation memory device utilizing a three-dimensional nanowire system requires the reliable control of domain wall motion. In this letter, domain walls are studied in cylindrical nanowires consisting of alternating segments of cobalt and nickel. The material interfaces, acting as domain wall pinning sites, are utilized in combination with current pulses to control the position of the domain wall, which is monitored using magnetoresistance measurements. Magnetic force microscopy results further confirm the occurrence of current assisted domain wall depinning. Data bits are therefore shifted along the nanowire by sequentially pinning and depinning a domain wall between successive interfaces, a requirement for race-track type memory devices. We demonstrate that the direction, amplitude and duration of the applied current pulses determine the propagation of the domain wall across pinning sites. These results demonstrate a multi-bit cylindrical nanowire device, utilizing current assisted data manipulation. The prospect of sequential pinning and depinning in these nanowires allows the bit density to increase by several Tbs, depending on the number of segments within these nanowires.
    • Existence of weak solutions to first-order stationary mean-field games with Dirichlet conditions

      Ferreira, Rita; Gomes, Diogo A.; Tada, Teruo (arXiv, 2018-04-19)
      In this paper, we study first-order stationary monotone mean-field games (MFGs) with Dirichlet boundary conditions. While for Hamilton-Jacobi equations Dirichlet conditions may not be satisfied, here, we establish the existence of solutions of MFGs that satisfy those conditions. To construct these solutions, we introduce a monotone regularized problem. Applying Schaefer's fixed-point theorem and using the monotonicity of the MFG, we verify that there exists a unique weak solution to the regularized problem. Finally, we take the limit of the solutions of the regularized problem and, using Minty's method, show the existence of weak solutions to the original MFG.
    • SoccerNet: A Scalable Dataset for Action Spotting in Soccer Videos

      Giancola, Silvio; Amine, Mohieddine; Dghaily, Tarek; Ghanem, Bernard (arXiv, 2018-04-12)
      In this paper, we introduce SoccerNet, a benchmark for action spotting in soccer videos. The dataset is composed of 500 complete soccer games from six main European leagues, covering three seasons from 2014 to 2017 and a total duration of 764 hours. A total of 6,637 temporal annotations are automatically parsed from online match reports at a one minute resolution for three main classes of events (Goal, Yellow/Red Card, and Substitution). As such, the dataset is easily scalable. These annotations are manually refined to a one second resolution by anchoring them at a single timestamp following well-defined soccer rules. With an average of one event every 6.9 minutes, this dataset focuses on the problem of localizing very sparse events within long videos. We define the task of spotting as finding the anchors of soccer events in a video. Making use of recent developments in the realm of generic action recognition and detection in video, we provide strong baselines for detecting soccer events. We show that our best model for classifying temporal segments of length one minute reaches a mean Average Precision (mAP) of 67.8%. For the spotting task, our baseline reaches an Average-mAP of 49.7% for tolerances $\delta$ ranging from 5 to 60 seconds.
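      The sparsity figure quoted in the abstract can be checked directly from the stated totals. The short sketch below uses only numbers given above (500 games, 764 hours, 6,637 annotations); the variable names are illustrative.

```python
# Totals as stated in the abstract.
total_hours = 764
num_events = 6637
num_games = 500

# Average spacing between annotated events, in minutes.
minutes_per_event = total_hours * 60 / num_events

# Average number of annotated events per game.
events_per_game = num_events / num_games

print(f"{minutes_per_event:.1f} minutes between events")
print(f"{events_per_game:.1f} events per game")
```

      The first value rounds to the 6.9 minutes reported, which corresponds to roughly 13 annotated events per game.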