Recent Submissions

  • RepAHR: an improved approach for de novo repeat identification by assembly of the high-frequency reads.

    Liao, Xingyu; Gao, Xin; Zhang, Xiankai; Wu, Fang-Xiang; Wang, Jianxin (BMC bioinformatics, Springer Science and Business Media LLC, 2020-10-20) [Article]
    BACKGROUND: Repetitive sequences account for a large proportion of eukaryotic genomes. Identification of repetitive sequences plays a significant role in many applications, such as structural variation detection and genome assembly. Many existing de novo repeat identification pipelines or tools make use of assembly of the high-frequency k-mers to obtain repeats. However, a certain degree of sequence coverage is required for assemblers to get the desired assemblies. On the other hand, assemblers cut the reads into shorter k-mers for assembly, which may destroy the structure of the repetitive regions. For the above reasons, it is difficult to obtain complete and accurate repetitive regions in the genome by using existing tools. RESULTS: In this study, we present a new method called RepAHR for de novo repeat identification by assembly of the high-frequency reads. Firstly, RepAHR scans next-generation sequencing (NGS) reads to find the high-frequency k-mers. Secondly, RepAHR filters the high-frequency reads from whole NGS reads according to certain rules based on the high-frequency k-mers. Finally, the high-frequency reads are assembled to generate repeats by using SPAdes, which is considered an outstanding genome assembler for NGS sequences. CONCLUSIONS: We test RepAHR on five data sets, and the experimental results show that RepAHR outperforms RepARK and REPdenovo for detecting repeats in terms of N50, reference alignment ratio, coverage ratio of reference, mask ratio of Repbase and some other metrics.
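    The first two steps described in this abstract (finding high-frequency k-mers, then keeping the reads dominated by them) can be illustrated with a minimal sketch. This is not the RepAHR implementation: the function names, the `min_count` threshold, and the `min_fraction` rule are hypothetical stand-ins for the "certain rules" the abstract mentions, and the final SPAdes assembly step is not shown.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer occurring in the reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def high_frequency_reads(reads, k, min_count, min_fraction):
    """Keep reads whose k-mers are mostly high-frequency.

    Assumed rule (hypothetical, not taken from the paper): a read is kept
    when at least `min_fraction` of its k-mers occur `min_count` or more
    times across all reads.
    """
    counts = count_kmers(reads, k)
    frequent = {kmer for kmer, n in counts.items() if n >= min_count}
    selected = []
    for read in reads:
        total = len(read) - k + 1
        if total <= 0:
            continue  # read is shorter than k, so it has no k-mers
        hits = sum(1 for i in range(total) if read[i:i + k] in frequent)
        if hits / total >= min_fraction:
            selected.append(read)
    return selected


reads = ["ACGTACGT", "CGTACGTA", "TTGGCCAA", "ACGTACGA"]
print(high_frequency_reads(reads, k=4, min_count=3, min_fraction=0.5))
# → ['ACGTACGT', 'CGTACGTA', 'ACGTACGA']
```

    In this toy input the read "TTGGCCAA" shares no frequent k-mer with the others and is dropped; the remaining reads would be the input to the assembler.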
  • Understanding High-Salt and Cold Adaptation of a Polyextremophilic Enzyme

    Karan, Ram; Mathew, Sam; Muhammad, Reyhan; Bautista, Didier B.; Vogler, Malvina M.; Eppinger, Jörg; Oliva, Romina; Cavallo, Luigi; Arold, Stefan T.; Rueping, Magnus (Microorganisms, MDPI AG, 2020-10-16) [Article]
    The haloarchaeon Halorubrum lacusprofundi is among the few polyextremophilic organisms capable of surviving in one of the most extreme aquatic environments on Earth, the Deep Lake of Antarctica (−18 °C to +11.5 °C and 21–28%, w/v salt content). Hence, H. lacusprofundi has been proposed as a model for biotechnology and astrobiology to investigate potential life beyond Earth. To understand the mechanisms that allow proteins to adapt to both salinity and cold, we structurally (including X-ray crystallography and molecular dynamics simulations) and functionally characterized the β-galactosidase from H. lacusprofundi (hla_bga). Recombinant hla_bga (produced in Haloferax volcanii) revealed exceptional stability, tolerating up to 4 M NaCl and up to 20% (v/v) of organic solvents. Despite being cold-adapted, hla_bga was also stable up to 60 °C. Structural analysis showed that hla_bga combined increased surface acidity (associated with halophily) with increased structural flexibility, fine-tuned on a residue level, for sustaining activity at low temperatures. The resulting blend enhanced structural flexibility at low temperatures but also limited protein movements at higher temperatures relative to mesophilic homologs. Collectively, these observations help in understanding the molecular basis of a dual psychrophilic and halophilic adaptation and suggest that such enzymes may be intrinsically stable and functional over an exceptionally large temperature range.
  • Interleukin-26 activates macrophages and facilitates killing of Mycobacterium tuberculosis

    Hawerkamp, Heike C.; van Geelen, Lasse; Korte, Jan; Di Domizio, Jeremy; Swidergall, Marc; Momin, Afaque Ahmad Imtiyaz; Guzmán-Vega, Francisco J.; Arold, Stefan T.; Ernst, Joachim; Gilliet, Michel; Kalscheuer, Rainer; Homey, Bernhard; Meller, Stephan (Scientific Reports, Springer Science and Business Media LLC, 2020-10-14) [Article]
    Tuberculosis-causing Mycobacterium tuberculosis (Mtb) is transmitted via airborne droplets followed by a primary infection of macrophages and dendritic cells. During the activation of host defence mechanisms, neutrophils, T helper 1 (TH1) and TH17 cells are also recruited to the site of infection. The TH17 cell-derived interleukin (IL)-17 in turn induces the cathelicidin LL37 which shows direct antimycobacterial effects. Here, we investigated the role of IL-26, a TH1- and TH17-associated cytokine that exhibits antimicrobial activity. We found that both IL-26 mRNA and protein are strongly increased in tuberculous lymph nodes. Furthermore, IL-26 is able to directly kill Mtb and decrease the infection rate in macrophages. Binding of IL-26 to lipoarabinomannan might be one important mechanism in extracellular killing of Mtb. Macrophages and dendritic cells respond to IL-26 with secretion of tumor necrosis factor (TNF)-α and chemokines such as CCL20, CXCL2 and CXCL8. In dendritic cells but not in macrophages, cytokine induction by IL-26 is partly mediated via Toll-like receptor (TLR) 2. Taken together, IL-26 strengthens the defense against Mtb in two ways: firstly, directly due to its antimycobacterial properties, and secondly, indirectly by activating innate immune mechanisms.
  • Semantic similarity and machine learning with ontologies.

    Kulmanov, Maxat; Smaili, Fatima Z.; Gao, Xin; Hoehndorf, Robert (Briefings in bioinformatics, Oxford University Press (OUP), 2020-10-13) [Article]
    Ontologies have long been employed in the life sciences to formally represent and reason over domain knowledge and they are employed in almost every major biological database. Recently, ontologies have increasingly been used to provide background knowledge in similarity-based analysis and machine learning models. The methods employed to combine ontologies and machine learning are still novel and actively being developed. We provide an overview of the methods that use ontologies to compute similarity and incorporate them in machine learning methods; in particular, we outline how semantic similarity measures and ontology embeddings can exploit the background knowledge in ontologies and how ontologies can provide constraints that improve machine learning models. The methods and experiments we describe are available as a set of executable notebooks, and we also provide a set of slides and additional resources at
  • A unified linear convergence analysis of k-SVD

    Xu, Zhiqiang; Ke, Yiping; Cao, Xin; Zhou, Chunlai; Wei, Pengfei; Gao, Xin (Memetic Computing, Springer Science and Business Media LLC, 2020-10-12) [Article]
    Eigenvector computation, e.g., k-SVD for finding top-k singular subspaces, is often of central importance to many scientific and engineering tasks. There has been resurgent interest recently in analyzing relevant methods in terms of singular value gap dependence. Particularly, when the gap vanishes, the convergence of k-SVD is considered to be capped by a gap-free sub-linear rate. We argue in this work both theoretically and empirically that this is not necessarily the case, refreshing our understanding on this significant problem. Specifically, we leverage the recently proposed structured gap in a careful analysis to establish a unified linear convergence of k-SVD to one of the ground-truth solutions, regardless of what target matrix and how large target rank k are given. Theoretical results are evaluated and verified by experiments on synthetic or real data.
  • Impact of data preprocessing on cell-type clustering based on single-cell RNA-seq data.

    Wang, Chunxiang; Gao, Xin; Liu, Juntao (BMC bioinformatics, Springer Science and Business Media LLC, 2020-10-08) [Article]
    BACKGROUND: Advances in single-cell RNA-seq technology have led to great opportunities for the quantitative characterization of cell types, and many clustering algorithms have been developed based on single-cell gene expression. However, we found that different data preprocessing methods show quite different effects on clustering algorithms. Moreover, there is no specific preprocessing method that is applicable to all clustering algorithms, and even for the same clustering algorithm, the best preprocessing method depends on the input data. RESULTS: We designed a graph-based algorithm, SC3-e, specifically for discriminating the best data preprocessing method for SC3, which is currently the most widely used clustering algorithm for single cell clustering. When tested on eight frequently used single-cell RNA-seq data sets, SC3-e always accurately selects the best data preprocessing method for SC3 and therefore greatly enhances the clustering performance of SC3. CONCLUSION: The SC3-e algorithm is practically powerful for discriminating the best data preprocessing method, and therefore largely enhances the performance of cell-type clustering of SC3. It is expected to play a crucial role in the related studies of single-cell clustering, such as the studies of human complex diseases and discoveries of new cell types.
  • Regulation of kinase activity by combined action of juxtamembrane and C-terminal regions of receptors

    Lin, Chi-Chuan; Wieteska, Lukasz; Poncet-Montange, Guillaume; Suen, Kin Man; Arold, Stefan T.; Ahmed, Zamal; Ladbury, John Edward (Cold Spring Harbor Laboratory, 2020-10-01) [Preprint]
    Despite the kinetically favorable, ATP-rich intracellular environment, the mechanism by which receptor tyrosine kinases (RTKs) repress activation prior to extracellular stimulation is poorly understood. RTKs are activated through a precise sequence of phosphorylation reactions starting with a tyrosine on the activation loop (A-loop) of the intracellular kinase domain (KD). This forms an essential mono-phosphorylated active intermediate state on the path to further phosphorylation of the receptor. We show that this state is subjected to stringent control imposed by the peripheral juxtamembrane (JM) and C-terminal tail (CT) regions. This entails interplay between the intermolecular interaction of JM with KD, which stabilizes the asymmetric active KD dimer, and the opposing intramolecular binding of CT to KD. A further control step is provided by the previously unobserved direct binding between JM and CT. Mutations in JM and CT sites that perturb regulation are found in numerous pathologies, revealing novel sites for potential pharmaceutical intervention.
  • Atherosclerosis Linked to Aberrant Amino Acid Metabolism and Immunosuppressive Amino Acid Catabolizing Enzymes

    Zaric, Bozidarka L.; Radovanovic, Jelena N.; Gluvic, Zoran; Stewart, Alan J.; Essack, Magbubah; Motwalli, Olaa; Gojobori, Takashi; Isenovic, Esma R. (Frontiers in Immunology, Frontiers Media SA, 2020-09-28) [Article]
    Cardiovascular disease is the leading global health concern and responsible for more deaths worldwide than any other type of disorder. Atherosclerosis is a chronic inflammatory disease in the arterial wall, which underpins several types of cardiovascular disease. It has emerged that a strong relationship exists between alterations in amino acid (AA) metabolism and the development of atherosclerosis. Recent studies have reported positive correlations between levels of branched-chain amino acids (BCAAs) such as leucine, valine, and isoleucine in plasma and the occurrence of metabolic disturbances. Elevated serum levels of BCAAs indicate a high cardiometabolic risk. Thus, BCAAs may also impact atherosclerosis prevention and offer a novel therapeutic strategy for specific individuals at risk of coronary events. The metabolism of AAs, such as L-arginine, homoarginine, and L-tryptophan, is recognized as a critical regulator of vascular homeostasis. Dietary intake of homoarginine, taurine, and glycine can improve atherosclerosis by endothelium remodeling. Available data also suggest that the regulation of AA metabolism by indoleamine 2,3-dioxygenase (IDO) and arginases 1 and 2 is mediated through various immunological signals and that immunosuppressive AA metabolizing enzymes are promising therapeutic targets against atherosclerosis. Further clinical studies and basic studies that make use of animal models are required. Here we review recent data examining links between AA metabolism and the development of atherosclerosis.
  • Haloferax profundi sp. nov. and Haloferax marisrubri sp. nov., Isolated from the Discovery Deep Brine-Seawater Interface in the Red Sea

    Zhang, Guishan; Dong, Xiaoyan; Sun, Yingjiao; Antunes, Andre; Hikmawan, Tyas I.; Haroon, Mohamed; Wang, Junru; Stingl, Ulrich (Microorganisms, MDPI AG, 2020-09-28) [Article]
    Two extremely halophilic archaeal strains, designated SB29T and SB3T, were isolated from the brine-seawater interface of Discovery Deep in the Red Sea. Cells of both strains were pleomorphic (irregular polyhedrals, ovals, and rods) and stained Gram-negative; colonies were pigmented pink. The sequence similarity of the 16S rRNA gene of strain SB29T with that of its most closely related validly described species (Hfx. sulfurifontis DSM 16227T) and that of strain SB3T with its closest validly described relative (Hfx. denitrificans ATCC 35960T) was 98.1% and 98.6%, respectively. The incomplete draft genomes of SB29T and SB3T are 3,871,125 bp and 3,904,985 bp in size, respectively, and their DNA G + C contents are 60.75% and 65.64%, respectively. The highest ANI values between the genomes of SB29T and SB3T and the most closely related genomes in GenBank were determined as 82.6% (Hfx. sulfurifontis ATCC BAA-897T, GenBank accession no. GCA_000337835.1) and 92.6% (Haloferax denitrificans ATCC 35960T, GenBank accession no. GCA_000337795.1), respectively. These data indicate that the two new isolates cannot be classified into any recognized species of the genus Haloferax, and, therefore, two novel species of the genus Haloferax are proposed: Haloferax profundi sp. nov. (type strain SB29T = JCM 19567T = CGMCC 1.14960T) and Haloferax marisrubri sp. nov. (type strain SB3T = JCM 19566T = CGMCC 1.14958T).
  • AttPNet: Attention-Based Deep Neural Network for 3D Point Set Analysis

    Yang, Yufeng; Ma, Yixiao; Zhang, Jing; Gao, Xin; Xu, Min (Sensors, MDPI AG, 2020-09-23) [Article]
    Point set is a major type of 3D structure representation format characterized by its data availability and compactness. Most former deep learning-based point set models pay equal attention to different point set regions and channels, thus having limited ability in focusing on small regions and specific channels that are important for characterizing the object of interest. In this paper, we introduce a novel model named Attention-based Point Network (AttPNet). It uses an attention mechanism for both global feature masking and channel weighting to focus on characteristic regions and channels. There are two branches in our model. The first branch calculates an attention mask for every point. The second branch uses convolution layers to abstract global features from point sets, where a channel attention block is adapted to focus on important channels. Evaluations on the ModelNet40 benchmark dataset show that our model outperforms the existing best model in classification tasks by 0.7% without voting. In addition, experiments on augmented data demonstrate that our model is robust to rotational perturbations and missing points. We also design an Electron Cryo-Tomography (ECT) point cloud dataset and further demonstrate our model’s ability in dealing with fine-grained structures on the ECT dataset.
  • Efficient locality-sensitive hashing over high-dimensional streaming data

    Wang, Hao; Yang, Chengcheng; Zhang, Xiangliang; Gao, Xin (Neural Computing and Applications, Springer Science and Business Media LLC, 2020-09-17) [Article]
    Approximate nearest neighbor (ANN) search in high-dimensional spaces is fundamental in many applications. Locality-sensitive hashing (LSH) is a well-known methodology to solve the ANN problem. Existing LSH-based ANN solutions typically employ a large number of individual indexes optimized for searching efficiency. Updating such indexes might be impractical when processing high-dimensional streaming data. In this paper, we present a novel disk-based LSH index that offers efficient support for both searches and updates. The contributions of our work are threefold. First, we use the write-friendly LSM-trees to store the LSH projections to facilitate efficient updates. Second, we develop a novel estimation scheme to estimate the number of required LSH functions, with which the disk storage and access costs are effectively reduced. Third, we exploit both the collision number and the projection distance to improve the efficiency of candidate selection, improving the search performance with theoretical guarantees on the result quality. Experiments on four real-world datasets show that our proposal outperforms the state-of-the-art schemes.
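    The idea of selecting candidates by their collision number can be illustrated with a minimal in-memory sketch. It assumes the common projection-based LSH family for Euclidean distance, h(v) = floor((a·v + b) / w); the function names and parameters are hypothetical, and the sketch includes neither the LSM-tree storage, the estimation scheme, nor the projection-distance refinement that the abstract describes.

```python
import math
import random


def make_hash_functions(dim, num_functions, bucket_width, seed=0):
    """Draw random projections a ~ N(0, 1)^dim and offsets b ~ U[0, w)."""
    rng = random.Random(seed)
    functions = []
    for _ in range(num_functions):
        a = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        b = rng.uniform(0.0, bucket_width)
        functions.append((a, b))
    return functions


def hash_vector(vector, functions, bucket_width):
    """Return one bucket id per hash function: floor((a.v + b) / w)."""
    keys = []
    for a, b in functions:
        projection = sum(x * y for x, y in zip(a, vector))
        keys.append(math.floor((projection + b) / bucket_width))
    return keys


def collision_count(keys_a, keys_b):
    """Number of hash functions under which two points share a bucket."""
    return sum(1 for x, y in zip(keys_a, keys_b) if x == y)


def candidates(query, data, functions, bucket_width, min_collisions):
    """Indices of points colliding with the query at least min_collisions times."""
    query_keys = hash_vector(query, functions, bucket_width)
    result = []
    for index, point in enumerate(data):
        point_keys = hash_vector(point, functions, bucket_width)
        if collision_count(query_keys, point_keys) >= min_collisions:
            result.append(index)
    return result


functions = make_hash_functions(dim=3, num_functions=8, bucket_width=4.0)
data = [[1.0, 2.0, 3.0], [1.1, 2.0, 2.9], [50.0, -40.0, 30.0]]
print(candidates([1.0, 2.0, 3.0], data, functions, 4.0, min_collisions=6))
```

    Nearby points tend to fall into the same bucket under most projections, so a collision threshold acts as a cheap filter before exact distances are computed; a streaming index would additionally need to persist the bucket ids in an update-friendly structure.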
  • Spark-based parallel calculation of 3D fourier shell correlation for macromolecule structure local resolution estimation

    Lü, Yongchun; Zeng, Xiangrui; Tian, Xinhui; Shi, Xiao; Wang, Hui; Zheng, Xiaohui; Liu, Xiaodong; Zhao, Xiaofang; Gao, Xin; Xu, Min (BMC Bioinformatics, Springer Science and Business Media LLC, 2020-09-17) [Article]
    BACKGROUND: Resolution estimation is the main evaluation criterion for the reconstruction of macromolecular 3D structure in the field of cryoelectron microscopy (cryo-EM). At present, there are many methods to evaluate the 3D resolution for reconstructed macromolecular structures from Single Particle Analysis (SPA) in cryo-EM and subtomogram averaging (SA) in electron cryotomography (cryo-ET). As global methods, they measure the resolution of the structure as a whole, but they are inaccurate in detecting subtle local changes of reconstruction. In order to detect the subtle changes of reconstruction of SPA and SA, a few local resolution methods have been proposed. The mainstream local resolution evaluation methods are based on local Fourier shell correlation (FSC), which is computationally intensive. However, the existing resolution evaluation methods are based on multi-threading implementation on a single computer with very poor scalability. RESULTS: This paper proposes a new fine-grained 3D array partition method by key-value format in Spark. Our method first converts 3D images to key-value data (K-V). Then the K-V data is used for 3D array partitioning and data exchange in parallel. The Spark-based distributed parallel computing framework can thus solve the above scalability problem. In this distributed computing framework, all 3D local FSC tasks are simultaneously calculated across multiple nodes in a computer cluster. On experimental data, the 3D local resolution evaluation algorithm based on Spark fine-grained 3D array partition achieves an order-of-magnitude change in computing speed compared with the mainstream FSC algorithm while the accuracy remains unchanged, and has better fault tolerance and scalability. CONCLUSIONS: In this paper, we proposed a K-V format based fine-grained 3D array partition method in Spark to calculate 3D FSC in parallel and obtain a 3D local resolution density map. The 3D local resolution density map evaluates the three-dimensional density maps reconstructed from single particle analysis and subtomogram averaging. Our proposed method can significantly increase the speed of the 3D local resolution evaluation, which is important for the efficient detection of subtle variations among reconstructed macromolecular structures.
  • Arginine citrullination of proteins as a specific response mechanism in Arabidopsis thaliana

    Marondedze, Claudius; Elia, Giuliano; Thomas, Ludivine; Wong, Aloysius; Gehring, Christoph A (Cold Spring Harbor Laboratory, 2020-09-13) [Preprint]
    Arginine citrullination, also referred to as arginine deimination, is a post-translational modification involved in an increasing number of physiological processes in animals, including histone modifications and transcriptional regulation, and in severe diseases such as rheumatoid arthritis and neurodegenerative conditions. It occurs when arginine side chains are deiminated and converted into side chains of the amino acid citrulline, a process catalysed by a family of Ca2+-dependent peptidyl arginine deiminases (PADs). PADs have been discovered in several mammalian species and in other vertebrates, like birds and fish, but have not been observed in bacteria, lower eukaryotes or higher plants. Here we show, firstly, that the Arabidopsis thaliana proteome does contain citrullinated proteins; secondly and importantly, that the citrullination signature changes in response to cold stress. Among the citrullinated proteins are DNA- or RNA-binding proteins, thus implying a role for it in the control of the transcriptional programming in plant cells. Thirdly, through sequence and structural analysis, we identify one Arabidopsis protein, currently annotated as agmatine deiminase (At5g08170), as a candidate protein arginine deiminase. Finally, we show biochemical evidence that AT5G08170 can citrullinate peptides from LHP1-interacting factor 2 (AT4G00830), an RNA-binding protein that has been identified as citrullinated in cell suspension cultures of Arabidopsis thaliana roots. In addition, we show that, in vitro, agmatine deiminase can undergo auto-citrullination. In conclusion, our work establishes the presence of protein arginine citrullination in higher plants and assigns it a role in post-translational modifications during abiotic stress responses.
  • Rapid Evolution of Plastic-degrading Enzymes Prevalent in the Global Ocean

    Alam, Intikhab; Gasol, Josep M; Arold, Stefan T.; Gojobori, Takashi; Kamau, Allan A; Aalismail, Nojood; Martin, Cecilia; Momin, Afaque Ahmad Imtiyaz; Acinas, Silvia G; Guzmán-Vega, Francisco J.; Agusti, Susana R; Jamil, Tahira; Duarte, Carlos M. (Cold Spring Harbor Laboratory, 2020-09-09) [Preprint]
    Estimates of marine plastic stocks, a major threat to marine life, are far lower than expected from exponentially-increasing litter inputs, suggesting important loss factors. These may involve microbial degradation, as the plastic-degrading polyethylene terephthalate enzyme (PETase) has been reported in marine microbial communities. An assessment of 416 metagenomes of planktonic communities across the global ocean identifies 68 oceanic PETase variants (oPETase) that evolved from ancestral enzymes degrading polycyclic aromatic hydrocarbons. Nearly 20 oPETases have predicted efficiencies comparable to those of laboratory-optimized PETases, suggesting strong selective pressures directing the evolution of these enzymes. We found oPETases in 90.1% of samples across all oceans and depths, particularly abundant at 1,000 m depth, with a strong dominance of Pseudomonadales containing putative highly-efficient oPETase variants in the dark ocean. Enzymatic degradation may be removing plastic from the marine environment while providing a carbon source for bathypelagic microbial communities.
  • Integrated Metabolic Modeling, Culturing, and Transcriptomics Explain Enhanced Virulence of Vibrio cholerae during Coinfection with Enterotoxigenic Escherichia coli.

    Abdel-Haleem, Alyaa M.; Ravikumar, Vaishnavi; Ji, Boyang; Mineta, Katsuhiko; Gao, Xin; Nielsen, J.; Gojobori, Takashi; Mijakovic, Ivan (mSystems, American Society for Microbiology, 2020-09-09) [Article]
    Gene essentiality is altered during polymicrobial infections. Nevertheless, most studies rely on single-species infections to assess pathogen gene essentiality. Here, we use genome-scale metabolic models (GEMs) to explore the effect of coinfection of the diarrheagenic pathogen Vibrio cholerae with another enteric pathogen, enterotoxigenic Escherichia coli (ETEC). Model predictions showed that V. cholerae metabolic capabilities were increased due to ample cross-feeding opportunities enabled by ETEC. This is in line with increased severity of cholera symptoms known to occur in patients with dual infections by the two pathogens. In vitro coculture systems confirmed that V. cholerae growth is enhanced in cocultures relative to single cultures. Further, expression levels of several V. cholerae metabolic genes were significantly perturbed as shown by dual RNA sequencing (RNAseq) analysis of its cocultures with different ETEC strains. A decrease in ETEC growth was also observed, probably mediated by nonmetabolic factors. Single gene essentiality analysis predicted conditionally independent genes that are essential for the pathogen's growth in both single-infection and coinfection scenarios. Our results reveal growth differences that are of relevance to drug targeting and efficiency in polymicrobial infections. IMPORTANCE: Most studies proposing new strategies to manage and treat infections have been largely focused on identifying druggable targets that can inhibit a pathogen's growth when it is the single cause of infection. In vivo, however, infections can be caused by multiple species. This is important to take into account when attempting to develop or use current antibacterials since their efficacy can change significantly between single infections and coinfections. In this study, we used genome-scale metabolic models (GEMs) to interrogate the growth capabilities of Vibrio cholerae in single infections and coinfections with enterotoxigenic E. coli (ETEC), which cooccur in a large fraction of diarrheagenic patients. Coinfection model predictions showed that V. cholerae growth capabilities are enhanced in the presence of ETEC relative to V. cholerae single infection, through cross-fed metabolites made available to V. cholerae by ETEC. In vitro, cocultures of the two enteric pathogens further confirmed model predictions showing an increased growth of V. cholerae in coculture relative to V. cholerae single cultures while ETEC growth was suppressed. Dual RNAseq analysis of the cocultures also confirmed that the transcriptome of V. cholerae was distinct during coinfection compared to single-infection scenarios where processes related to metabolism were significantly perturbed. Further, in silico gene-knockout simulations uncovered discrepancies in gene essentiality for V. cholerae growth between single infections and coinfections. Integrative model-guided analysis thus identified druggable targets that would be critical for V. cholerae growth in both single infections and coinfections; thus, designing inhibitors against those targets would provide a broader spectrum of coverage against cholera infections.
  • Phenotypic and molecular spectrum of pyridoxamine-5'-phosphate oxidase (PNPO) deficiency: a scoping review of 87 cases of PNPO deficiency.

    Alghamdi, Malak; Bashiri, Fahad A; Abdelhakim, Marwa; Adly, Nouran; Jamjoom, Dima Z; Sumaily, Khalid M; Alghanem, Bandar; Arold, Stefan T. (Clinical genetics, Wiley, 2020-09-05) [Article]
    Pyridoxamine-5'-phosphate oxidase (PNPO) deficiency is an autosomal recessive PLP-vitamin-responsive epileptic encephalopathy. The emerging feature of PNPO deficiency is the occurrence of refractory seizures in the first year of life. Prematurity and fetal distress, combined with neonatal seizures, are other associated key characteristics. The phenotype results from a dependency on pyridoxal 5'-phosphate (PLP), which regulates several enzymes in the body. We present the phenotypic and genotypic spectrum of PNPO deficiency based on a literature review (2002 to 2020) of reports (n = 33) of patients with confirmed PNPO deficiency (n = 87). All patients who received PLP (n = 36) showed a clinical response, with a complete dramatic PLP response with seizure cessation observed in 61% of patients. In spite of effective seizure control with PLP, approximately 56% of patients affected with PLP-dependent epilepsy suffer developmental delay/intellectual disability. There is no diagnostic biomarker, and molecular testing is required for diagnosis. However, we noted that cerebrospinal fluid (CSF) PLP was low in 81%, CSF glycine was high in 80% and urinary VLA was high in 91% of the cases. We observed only a weak correlation between the severity of PNPO protein disruption and disease outcomes, indicating the importance of other factors, including seizure onset and time of therapy initiation. We found that prematurity, the delay in initiation of PLP therapy and early onset of seizures correlate with a poor neurocognitive outcome. Given the amenability of PNPO to PLP therapy for seizure control, early diagnosis is essential.
  • EMC10 Homozygous Variant Identified in a Family with Global Developmental Delay, Mild Intellectual Disability, and Speech Delay.

    Umair, Muhammad; Ballow, Mariam; Asiri, Abdulaziz; Alyafee, Yusra; Al Tuwaijri, Abeer; Alhamoudi, Kheloud M; Aloraini, Taghrid; Abdelhakim, Marwa; Althagafi, Azza Th.; Kafkas, Senay; Alsubaie, Lamia; Alrifai, Muhammad Talal; Hoehndorf, Robert; Alfares, Ahmed; Alfadhel, Majid (Clinical genetics, Wiley, 2020-09-02) [Article]
    In recent years, several genes have been implicated in the variable disease presentation of global developmental delay (GDD) and intellectual disability (ID). The endoplasmic reticulum membrane protein complex (EMC) family is known to be involved in GDD and ID. Homozygous variants of EMC1 are associated with GDD, scoliosis, and cerebellar atrophy, indicating the relevance of this pathway for neurogenetic disorders. EMC10 is a bone marrow-derived angiogenic growth factor that plays an important role in infarct vascularization and promoting tissue repair. However, this gene has not been previously associated with human disease. Herein, we describe a Saudi family with two individuals segregating a recessive neurodevelopmental disorder. Both of the affected individuals showed mild ID, speech delay, and GDD. Whole-exome sequencing (WES) and Sanger sequencing were performed to identify candidate genes. Further, to elucidate the functional effects of the variant, quantitative real-time PCR (RT-qPCR)-based expression analysis was performed. WES revealed a homozygous splice acceptor site variant (c.679-1G > A) in EMC10 (chromosome 19q13.33) that segregated perfectly within the family. RT-qPCR showed a substantial decrease in the relative EMC10 gene expression in the patients, indicating the pathogenicity of the identified variant. For the first time in the literature, the EMC10 gene variant was associated with mild ID, speech delay, and GDD. Thus, this gene plays a key role in developmental milestones, with the potential to cause neurodevelopmental disorders in humans.
  • A genetic barcode of SARS-CoV-2 for monitoring global distribution of different clades during the COVID-19 pandemic.

    Guan, Qingtian; Sadykov, Mukhtar; Mfarrej, Sara; Hala, Sharif; Naeem, Raeece; Nugmanova, Raushan; Al-Omari, Awad; Salih, Samer; Mutair, Abbas Al; Carr, Michael J; Hall, William W; Arold, Stefan T.; Pain, Arnab (International journal of infectious diseases : IJID : official publication of the International Society for Infectious Diseases, Elsevier BV, 2020-08-26) [Article]
    The SARS-CoV-2 pathogen has established endemicity in humans. This necessitates the development of rapid genetic surveillance methodologies to serve as an adjunct to existing comprehensive, albeit slower, genome sequencing-driven approaches. A total of 21,789 complete genomes were downloaded from GISAID on May 28, 2020 for analyses. We have defined the major clades and subclades of circulating SARS-CoV-2 genomes. A rapid sequencing-based genotyping protocol was developed and tested on SARS-CoV-2-positive RNA samples by next-generation sequencing. We describe 11 major mutations which defined five major clades (G614, S84, V251, I378 and D392) of globally circulating viral populations. The clades can be specifically identified using an 11-nucleotide genetic barcode. An analysis of amino acid variation in SARS-CoV-2 proteins provided evidence of substitution events in the viral proteins involved in both host entry and genome replication. Globally circulating SARS-CoV-2 genomes could be classified into 5 major clades based on mutational profiles defined by an 11-nucleotide barcode. We have successfully developed a multiplexed sequencing-based, rapid genotyping protocol for high-throughput classification of major clade types of SARS-CoV-2 in clinical samples. This barcoding strategy will be required to monitor decreases in genetic diversity as treatment and vaccine approaches become widely available.
  • Novel tumour suppressor roles for GZMA and RASGRP1 in Theileria annulata-transformed macrophages and human B-lymphoma cells

    Rchiad, Zineb; Haidar, Malak; Ansari, Hifzur Rahman; Tajeri, Shahin; Mfarrej, Sara; Ben Rached, Fathia; Kaushik, Abhinav; Langsley, Gordon; Pain, Arnab (Cellular Microbiology, Wiley, 2020-08-24) [Article]
    Theileria annulata is a tick-transmitted apicomplexan parasite that infects and transforms bovine leukocytes into disseminating tumours that cause a disease called tropical theileriosis. Using comparative transcriptomics we identified genes transcriptionally perturbed during Theileria-induced leukocyte transformation. Dataset comparisons highlighted a small set of genes associated with Theileria-transformed leukocyte dissemination. The roles of Granzyme A (GZMA) and RAS guanyl-releasing protein 1 (RASGRP1) were verified by CRISPR/Cas9-mediated knockdown. Knocking down expression of GZMA and RASGRP1 in attenuated macrophages led to a regain in their dissemination in Rag2/γC mice confirming their role as dissemination suppressors in vivo. We further evaluated the roles of GZMA and RASGRP1 in human B-lymphomas by comparing the transcriptome of 934 human cancer cell lines to that of Theileria-transformed bovine host cells. We confirmed dampened dissemination potential of human B-lymphomas that overexpress GZMA and RASGRP1. Our results provide evidence that GZMA and RASGRP1 have a novel tumour suppressor function in both T. annulata-infected bovine host leukocytes and in human B-lymphomas.
