Recent Submissions

  • Cancer-associated mutations in the p85α N-terminal SH2 domain activate a spectrum of receptor tyrosine kinases.

    Li, Xinran; Lau, Amy Y T; Ng, Angel S N; Aldehaiman, Abdullah; Zhou, Yuan; Ng, Patrick K S; Arold, Stefan T.; Cheung, Lydia W T (Proceedings of the National Academy of Sciences of the United States of America, 2021-09-11) [Article]
    The phosphoinositide 3-kinase regulatory subunit p85α is a key regulator of kinase signaling and is frequently mutated in cancers. In the present study, we showed that in addition to weakening the inhibitory interaction between p85α and p110α, a group of driver mutations in the p85α N-terminal SH2 domain activated EGFR, HER2, HER3, c-Met, and IGF-1R in a p110α-independent manner. Cancer cells expressing these mutations exhibited the activation of p110α and the AKT pathway. Interestingly, the activation of EGFR, HER2, and c-Met was attributed to the ability of driver mutations to inhibit HER3 ubiquitination and degradation. The resulting increase in HER3 protein levels promoted its heterodimerization with EGFR, HER2, and c-Met, as well as the allosteric activation of these dimerized partners; however, HER3 silencing abolished this transactivation. Accordingly, inhibitors of either AKT or the HER family reduced the oncogenicity of driver mutations. The combination of these inhibitors resulted in marked synergy. Taken together, our findings provide mechanistic insights and suggest therapeutic strategies targeting a class of recurrent p85α mutations.
  • ReFeaFi: Genome-wide prediction of regulatory elements driving transcription initiation

    Umarov, Ramzan; Li, Yu; Arakawa, Takahiro; Takizawa, Satoshi; Gao, Xin; Arner, Erik (PLOS Computational Biology, Public Library of Science (PLoS), 2021-09-07) [Article]
    Regulatory elements control gene expression through transcription initiation (promoters) and by enhancing transcription at distant regions (enhancers). Accurate identification of regulatory elements is fundamental for annotating genomes and understanding gene expression patterns. While there are many attempts to develop computational promoter and enhancer identification methods, reliable tools to analyze long genomic sequences are still lacking. Prediction methods often perform poorly on the genome-wide scale because the number of negatives is much higher than that in the training sets. To address this issue, we propose a dynamic negative set updating scheme with a two-model approach, using one model for scanning the genome and the other one for testing candidate positions. The developed method achieves good genome-level performance and maintains robust performance when applied to other vertebrate species, without re-training. Moreover, the unannotated predicted regulatory regions made on the human genome are enriched for disease-associated variants, suggesting them to be potentially true regulatory elements rather than false positives. We validated high scoring “false positive” predictions using reporter assay and all tested candidates were successfully validated, demonstrating the ability of our method to discover novel human regulatory regions.
  • The genome of the zoonotic malaria parasite Plasmodium simium reveals adaptions to host-switching

    Mourier, Tobias; de Alvarenga, Denise Anete Madureira; Kaushik, Abhinav; de Pina-Costa, Anielle; Douvropoulou, Olga; Guan, Qingtian; Guzmán-Vega, Francisco J.; Forrester, Sarah; de Abreu, Filipe Vieira Santos; Júnior, Cesare Bianco; Junior, Julio Cesar de Souza; Moreira, Silvia Bahadian; Hirano, Zelinda Maria Braga; Pissinatti, Alcides; Ferreira-da-Cruz, Maria de Fátima; de Oliveira, Ricardo Lourenço; Arold, Stefan T.; Jeffares, Daniel C.; Brasil, Patrícia; de Brito, Cristiana Ferreira Alves; Culleton, Richard; Daniel-Ribeiro, Cláudio Tadeu; Pain, Arnab (NCBI, 2021-09-05) [Bioproject, Dataset]
    Plasmodium simium, a parasite of various species of Platyrrhini monkeys whose range is restricted to the Atlantic Forest of Brazil, is genetically and morphologically similar to P. vivax. Based on this similarity, it appears likely that P. simium originated as a parasite of monkeys in Brazil following a host switch from humans carrying P. vivax. The 2015 outbreak of P. simium in the local human population raised questions about the degree of divergence that has occurred between P. vivax and P. simium, and whether adaptation to monkeys has led to the evolution of a parasite with clinical relevance to human health that differs from that of P. vivax. The degree and nature of adaptation to a non-human primate host and a sylvatic transmission cycle that has occurred in P. simium following its anthroponotic origin is of relevance to the understanding of how malaria parasites adapt to new hosts. It is also of interest to determine whether the current, human-infecting P. simium parasites have recently undergone changes at the genomic level that have allowed them to infect people in this region, as it has previously been suggested that P. simium has historically lacked the ability to infect man.
  • Sustainable and Eco-Friendly Coral Restoration through 3D Printing and Fabrication

    Albalawi, Hamed I.; Khan, Zainab N.; Valle-Pérez, Alexander U.; Kahin, Kowther M.; Hountondji, Maria; Alwazani, Hibatallah; Schmidt-Roach, Sebastian; Bilalis, Panagiotis; Aranda, Manuel; Duarte, Carlos M.; Hauser, Charlotte (ACS Sustainable Chemistry & Engineering, American Chemical Society (ACS), 2021-09-02) [Article]
    Coral reef degradation is a rising problem, driven by marine heatwaves, the spread of coral diseases, and human impact by overfishing and pollution. Our capacity to restore coral reefs lags behind in terms of scale, effectiveness, and cost-efficiency. While common restoration efforts rely on the formation of carbonate skeletons on structural frames for supported coral growth, this technique is a rate-limiting step in the growth of scleractinian corals. Reverse engineering and additive manufacturing technologies offer an innovative shift in approach from the use of concrete blocks and metal frames to sophisticated efforts that use scanned geometries of harvested corals to fabricate artificial coral skeletons for installation in coral gardens and reefs. Herein, we present an eco-friendly and sustainable approach for coral fabrication by merging three-dimensional (3D) scanning, 3D printing, and molding techniques. Our method, 3D CoraPrint, exploits the 3D printing technology to fabricate artificial natural-based coral skeletons, expediting the growth rate of live coral fragments and quickening the reef transplantation process while minimizing nursery costs. It allows for flexibility, customization, and fast return time with an enhanced level of accuracy, thus establishing an environmentally friendly, scalable model for coral fabrication to boost restorative efforts around the globe.
  • Machine Learning and Deep Learning Methods that use Omics Data for Metastasis Prediction

    Albaradei, Somayah; Thafar, Maha A.; Alsaedi, Asim; Van Neste, Christophe Marc; Gojobori, Takashi; Essack, Magbubah; Gao, Xin (Computational and Structural Biotechnology Journal, Elsevier BV, 2021-09) [Article]
    Knowing that metastasis is the primary cause of cancer-related deaths has incentivized research directed towards unraveling the complex cellular processes that drive metastasis. Advancement in technology, and specifically the advent of high-throughput sequencing, provides knowledge of such processes. This knowledge led to the development of therapeutic and clinical applications, and is now being used to predict the onset of metastasis to improve diagnostics and disease therapies. In this regard, predicting metastasis onset has also been explored using artificial intelligence approaches based on machine learning and, more recently, deep learning. This review summarizes the different machine learning and deep learning-based metastasis prediction methods developed to date. We also detail the different types of molecular data used to build the models and the critical signatures derived from the different methods. We further highlight the challenges associated with using machine learning and deep learning methods, and provide suggestions to improve the predictive performance of such methods.
  • Seasonal and annual changes in the microbial communities of Ofunato Bay, Japan, based on metagenomics.

    Kobiyama, Atsushi; Rashid, Jonaira; Reza, Md Shaheed; Ikeda, Yuri; Yamada, Yuichiro; Kudo, Toshiaki; Mizusawa, Nanami; Yanagisawa, Saki; Ikeda, Daisuke; Sato, Shigeru; Ogata, Takehiko; Ikeo, Kazuho; Kaga, Shinnosuke; Watanabe, Shiho; Naiki, Kimiaki; Kaga, Yoshimasa; Segawa, Satoshi; Tada, Yumiko; Musashi, Tatsuya; Mineta, Katsuhiko; Gojobori, Takashi; Watabe, Shugo (Scientific reports, 2021-08-27) [Article]
    Five years of datasets from 2015 to 2019 of whole genome shotgun sequencing for cells trapped on 0.2-µm filters of seawater collected monthly from Ofunato Bay, an enclosed bay in Japan, were analysed, which included the 2015 data that we had reported previously. Nucleotide sequences were determined for extracted DNA from three locations for both the upper (1 m) and deeper (8 or 10 m) depths. The biotic communities analysed at the domain level comprised bacteria, eukaryotes, archaea and viruses. The relative abundance of bacteria was over 60% in most months for the five years. The relative abundance of the SAR86 cluster was highest in the bacterial group, followed by Candidatus Pelagibacter and Planktomarina. The relative abundance of Ca. Pelagibacter showed no relationship with environmental factors, and those of SAR86 and Planktomarina showed positive correlations with salinity and dissolved oxygen, respectively. The bacterial community diversity showed seasonal changes, with high diversity around September and low diversity around January for all five years. Nonmetric multidimensional scaling analysis also revealed that the bacterial communities in the bay were grouped in a season-dependent manner and linked with environmental variables such as seawater temperature, salinity and dissolved oxygen.
  • Linking common human diseases to their phenotypes; development of a resource for human phenomics

    Kafkas, Senay; Althubaiti, Sara; Gkoutos, Georgios V.; Hoehndorf, Robert; Schofield, Paul N. (Journal of Biomedical Semantics, Springer Science and Business Media LLC, 2021-08-23) [Article]
    Background: In recent years a large volume of clinical genomics data has become available due to rapid advances in sequencing technologies. Efficient exploitation of this genomics data requires linkage to patient phenotype profiles. Current resources providing disease-phenotype associations are not comprehensive, and they often do not have broad coverage of the disease terminologies, particularly ICD-10, which is still the primary terminology used in clinical settings. Methods: We developed two approaches to gather disease-phenotype associations. First, we used a text mining method that utilizes semantic relations in phenotype ontologies, and applies statistical methods to extract associations between diseases in ICD-10 and phenotype ontology classes from the literature. Second, we developed a semi-automatic way to collect ICD-10–phenotype associations from existing resources containing known relationships. Results: We generated four datasets. Two of them are independent datasets linking diseases to their phenotypes based on text mining and semi-automatic strategies. The remaining two datasets are generated from these datasets and cover a subset of ICD-10 classes of common diseases contained in UK Biobank. We extensively validated our text mined and semi-automatically curated datasets by: comparing them against an expert-curated validation dataset containing disease–phenotype associations, measuring their similarity to disease–phenotype associations found in public databases, and assessing how well they could be used to recover gene–disease associations using phenotype similarity. Conclusion: We find that our text mining method can produce phenotype annotations of diseases that are correct but often too general to have significant information content, or too specific to accurately reflect the typical manifestations of the sporadic disease. On the other hand, the datasets generated from integrating multiple knowledgebases are more complete (i.e., cover more of the required phenotype annotations for a given disease). We make all data freely available at 10.5281/zenodo.4726713.
  • Molecular Basis for Environment Sensing by a Nucleoid-Structuring Bacterial Protein Filament

    Zhao, Xiaochuan; Remington, Jacob M.; Schneebeli, Severin T.; Arold, Stefan T.; Li, Jianing (The Journal of Physical Chemistry Letters, American Chemical Society (ACS), 2021-08-12) [Article]
    The histone-like nucleoid structuring (H-NS) protein controls the expression of hundreds of genes in Gram-negative bacteria through its capability to coat and condense DNA. This mechanism requires the formation of superhelical H-NS protein filaments that are sensitive to temperature and salinity, allowing H-NS to act as an environment sensor. We use multiscale modeling and simulations to obtain detailed insights into the mechanism of the H-NS filament's sensitivity to environmental changes. Through the simulations of the superhelical H-NS filament, we reveal how different environments induce heterogeneity of H-NS monomers. Further, we observe that transient self-association within the H-NS filament creates temperature-inducible strain and might mildly oppose DNA binding. We also probe different H-NS-DNA complex architectures and show that complexation enhances the stability of both DNA and H-NS superhelices. Overall, our results provide unprecedented molecular insights into the environmental sensing and DNA interactions of a prototypical nucleoid-structuring bacterial protein filament.
  • Model predictive control paradigms for fish growth reference tracking in precision aquaculture

    Chahid, Abderrazak; Ndoye, Ibrahima; Majoris, John E.; Berumen, Michael L.; Laleg-Kirati, Taous-Meriem (Journal of Process Control, Elsevier BV, 2021-08-12) [Article]
    In precision aquaculture, the primary goal is to maximize biomass production while minimizing production costs. This objective can be achieved by optimizing factors that have a strong influence on fish growth, such as feeding rate, temperature, and dissolved oxygen. This paper provides a comparative study of four model predictive control (MPC) strategies for fish growth reference tracking under a representative bioenergetic growth model in precision aquaculture. We propose to evaluate four candidate MPC formulations for fish growth reference tracking based on the receding horizon. The first MPC formulation tracks a desired fish growth trajectory while penalizing the feed ration, temperature, and dissolved oxygen. The second MPC optimization strategy directly optimizes the feed conversion ratio (FCR), which is the ratio between food quantity and fish weight gain at each sampling time. This FCR-like optimization strategy minimizes the feed while maximizing the predicted growth state's deviation from the given reference growth trajectory. The third MPC formulation includes a tradeoff between the growth rate trajectory tracking, the dynamic energy, and food cost. The last MPC formulation aims to maximize the fish growth rate while minimizing the costs. Numerical simulations that integrate a realistic bioenergetic fish growth model of Nile tilapia (Oreochromis niloticus) are illustrated to examine the comparative performance of the four proposed optimal control strategies. A sensitivity analysis is conducted to study the robustness of these MPC strategies with respect to the effect of the prediction horizon, the regularization term, and the additive input disturbances to the process. The obtained results show great potential and flexibility to meet the fish farmers’ needs depending on the type of fish, selling price, culture duration, and feed cost.
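    The feed conversion ratio described in the abstract above (food quantity divided by fish weight gain at each sampling time) can be illustrated with a minimal sketch. The function name and the numeric values are hypothetical and are not taken from the paper; the sketch computes only the ratio itself, not any of the MPC formulations.

```python
def feed_conversion_ratio(feed_mass: float, weight_gain: float) -> float:
    """Return feed mass divided by weight gain over one sampling interval."""
    if weight_gain <= 0:
        # The ratio is undefined when the fish did not gain weight.
        raise ValueError("weight gain must be positive for FCR to be defined")
    return feed_mass / weight_gain


# Hypothetical per-interval values (same mass unit for feed and gain).
feed_per_step = [12.0, 15.0, 18.0]
gain_per_step = [8.0, 10.0, 9.0]

fcr_per_step = [
    feed_conversion_ratio(feed, gain)
    for feed, gain in zip(feed_per_step, gain_per_step)
]
print(fcr_per_step)  # [1.5, 1.5, 2.0]
```

    A lower value means less feed is needed per unit of growth, which is why a strategy built on this ratio tries to reduce feed relative to the growth achieved.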
  • Established and candidate transthyretin amyloidosis variants identified in the Saudi population by data mining

    Abouelhoda, Mohamed; Mohty, Dania; Alayary, Islam; Meyer, Brian F.; Arold, Stefan T.; Fadel, Bahaa M.; Monies, Dorota (Human Genomics, Springer Science and Business Media LLC, 2021-08-11) [Article]
    Background: Familial transthyretin (TTR) amyloidosis (ATTR) is an autosomal dominant disease with significant phenotypic heterogeneity. Its prevalence in Saudi Arabia has not previously been investigated. An existing exome variant database of Saudi individuals, sequenced to globally investigate rare diseases in the population, was mined for TTR variants and filtered for missense mutations resulting in single amino acid changes. A total of 13,906 Saudi exomes from unrelated individuals were analyzed blindly. Results: Three TTR variants known to be associated with ATTR amyloidosis were identified. Additionally, three novel TTR mutations were identified. Structural analysis of the three novel variants suggests that at least two could be amyloidogenic. The most common variant associated with amyloidosis was p.Val142Ile (allele frequency 0.001). Further investigation of these variants and their translation to clinical practice may help to diagnose, monitor, and manage patients with ATTR amyloidosis. Conclusion: Multiple TTR variants potentially associated with systemic ATTR amyloidosis were identified in the Saudi population. Early diagnosis and intervention, facilitated by familial genetic testing of patients with ATTR amyloidosis, may benefit the management of this disease. Early diagnosis could be enhanced through inclusion of ATTR variants in existing population-based screening programs. Keywords: Transthyretin, Amyloidosis, Familial, Saudi population, Epidemiology
  • The COVID-19 epidemiology and monitoring ontology

    Queralt-Rosinach, Núria; Schofield, Paul N.; Hoehndorf, Robert; Weiland, Claus; Schultes, Erik Anthony; Bernabé, César Henrique; Roos, Marco (Center for Open Science, 2021-08-11) [Preprint]
    The novel COVID-19 infectious disease emerged and spread, causing high mortality and morbidity rates worldwide. In the OBO Foundry, there are more than one hundred ontologies to share and analyse large-scale datasets for biological and biomedical sciences. However, this pandemic revealed that we lack tools for an efficient and timely exchange of this epidemiological data which is necessary to assess the impact of disease outbreaks, the efficacy of mitigating interventions and to provide a rapid response. In this study we present our findings and contributions for the bio-ontologies community.
  • Expanding the mutational landscape and clinical phenotype of the YIF1B related brain disorder

    Salsench, Eva Medico; Maroofian, Reza; Deng, Ruizhi; Lanko, Kristina; Nikoncuk, Anita; Pérez, Belén; Sánchez-Lijarcio, Obdulia; Ibáñez-Mico, Salvador; Wojcik, Antonina; Vargas, Marcelo; Al-Sannaa, Nouriya Abbas; Girgis, Marian Y; Silveira, Tainá Regina Damaceno; Bauer, Peter; Schroeder, Audrey; Fong, Chin-To; Begtrup, Amber; Babaei, Meisam; Toosi, Mehran Beiraghi; Ashrafzadeh, Farah; Imannezhad, Shima; Doosti, Mohammad; Ahangari, Najmeh; Torbati, Paria Najarzadeh; Karimiani, Ehsan Ghayoor; Murphy, David; Cali, Elisa; Kaya, Ibrahim H; AlMuhaizea, Mohammad; Colak, Dilek; Cardona-Londoño, Kelly J; Arold, Stefan T.; Houlden, Henry; Bertoli-Avella, Aida; Kaya, Namik; Barakat, Tahsin Stefan (Brain, Oxford University Press (OUP), 2021-08-09) [Article]
    With great interest we read the article by Diaz and colleagues¹ providing further evidence of a neurodevelopmental disorder caused by bi-allelic variants disrupting the function of YIF1B, by reporting a second patient cohort and a mouse model. We had earlier reported 6 individuals from 5 unrelated families, harboring bi-allelic protein truncating mutations in YIF1B, presenting with a progressive encephalopathy with various degrees of movement disorders, microcephaly and epilepsy².
  • Evaluating Semantic Similarity Methods for Comparison of Text-derived Phenotype Profiles

    Slater, Luke T; Russell, Sophie; Makepeace, Silver; Carberry, Alexander; Karwath, Andreas; Williams, John A; Fanning, Hilary; Ball, Simon; Hoehndorf, Robert; Gkoutos, Georgios (Cold Spring Harbor Laboratory, 2021-08-09) [Preprint]
    Semantic similarity is a valuable tool for analysis in biomedicine. When applied to phenotype profiles derived from clinical text, it has the capacity to enable and enhance 'patient-like me' analyses, automated coding, differential diagnosis, and outcome prediction, by leveraging the wealth of background knowledge provided by biomedical ontologies. While a large body of work exists exploring the use of semantic similarity for multiple tasks, including protein interaction prediction and rare disease differential diagnosis, there is less work exploring comparison of patient phenotype profiles for clinical tasks. Moreover, there are no experimental explorations of optimal parameters or methods in the area. In this work, we develop a reproducible platform for benchmarking experimental conditions for patient phenotype similarity. Using the platform, we evaluate the task of ranking shared primary diagnosis from uncurated phenotype profiles derived from text narrative associated with admissions in MIMIC-III. In doing this, we identify and interpret the performance of a large number of semantic similarity measures for this task, and provide a basis for further research on related tasks in the area.
  • MetaCancer: A Deep Learning-Based Pan-cancer Metastasis Prediction Model Developed using Multi-omics Data

    Albaradei, Somayah; Napolitano, Francesco; Thafar, Maha A.; Gojobori, Takashi; Essack, Magbubah; Gao, Xin (Computational and Structural Biotechnology Journal, Elsevier BV, 2021-08-09) [Article]
    Predicting metastasis in the early stages means that clinicians have more time to adjust a treatment regimen to target the primary and metastasized cancer. In this regard, several computational approaches are being developed to identify metastasis early. However, most of the approaches focus on changes on one genomic level only, and they are not being developed from a pan-cancer perspective. Thus, we here present a deep learning (DL)–based model, MetaCancer, that differentiates pan-cancer metastasis status based on three heterogeneous data layers. In particular, we built the DL-based model using 400 patients' data that includes RNA sequencing (RNA-Seq), microRNA sequencing (microRNA-Seq), and DNA methylation data from The Cancer Genome Atlas (TCGA). We quantitatively assess the proposed convolutional variational autoencoder (CVAE) and alternative feature extraction methods. We further show that integrating mRNA, microRNA, and DNA methylation data as features improves our model's performance compared to when we used mRNA data only. In addition, we show that the mRNA-related features make a more significant contribution when attempting to distinguish the primary tumors from metastatic ones computationally. Lastly, we show that our DL model significantly outperformed a machine learning (ML) ensemble method based on various metrics.
  • Delineating the molecular and phenotypic spectrum of the SETD1B-related syndrome.

    Weerts, Marjolein J A; Lanko, Kristina; Guzmán-Vega, Francisco J.; Jackson, Adam; Ramakrishnan, Reshmi; Cardona-Londoño, Kelly J; Peña-Guerra, Karla A; van Bever, Yolande; van Paassen, Barbara W; Kievit, Anneke; van Slegtenhorst, Marjon; Allen, Nicholas M; Kehoe, Caroline M; Robinson, Hannah K; Pang, Lewis; Banu, Selina H; Zaman, Mashaya; Efthymiou, Stephanie; Houlden, Henry; Järvelä, Irma; Lauronen, Leena; Määttä, Tuomo; Schrauwen, Isabelle; Leal, Suzanne M; Ruivenkamp, Claudia A L; Barge-Schaapveld, Daniela Q C M; Peeters-Scholte, Cacha M P C D; Galehdari, Hamid; Mazaheri, Neda; Sisodiya, Sanjay M; Harrison, Victoria; Sun, Angela; Thies, Jenny; Pedroza, Luis Alberto; Lara-Taranchenko, Yana; Chinn, Ivan K; Lupski, James R; Garza-Flores, Alexandra; McGlothlin, Jeffery; Yang, Lin; Huang, Shaoping; Wang, Xiaodong; Jewett, Tamison; Rosso, Gretchen; Lin, Xi; Mohammed, Shehla; Merritt, J Lawrence; Mirzaa, Ghayda M; Timms, Andrew E; Scheck, Joshua; Elting, Mariet W; Polstra, Abeltje M; Schenck, Lauren; Ruzhnikov, Maura R Z; Vetro, Annalisa; Montomoli, Martino; Guerrini, Renzo; Koboldt, Daniel C; Mosher, Theresa Mihalic; Pastore, Matthew T; McBride, Kim L; Peng, Jing; Pan, Zou; Willemsen, Marjolein; Koning, Susanne; Turnpenny, Peter D; de Vries, Bert B A; Gilissen, Christian; Pfundt, Rolph; Lees, Melissa; Braddock, Stephen R; Klemp, Kara C; Vansenne, Fleur; van Gijn, Marielle E; Quindipan, Catherine; Deardorff, Matthew A; Hamm, J Austin; Putnam, Abbey M; Baud, Rebecca; Walsh, Laurence; Lynch, Sally A; Baptista, Julia; Person, Richard E; Monaghan, Kristin G; Crunk, Amy; Keller-Ramey, Jennifer; Reich, Adi; Elloumi, Houda Zghal; Alders, Marielle; Kerkhof, Jennifer; McConkey, Haley; Haghshenas, Sadegheh; Genomics England Research Consortium; Maroofian, Reza; Sadikovic, Bekim; Banka, Siddharth; Barakat, Tahsin Stefan (Genetics in medicine : official journal of the American College of Medical Genetics, Springer Science and Business Media LLC, 2021-08-04) [Article]
    Purpose: Pathogenic variants in SETD1B have been associated with a syndromic neurodevelopmental disorder including intellectual disability, language delay, and seizures. To date, clinical features have been described for 11 patients with (likely) pathogenic SETD1B sequence variants. This study aims to further delineate the spectrum of the SETD1B-related syndrome based on characterizing an expanded patient cohort. Methods: We perform an in-depth clinical characterization of a cohort of 36 unpublished individuals with SETD1B sequence variants, describing their molecular and phenotypic spectrum. Selected variants were functionally tested using in vitro and genome-wide methylation assays. Results: Our data present evidence for a loss-of-function mechanism of SETD1B variants, resulting in a core clinical phenotype of global developmental delay, language delay including regression, intellectual disability, autism and other behavioral issues, and variable epilepsy phenotypes. Developmental delay appeared to precede seizure onset, suggesting SETD1B dysfunction impacts physiological neurodevelopment even in the absence of epileptic activity. Males are significantly overrepresented and more severely affected, and we speculate that sex-linked traits could affect susceptibility to penetrance and the clinical spectrum of SETD1B variants. Conclusion: Insights from this extensive cohort will facilitate the counseling regarding the molecular and phenotypic landscape of newly diagnosed patients with the SETD1B-related syndrome.
  • DTI-Voodoo: machine learning over interaction networks and ontology-based background knowledge predicts drug–target interactions

    Hinnerichs, Tilman; Hoehndorf, Robert (Bioinformatics, Oxford University Press (OUP), 2021-07-28) [Article]
    Motivation: In silico drug–target interaction (DTI) prediction is important for drug discovery and drug repurposing. Approaches to predict DTIs can proceed indirectly, top-down, using phenotypic effects of drugs to identify potential drug targets, or they can be direct, bottom-up and use molecular information to directly predict binding affinities. Both approaches can be combined with information about interaction networks. Results: We developed DTI-Voodoo as a computational method that combines molecular features and ontology-encoded phenotypic effects of drugs with protein–protein interaction networks, and uses a graph convolutional neural network to predict DTIs. We demonstrate that drug effect features can exploit information in the interaction network whereas molecular features do not. DTI-Voodoo is designed to predict candidate drugs for a given protein; we use this formulation to show that common DTI datasets contain intrinsic biases with major effects on performance evaluation and comparison of DTI prediction methods. Using a modified evaluation scheme, we demonstrate that DTI-Voodoo improves significantly over state-of-the-art DTI prediction methods.
  • A Multilayer Perceptron-based Carotid-to-Femoral Pulse Wave Velocity Estimation using PPG Signal

    Bahloul, Mohamed; Chahid, Abderrazak; Laleg-Kirati, Taous-Meriem (IEEE, 2021-07-27) [Conference Paper]
    Cardiovascular diseases (CVDs) are the primary cause of death in the world. The development of easy-to-use, non-invasive methods for monitoring and diagnosing CVDs is crucial. Among the key parameters in the cardiovascular system is arterial stiffness. An increase in arterial stiffness is considered a primary risk factor for CVDs. Although arterial stiffness can be assessed non-invasively by measuring the carotid-to-femoral pulse wave velocity (cf-PWV), which is considered the gold standard for arterial stiffness measurement, the clinical process of assessing this parameter is very intrusive and complicated. This paper investigated the potential of estimating cf-PWV from distal photoplethysmogram (PPG) waveforms using a regression technique based on a multilayer perceptron. Functionally, PPG offers a simple, reliable, low-cost technique to measure blood volume change and hence assess cardiovascular function. In this work, we identify and select features from the timing fiducial points-based PPG and its first, second, and third derivative waveforms. The in-silico validation shows promising results and satisfactory accuracy. It demonstrates good estimation performance with an R² (correlation coefficient) around 0.95 and MAPE (mean absolute percentage error) less than 2.22% based on features extracted from PPG at the brachial artery level, an R² around 0.98 and MAPE less than 1.71% based on features extracted from PPG at the radial artery level, and an R² around 0.97 and MAPE less than 1.88% based on features extracted from PPG at the digital artery level.
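    The two metrics quoted in the abstract above can be sketched as follows. This is an illustration only: it assumes R² denotes the coefficient of determination (the abstract labels it a correlation coefficient, and a squared Pearson correlation could give a different value), and the sample numbers are hypothetical rather than results from the paper.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    ratios = [abs((t - p) / t) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(ratios) / len(ratios)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical reference and estimated pulse wave velocities (m/s).
reference = [6.0, 8.0, 10.0, 12.0]
estimated = [6.3, 7.6, 10.0, 12.6]

print(round(mape(reference, estimated), 2))       # 3.75
print(round(r_squared(reference, estimated), 4))  # 0.9695
```

    MAPE reports the average relative error per sample, while R² reports how much of the variance in the reference values the estimates account for, so the two are usually read together.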
  • Genes and comorbidities of thyroid cancer

    Ljubic, Branimir; Pavlovski, Martin; Roychoudhury, Shoumik; Van Neste, Christophe Marc; Salhi, Adil; Essack, Magbubah; Bajic, Vladimir B.; Obradovic, Zoran (Informatics in Medicine Unlocked, Elsevier BV, 2021-07-26) [Article]
    Introduction: Thyroid cancer represents 3.1% of diagnosed cancers in the United States. The objective of this research was to identify comorbidities and discover additional genes potentially related to thyroid cancer and improve current knowledge of genetics and comorbidities associated with this cancer. Methods: The Healthcare Cost and Utilization Project (HCUP) California State Inpatient Database (SID) was used to extract and rank the comorbidities of thyroid cancer. The text mining software BeFree was used to identify and extract genes associated with thyroid cancer and the comorbidities from PubMed abstracts and the DisGeNET expert-curated repositories. Results: Female patients had 4,485 and male patients 2,912 different comorbidities in early stages of thyroid cancer. Females had 3,587 and males 2,817 different comorbidities in advanced stages. Through PubMed utilizing the BeFree method, 504 different genes associated with thyroid cancer were discovered, as well as five genes on DisGeNET. The genes most often associated with thyroid cancer on PubMed were BRAF, RET, SLC5A5, RAS, and PTEN. Genes found via DisGeNET were BRAF, RET, KRAS, NRAS, and PRKAR1A. Conclusion: Identified genes and comorbidities, as potential additional risk factors for thyroid cancer not previously known, could improve the early diagnosis and the survival of patients with thyroid cancer. Genes discovered in this research in association with thyroid cancer could be used to direct decision making for optimal, more personalized treatment of thyroid cancer.
  • Protein-RNA interaction prediction with deep learning: Structure matters

    Wei, Junkang; Chen, Siyuan; Zong, Licheng; Gao, Xin; Li, Yu (arXiv, 2021-07-26) [Preprint]
    Protein-RNA interactions are of vital importance to a variety of cellular activities. Both experimental and computational techniques have been developed to study the interactions. Due to the limitation of the previous database, especially the lack of protein structure data, most of the existing computational methods rely heavily on the sequence data, with only a small portion of the methods utilizing the structural information. Recently, AlphaFold has revolutionized the entire protein and biology field. Foreseeably, the protein-RNA interaction prediction will also be promoted significantly in the upcoming years. In this work, we give a thorough review of this field, surveying both the binding site and binding preference prediction problems and covering the commonly used datasets, features, and models. We also point out the potential challenges and opportunities in this field. This survey summarizes the development of the RBP-RNA interaction field in the past and foresees its future development in the post-AlphaFold era.
  • Re-infection with different SARS-CoV-2 clade and prolonged viral shedding in a patient with hematopoietic stem cell transplantation: SARS-CoV-2 Re-infection with different clade.

    Alshukairi, Abeer N; El-Kafrawy, Sherif A; Dada, Ashraf; Yasir, Mohamed; Yamani, Amani H; Saeedi, Mohammed F; Aljohaney, Ahmed; AlJohani, Naif I; Bahaudden, Husam A; Alam, Intikhab; Gojobori, Takashi; Radovanovic, Aleksandar; Alandijany, Thamir A; Othman, Norah A; Alsubhi, Tagreed L; Hassan, Ahmed M; Tolah, Ahmed M; Al-Tawfiq, Jaffar A; Zumla, Alimuddin; Azhar, Esam I (International journal of infectious diseases : IJID : official publication of the International Society for Infectious Diseases, Elsevier BV, 2021-07-21) [Article]
    Immunocompromised patients who have SARS-CoV-2 infection pose many clinical and public health challenges. We describe a patient with hematopoietic stem cell transplantation and lymphoma with protracted illness requiring 3 consecutive hospital admissions. Whole genome sequencing confirmed two different SARS-CoV-2 clades. Clinical management issues, and unanswered questions arising are discussed.
