dc.contributor.author: Slater, Luke T
dc.contributor.author: Bradlow, William
dc.contributor.author: Ball, Simon
dc.contributor.author: Hoehndorf, Robert
dc.contributor.author: Gkoutos, Georgios V
dc.date.accessioned: 2021-04-15T06:46:17Z
dc.date.available: 2020-07-27T11:51:30Z
dc.date.available: 2021-04-15T06:46:17Z
dc.date.issued: 2021-04-13
dc.date.submitted: 2020-07-17
dc.identifier.citation: Slater, L. T., Bradlow, W., Ball, S., Hoehndorf, R., & Gkoutos, G. V. (2021). Improved characterisation of clinical text through ontology-based vocabulary expansion. Journal of Biomedical Semantics, 12(1). doi:10.1186/s13326-021-00241-5
dc.identifier.issn: 2041-1480
dc.identifier.pmid: 33845909
dc.identifier.doi: 10.1186/s13326-021-00241-5
dc.identifier.uri: http://hdl.handle.net/10754/664429
dc.description.abstract: Background: Biomedical ontologies contain a wealth of metadata that constitutes a fundamental infrastructural resource for text mining. For several reasons, redundancies exist in the ontology ecosystem, which lead to the same entities being described by several concepts in the same or similar contexts across several ontologies. While these concepts describe the same entities, they contain different sets of complementary metadata. Linking these definitions to make use of their combined metadata could lead to improved performance in ontology-based information retrieval, extraction, and analysis tasks. Results: We develop and present an algorithm that expands the set of labels associated with an ontology class using a combination of strict lexical matching and cross-ontology reasoner-enabled equivalency queries. Across all disease terms in the Disease Ontology, the approach found 51,362 additional labels, more than tripling the number defined by the ontology itself. Manual validation by a clinical expert on a random sampling of expanded synonyms over the Human Phenotype Ontology yielded a precision of 0.912. Furthermore, we found that annotating patient visits in MIMIC-III with an extended set of Disease Ontology labels led to semantic similarity score derived from those labels being a significantly better predictor of matching first diagnosis, with a mean average precision of 0.88 for the unexpanded set of annotations, and 0.913 for the expanded set. Conclusions: Inter-ontology synonym expansion can lead to a vast increase in the scale of vocabulary available for text mining applications. While the accuracy of the extended vocabulary is not perfect, it nevertheless led to a significantly improved ontology-based characterisation of patients from text in one setting. Furthermore, where run-on error is not acceptable, the technique can be used to provide candidate synonyms which can be checked by a domain expert.
dc.description.sponsorship: The authors would like to acknowledge Dr Andreas Karwath for advice on evaluating ranking algorithms. We would further like to thank Dr Paul Schofield and Dr Egon Willighagen for advice concerning an earlier version of the experiment, particularly surrounding precision and error. We would also like to thank Syed Ali Raza for work on the AberOWL platform, and the creators of MIMIC-III for making their data available for public use.
dc.description.sponsorship: GVG and LTS acknowledge support from the NIHR Birmingham ECMC, the NIHR Birmingham SRMRC, Nanocommons H2020-EU (731032), OpenRisknet H2020-EINFRA (731075) and the NIHR Birmingham Biomedical Research Centre and the MRC HDR UK (HDRUK/CFC/01), an initiative funded by UK Research and Innovation, Department of Health and Social Care (England) and the devolved administrations, and leading medical research charities. The views expressed in this publication are those of the authors and not necessarily those of the NHS, the National Institute for Health Research, the Medical Research Council or the Department of Health. RH and GVG were supported by funding from King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research (OSR) under Award No. URF/1/3790-01-01.
dc.publisher: Springer Nature
dc.relation.url: https://jbiomedsem.biomedcentral.com/articles/10.1186/s13326-021-00241-5
dc.rights: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/
dc.title: Improved characterisation of clinical text through ontology-based vocabulary expansion.
dc.type: Article
dc.contributor.department: Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
dc.contributor.department: Computer Science Program
dc.contributor.department: Computational Bioscience Research Center (CBRC)
dc.identifier.journal: Journal of biomedical semantics
dc.eprint.version: Publisher's Version/PDF
dc.contributor.institution: Institute of Cancer and Genomic Sciences, College of Medical and Dental Sciences, University of Birmingham.
dc.contributor.institution: Institute of Translational Medicine, University Hospitals Birmingham, NHS Foundation Trust.
dc.contributor.institution: NIHR Experimental Cancer Medicine Centre.
dc.contributor.institution: NIHR Surgical Reconstruction and Microbiology Research Centre.
dc.contributor.institution: NIHR Biomedical Research Centre.
dc.contributor.institution: MRC Health Data Research UK (HDR UK) Midlands.
dc.identifier.volume: 12
dc.identifier.issue: 1
kaust.person: Hoehndorf, Robert
kaust.grant.number: URF/1/3790-01-01
dc.date.accepted: 2021-03-18
refterms.dateFOA: 2020-07-27T11:52:02Z
kaust.acknowledged.supportUnit: OSR


Files in this item

Name: Improved characterization.pdf
Size: 580.0Kb
Format: PDF
Description: Publisher's version
