dc.contributor.author: Lü, Yongchun
dc.contributor.author: Zeng, Xiangrui
dc.contributor.author: Tian, Xinhui
dc.contributor.author: Shi, Xiao
dc.contributor.author: Wang, Hui
dc.contributor.author: Zheng, Xiaohui
dc.contributor.author: Liu, Xiaodong
dc.contributor.author: Zhao, Xiaofang
dc.contributor.author: Gao, Xin
dc.contributor.author: Xu, Min
dc.date.accessioned: 2020-09-28T12:23:26Z
dc.date.available: 2020-09-28T12:23:26Z
dc.date.issued: 2020-09-17
dc.identifier.citation: Lü, Y., Zeng, X., Tian, X., Shi, X., Wang, H., Zheng, X., … Xu, M. (2020). Spark-based parallel calculation of 3D fourier shell correlation for macromolecule structure local resolution estimation. BMC Bioinformatics, 21(S13). doi:10.1186/s12859-020-03680-6
dc.identifier.issn: 1471-2105
dc.identifier.pmid: 32938398
dc.identifier.doi: 10.1186/s12859-020-03680-6
dc.identifier.uri: http://hdl.handle.net/10754/665343
dc.description.abstract: Background: Resolution estimation is the main evaluation criterion for reconstructions of macromolecular 3D structures in cryo-electron microscopy (cryo-EM). Many methods exist for evaluating the 3D resolution of macromolecular structures reconstructed by single particle analysis (SPA) in cryo-EM and by subtomogram averaging (SA) in electron cryotomography (cryo-ET). As global methods, they measure the resolution of the structure as a whole, but they are inaccurate at detecting subtle local changes in a reconstruction. To detect such changes in SPA and SA reconstructions, a few local resolution methods have been proposed. The mainstream local resolution evaluation methods are based on local Fourier shell correlation (FSC), which is computationally intensive. However, the existing implementations are multi-threaded programs that run on a single computer and scale very poorly. Results: This paper proposes a new fine-grained 3D array partition method that uses a key-value (K-V) format in Spark. The method first converts 3D images to K-V data, which are then used for 3D array partitioning and data exchange in parallel. The Spark-based distributed parallel computing framework thereby addresses the scalability problem: all 3D local FSC tasks are calculated simultaneously across multiple nodes in a computer cluster. On experimental data, the 3D local resolution evaluation algorithm based on Spark fine-grained 3D array partitioning achieves an order-of-magnitude improvement in computing speed over the mainstream FSC algorithm with unchanged accuracy, and has better fault tolerance and scalability. Conclusions: We propose a K-V format based fine-grained 3D array partition method in Spark for calculating 3D FSC in parallel to obtain a 3D local resolution density map. The 3D local resolution density map evaluates the three-dimensional density maps reconstructed from single particle analysis and subtomogram averaging. The proposed method can significantly increase the speed of 3D local resolution evaluation, which is important for the efficient detection of subtle variations among reconstructed macromolecular structures.
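The key-value partitioning idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it uses plain Python instead of Spark, non-overlapping cubic blocks instead of the windows a local FSC calculation would need, and the names volume_to_kv and partition_by_block are hypothetical. In Spark, the grouping step would presumably be a keyed shuffle over the same (key, value) pairs.

```python
from collections import defaultdict


def volume_to_kv(volume):
    """Flatten a 3D nested list into ((z, y, x), value) pairs."""
    return [((z, y, x), value)
            for z, plane in enumerate(volume)
            for y, row in enumerate(plane)
            for x, value in enumerate(row)]


def partition_by_block(kv_pairs, block):
    """Group voxels by the index of the cubic block that contains them.

    Assumption: non-overlapping blocks of edge length `block`. Each block
    maps local (z, y, x) offsets to voxel values, so blocks can be
    processed independently of one another.
    """
    blocks = defaultdict(dict)
    for (z, y, x), value in kv_pairs:
        key = (z // block, y // block, x // block)
        blocks[key][(z % block, y % block, x % block)] = value
    return dict(blocks)


# Toy 4x4x4 volume whose voxel value encodes its own position.
volume = [[[100 * z + 10 * y + x for x in range(4)]
           for y in range(4)]
          for z in range(4)]

kv = volume_to_kv(volume)
blocks = partition_by_block(kv, block=2)
```

Because every voxel carries its own coordinates as a key, any block can be rebuilt on any worker without reference to the original array layout, which is what makes this representation convenient for distributing work.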
dc.description.sponsorship: We thank Dr. Fei Sun for fruitful discussions and suggestions. We thank Dr. Haiyang Li and Mr. Shuguang Zhao for technical assistance. We thank Wenlin Liu and Chengrui Wang for code discussions. We thank Dr. Yan Jin and Alex Singh for revising the English writing.
dc.publisher: Springer Nature
dc.relation.url: https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-020-03680-6
dc.rights: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/
dc.title: Spark-based parallel calculation of 3D fourier shell correlation for macromolecule structure local resolution estimation
dc.type: Article
dc.contributor.department: Computational Bioscience Research Center (CBRC)
dc.contributor.department: Computer Science Program
dc.contributor.department: Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
dc.contributor.department: Structural and Functional Bioinformatics Group
dc.identifier.journal: BMC Bioinformatics
dc.eprint.version: Publisher's Version/PDF
dc.contributor.institution: Institute of Computing Technology of the Chinese Academy of Sciences, Beijing, China
dc.contributor.institution: University of Chinese Academy of Sciences, Beijing, China
dc.contributor.institution: Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, United States
dc.identifier.volume: 21
dc.identifier.issue: S13
kaust.person: Gao, Xin
dc.identifier.eid: 2-s2.0-85091127533
refterms.dateFOA: 2020-09-28T12:24:31Z
dc.date.published-online: 2020-09-17
dc.date.published-print: 2020-09


Files in this item

Name: spark based.pdf
Size: 3.480Mb
Format: PDF
Description: Published version
