Recent Submissions

  • Soliton-based single-point pulse wave velocity model: A quantum mechanical approach

    Piliouras, Evangelos; Laleg-Kirati, Taous-Meriem (Biomedical Signal Processing and Control, Elsevier BV, 2021-10-20) [Article]
    Cardiovascular diseases (CVDs) are among the strongest contributors to mortality rates worldwide. To assess the severity of a clinical situation, various indices of CVD risk have been established, one of them being arterial stiffness, which quantifies arterial elasticity. Several methodologies exist to assess the level of arterial stiffness, and their non-invasiveness is a matter of great importance. The pulse wave velocity (PWV) is used as an indicator of arterial stiffness and satisfies the non-invasiveness requirement. Specifically, the carotid-femoral PWV-based method is considered one of the most trustworthy methodologies for quantifying arterial stiffness. This paper proposes a new model for the PWV along with insights on a real-scenario implementation. The model uses semi-classical signal analysis (SCSA) as the main signal processing framework to analyze the blood pressure waveform. The proposed model is intended as an add-on to existing methodologies, bringing the feature of single-point measurement once a calibration phase has been completed. The use of such a model can eliminate the pulse propagation time delay, one of the dominant sources of PWV error. Additionally, single-point measurement paves the way for prolonged PWV monitoring that can reveal new clinical features of the PWV. The model was validated both theoretically and on data, confirming its predicted hyperbolic PWV behavior with respect to the SCSA parameters.
  • NERO: a biomedical named-entity (recognition) ontology with a large, annotated corpus reveals meaningful associations through text embedding

    Wang, Kanix; Stevens, Robert; Alachram, Halima; Li, Yu; Soldatova, Larisa; King, Ross; Ananiadou, Sophia; Schoene, Annika M.; Li, Maolin; Christopoulou, Fenia; Ambite, José Luis; Matthew, Joel; Garg, Sahil; Hermjakob, Ulf; Marcu, Daniel; Sheng, Emily; Beißbarth, Tim; Wingender, Edgar; Galstyan, Aram; Gao, Xin; Chambers, Brendan; Pan, Weidi; Khomtchouk, Bohdan B.; Evans, James A.; Rzhetsky, Andrey (npj Systems Biology and Applications, Springer Science and Business Media LLC, 2021-10-20) [Article]
    Machine reading (MR) is essential for unlocking valuable knowledge contained in millions of existing biomedical documents. Over the last two decades [1,2], the most dramatic advances in MR have followed in the wake of critical corpus development [3]. Large, well-annotated corpora have been associated with punctuated advances in MR methodology and automated knowledge extraction systems in the same way that ImageNet [4] was fundamental for developing machine vision techniques. This study contributes six components to an advanced, named entity analysis tool for biomedicine: (a) a new, Named Entity Recognition Ontology (NERO) developed specifically for describing textual entities in biomedical texts, which accounts for diverse levels of ambiguity, bridging the scientific sublanguages of molecular biology, genetics, biochemistry, and medicine; (b) detailed guidelines for human experts annotating hundreds of named entity classes; (c) pictographs for all named entities, to simplify the burden of annotation for curators; (d) an original, annotated corpus comprising 35,865 sentences, which encapsulate 190,679 named entities and 43,438 events connecting two or more entities; (e) validated, off-the-shelf, named entity recognition (NER) automated extraction; and (f) embedding models that demonstrate the promise of biomedical associations embedded within this corpus.
  • LineFS

    Kim, Jongyul; Jang, Insu; Reda, Waleed; Im, Jaeseong; Canini, Marco; Kostić, Dejan; Kwon, Youngjin; Peter, Simon; Witchel, Emmett (ACM, 2021-10-19) [Conference Paper]
    In multi-tenant systems, the CPU overhead of distributed file systems (DFSes) is increasingly a burden to application performance. CPU and memory interference cause degraded and unstable application and storage performance, in particular for operation latency. Recent client-local DFSes for persistent memory (PM) accelerate this trend. DFS offload to SmartNICs is a promising solution to these problems, but it is challenging to fit the complex demands of a DFS onto simple SmartNIC processors located across PCIe. We present LineFS, a SmartNIC-offloaded, high-performance DFS with support for client-local PM. To fully leverage the SmartNIC architecture, we decompose DFS operations into execution stages that can be offloaded to a parallel datapath execution pipeline on the SmartNIC. LineFS offloads CPU-intensive DFS tasks, like replication, compression, data publication, index and consistency management to a SmartNIC. We implement LineFS on the Mellanox BlueField SmartNIC and compare it to Assise, a state-of-the-art PM DFS. LineFS improves latency in LevelDB up to 80% and throughput in Filebench up to 79%, while providing extended DFS availability during host system failures.
  • Adaptive Tikhonov strategies for stochastic ensemble Kalman inversion

    Weissmann, Simon; Chada, Neil Kumar; Schillings, Claudia; Tong, Xin T. (arXiv, 2021-10-18) [Preprint]
    Ensemble Kalman inversion (EKI) is a derivative-free optimizer aimed at solving inverse problems, taking motivation from the celebrated ensemble Kalman filter. The purpose of this article is to consider the introduction of adaptive Tikhonov strategies for EKI. This work builds upon Tikhonov EKI (TEKI), which was proposed for a fixed regularization constant. By adaptively learning the regularization parameter, this procedure is known to improve the recovery of the underlying unknown. For the analysis, we consider a continuous-time setting where we extend known results such as well-posedness and convergence of various loss functions, but with the addition of noisy observations. Furthermore, we allow a time-varying noise and regularization covariance in our presented convergence result, which mimics adaptive regularization schemes. In turn, we present three adaptive regularization schemes, which are highlighted from both the deterministic and Bayesian approaches for inverse problems, which include bilevel optimization, the MAP formulation and covariance learning. We numerically test these schemes and the theory on linear and nonlinear partial differential equations, where they outperform the non-adaptive TEKI and EKI.
  • Terahertz emission mediated by ultrafast time-varying metasurfaces

    Tunesi, J.; Peters, L.; Gongora, J. S. Totero; Olivieri, L.; Fratalocchi, Andrea; Pasquazi, A.; Peccianti, M. (Physical Review Research, American Physical Society (APS), 2021-10-18) [Article]
    Systems with ultrafast time-varying dielectric properties represent an emerging physical framework. We demonstrate here the observation of subcycle dynamics interacting directly with an electromagnetic source comprised of morphologically constrained photoexcited carriers in a surface nanostructure. A transition to a metallic metasurface state occurs on time scales faster than the terahertz-field period, inducing large nonlinear ultrafast phase shifts in the terahertz emission and exposing an interesting physical setting.
  • Quasi-Newton methods for machine learning: forget the past, just sample

    Berahas, A. S.; Jahani, M.; Richtarik, Peter; Takáč, M. (Optimization Methods and Software, Informa UK Limited, 2021-10-15) [Article]
    We present two sampled quasi-Newton methods (sampled LBFGS and sampled LSR1) for solving empirical risk minimization problems that arise in machine learning. Contrary to the classical variants of these methods that sequentially build Hessian or inverse Hessian approximations as the optimization progresses, our proposed methods sample points randomly around the current iterate at every iteration to produce these approximations. As a result, the approximations constructed make use of more reliable (recent and local) information and do not depend on past iterate information that could be significantly stale. Our proposed algorithms are efficient in terms of accessed data points (epochs) and have enough concurrency to take advantage of parallel/distributed computing environments. We provide convergence guarantees for our proposed methods. Numerical tests on a toy classification problem as well as on popular benchmarking binary classification and neural network training tasks reveal that the methods outperform their classical variants.
  • Ensemble Kalman filtering with colored observation noise

    Raboudi, Naila Mohammed Fathi; Ait-El-Fquih, Boujemaa; Ombao, Hernando; Hoteit, Ibrahim (Quarterly Journal of the Royal Meteorological Society, Wiley, 2021-10-15) [Article]
    The Kalman filter (KF) is derived under the assumption of time-independent (white) observation noise. Although this assumption can be reasonable in many ocean and atmospheric applications, the recent increase in sensor coverage, such as the launching of new constellations of satellites with global spatio-temporal coverage, will provide a high density of oceanic and atmospheric observations that are expected to have time-dependent (colored) error statistics. In this situation, the KF update has been shown to generally provide overconfident probability estimates, which may degrade the filter performance. Different KF-based schemes accounting for time-correlated observation noise were proposed for small systems by modeling the colored noise as a first-order autoregressive model driven by white Gaussian noise. This work introduces new ensemble Kalman filters (EnKFs) that account for colored observational noise for efficient data assimilation into large-scale oceanic and atmospheric applications. More specifically, we follow the standard and the one-step-ahead smoothing formulations of the Bayesian filtering problem with colored observational noise, modeled as an autoregressive model, to derive two (deterministic) EnKFs. We demonstrate the relevance of the colored observational noise-aware EnKFs and analyze their performance through extensive numerical experiments conducted with the Lorenz-96 model.
  • Mind the Gap: Domain Gap Control for Single Shot Domain Adaptation for Generative Adversarial Networks

    Zhu, Peihao; Abdal, Rameen; Femiani, John; Wonka, Peter (arXiv, 2021-10-15) [Preprint]
    We present a new method for one-shot domain adaptation. The input to our method is a trained GAN that can produce images in domain A and a single reference image I_B from domain B. The proposed algorithm can translate any output of the trained GAN from domain A to domain B. There are two main advantages of our method compared to the current state of the art: First, our solution achieves higher visual quality, e.g. by noticeably reducing overfitting. Second, our solution allows for more degrees of freedom to control the domain gap, i.e. what aspects of image I_B are used to define the domain B. Technically, we realize the new method by building on a pre-trained StyleGAN generator as GAN and a pre-trained CLIP model for representing the domain gap. We propose several new regularizers for controlling the domain gap to optimize the weights of the pre-trained StyleGAN generator to output images in domain B instead of domain A. The regularizers prevent the optimization from taking on too many attributes of the single reference image. Our results show significant visual improvements over the state of the art as well as multiple applications that highlight improved control.
  • On Seven Fundamental Optimization Challenges in Machine Learning

    Mishchenko, Konstantin (2021-10-14) [Dissertation]
    Advisor: Richtarik, Peter
    Committee members: Carin, Lawrence; Ghanem, Bernard; Sra, Suvrit; Yin, Wotao
    Many recent successes of machine learning went hand in hand with advances in optimization. The exchange of ideas between these fields has worked both ways, with machine learning building on standard optimization procedures such as gradient descent, as well as with new directions in the optimization theory stemming from machine learning applications. In this thesis, we discuss new developments in optimization inspired by the needs and practice of machine learning, federated learning, and data science. In particular, we consider seven key challenges of mathematical optimization that are relevant to modern machine learning applications, and develop a solution to each. Our first contribution is the resolution of a key open problem in Federated Learning: we establish the first theoretical guarantees for the famous Local SGD algorithm in the crucially important heterogeneous data regime. As the second challenge, we close the gap between the upper and lower bounds for the theory of two incremental algorithms known as Random Reshuffling (RR) and Shuffle-Once that are widely used in practice, and in fact set as the default data selection strategies for SGD in modern machine learning software. Our third contribution can be seen as a combination of our new theory for proximal RR and Local SGD yielding a new algorithm, which we call FedRR. Unlike Local SGD, FedRR is the first local first-order method that can provably beat gradient descent in communication complexity in the heterogeneous data regime. The fourth challenge is related to the class of adaptive methods. In particular, we present the first parameter-free stepsize rule for gradient descent that provably works for any locally smooth convex objective. The fifth challenge we resolve in the affirmative is the development of an algorithm for distributed optimization with quantized updates that preserves global linear convergence of gradient descent. Finally, in our sixth and seventh challenges, we develop new VR mechanisms applicable to the non-smooth setting based on proximal operators and matrix splitting. In all cases, our theory is simpler, tighter and uses fewer assumptions than the prior literature. We accompany each chapter with numerical experiments to show the tightness of the proposed theoretical results.
  • Lethal variants in humans: lessons learned from a large molecular autopsy cohort.

    Shamseldin, Hanan E; AlAbdi, Lama; Maddirevula, Sateesh; Alsaif, Hessa S; AlZahrani, Fatema; Ewida, Nour; Hashem, Mais; Abdulwahab, Firdous; Abuyousef, Omar; Kuwahara, Hiroyuki; Gao, Xin; Molecular Autopsy Consortium; Alkuraya, Fowzan S (Genome medicine, Springer Science and Business Media LLC, 2021-10-14) [Article]
    Background: Molecular autopsy refers to DNA-based identification of the cause of death. Despite recent attempts to broaden its scope, the term remains typically reserved for sudden unexplained death in young adults. In this study, we aim to showcase the utility of molecular autopsy in defining lethal variants in humans. Methods: We describe our experience with a cohort of 481 cases in whom the cause of premature death was investigated using DNA from the index or relatives (molecular autopsy by proxy). The molecular autopsy tool was typically exome sequencing, although some cases were investigated using targeted approaches in the earlier stages of the study; these include positional mapping, targeted gene sequencing, chromosomal microarray, and gene panels. Results: The study includes 449 cases from consanguineous families and 141 lacked family history (simplex). The age range was embryos to 18 years. A likely causal variant (pathogenic/likely pathogenic) was identified in 63.8% (307/481), a much higher yield compared to the general diagnostic yield (43%) from the same population. The predominance of recessive lethal alleles allowed us to implement molecular autopsy by proxy in 55 couples, and the yield was similarly high (63.6%). We also note the occurrence of biallelic lethal forms of typically non-lethal dominant disorders, sometimes representing a novel bona fide biallelic recessive disease trait. Forty-six disease genes with no OMIM phenotype were identified in the course of this study. The presented data support the candidacy of two other previously reported novel disease genes (FAAH2 and MSN). The focus on lethal phenotypes revealed many examples of interesting phenotypic expansion as well as remarkable variability in clinical presentation. Furthermore, important insights into population genetics and variant interpretation are highlighted based on the results. Conclusions: Molecular autopsy, broadly defined, proved to be a helpful clinical approach that provides unique insights into lethal variants and the clinical annotation of the human genome.
  • Study on the Effect of Size on InGaN Red Micro-LEDs

    Horng, Ray-Hua; Ye, Chun-Xin; Chen, Po-Wei; Iida, Daisuke; Ohkawa, Kazuhiro; Wu, Yuh-Renn; Wuu, Dong-Sing (Research Square Platform LLC, 2021-10-13) [Preprint]
    In this research, five sizes (100×100, 75×75, 50×50, 25×25, 10×10 µm²) of InGaN red micro-light emitting diode (LED) dies are produced using laser-based direct writing and maskless technology. It is observed that with increasing injection current, the smaller the size of the micro-LED, the more obvious the blue shift of the emission wavelength. When the injection current is increased from 0.1 to 1 mA, the emission wavelength of the 10×10 µm² micro-LED is shifted from 617.15 to 576.87 nm. The obvious blue shift is attributed to the stress release and high current density injection. Moreover, the output power density is very similar for smaller chip micro-LEDs at the same injection current density. This behavior is different from AlGaInP micro-LEDs. The sidewall defect is more easily repaired by passivation, which is similar to the behavior of blue micro-LEDs. The results indicate that the red InGaN epilayer structure provides an opportunity to realize the full color LEDs fabricated by GaN-based LEDs.
  • Ego4D: Around the World in 3,000 Hours of Egocentric Video

    Grauman, Kristen; Westbury, Andrew; Byrne, Eugene; Chavis, Zachary; Furnari, Antonino; Girdhar, Rohit; Hamburger, Jackson; Jiang, Hao; Liu, Miao; Liu, Xingyu; Martin, Miguel; Nagarajan, Tushar; Radosavovic, Ilija; Ramakrishnan, Santhosh Kumar; Ryan, Fiona; Sharma, Jayant; Wray, Michael; Xu, Mengmeng; Xu, Eric Zhongcong; Zhao, Chen; Bansal, Siddhant; Batra, Dhruv; Cartillier, Vincent; Crane, Sean; Do, Tien; Doulaty, Morrie; Erapalli, Akshay; Feichtenhofer, Christoph; Fragomeni, Adriano; Fu, Qichen; Fuegen, Christian; Gebreselasie, Abrham; Gonzalez, Cristina; Hillis, James; Huang, Xuhua; Huang, Yifei; Jia, Wenqi; Khoo, Weslie; Kolar, Jachym; Kottur, Satwik; Kumar, Anurag; Landini, Federico; Li, Chao; Li, Yanghao; Li, Zhenqiang; Mangalam, Karttikeya; Modhugu, Raghava; Munro, Jonathan; Murrell, Tullie; Nishiyasu, Takumi; Price, Will; Puentes, Paola Ruiz; Ramazanova, Merey; Sari, Leda; Somasundaram, Kiran; Southerland, Audrey; Sugano, Yusuke; Tao, Ruijie; Vo, Minh; Wang, Yuchen; Wu, Xindi; Yagi, Takuma; Zhu, Yunyi; Arbelaez, Pablo; Crandall, David; Damen, Dima; Farinella, Giovanni Maria; Ghanem, Bernard; Ithapu, Vamsi Krishna; Jawahar, C. V.; Joo, Hanbyul; Kitani, Kris; Li, Haizhou; Newcombe, Richard; Oliva, Aude; Park, Hyun Soo; Rehg, James M.; Sato, Yoichi; Shi, Jianbo; Shou, Mike Zheng; Torralba, Antonio; Torresani, Lorenzo; Yan, Mingfei; Malik, Jitendra (arXiv, 2021-10-13) [Preprint]
    We introduce Ego4D, a massive-scale egocentric video dataset and benchmark suite. It offers 3,025 hours of daily-life activity video spanning hundreds of scenarios (household, outdoor, workplace, leisure, etc.) captured by 855 unique camera wearers from 74 worldwide locations and 9 different countries. The approach to collection is designed to uphold rigorous privacy and ethics standards with consenting participants and robust de-identification procedures where relevant. Ego4D dramatically expands the volume of diverse egocentric video footage publicly available to the research community. Portions of the video are accompanied by audio, 3D meshes of the environment, eye gaze, stereo, and/or synchronized videos from multiple egocentric cameras at the same event. Furthermore, we present a host of new benchmark challenges centered around understanding the first-person visual experience in the past (querying an episodic memory), present (analyzing hand-object manipulation, audio-visual conversation, and social interactions), and future (forecasting activities). By publicly sharing this massive annotated dataset and benchmark suite, we aim to push the frontier of first-person perception. Project page:
  • Relation-aware Video Reading Comprehension for Temporal Language Grounding

    Gao, Jialin; Sun, Xin; Xu, Mengmeng; Zhou, Xi; Ghanem, Bernard (arXiv, 2021-10-12) [Preprint]
    Temporal language grounding in videos aims to localize the temporal span relevant to the given query sentence. Previous methods treat it either as a boundary regression task or a span extraction task. This paper will formulate temporal language grounding into video reading comprehension and propose a Relation-aware Network (RaNet) to address it. This framework aims to select a video moment choice from the predefined answer set with the aid of coarse-and-fine choice-query interaction and choice-choice relation construction. A choice-query interactor is proposed to match the visual and textual information simultaneously in sentence-moment and token-moment levels, leading to a coarse-and-fine cross-modal interaction. Moreover, a novel multi-choice relation constructor is introduced by leveraging graph convolution to capture the dependencies among video moment choices for the best choice selection. Extensive experiments on ActivityNet-Captions, TACoS, and Charades-STA demonstrate the effectiveness of our solution. Codes will be released soon.
  • Weak-strong uniqueness for Maxwell-Stefan systems

    Huo, Xiaokai; Jüngel, Ansgar; Tzavaras, Athanasios (arXiv, 2021-10-11) [Preprint]
    The weak-strong uniqueness for Maxwell-Stefan systems and some generalized systems is proved. The corresponding parabolic cross-diffusion equations are considered in a bounded domain with no-flux boundary conditions. The key points of the proofs are various inequalities for the relative entropy associated to the systems and the analysis of the spectrum of a quadratic form capturing the frictional dissipation. The latter task is complicated by the singular nature of the diffusion matrix. This difficulty is addressed by proving its positive definiteness on a subspace and using the Bott-Duffin matrix inverse. The generalized Maxwell-Stefan systems are shown to cover several known cross-diffusion systems for the description of tumor growth and physical vapor deposition processes.
  • Efficient importance sampling for large sums of independent and identically distributed random variables

    Ben Rached, Nadhir; Haji-Ali, Abdul Lateef; Rubino, Gerardo; Tempone, Raul (Statistics and Computing, Springer Science and Business Media LLC, 2021-10-11) [Article]
    We discuss estimating the probability that the sum of nonnegative independent and identically distributed random variables falls below a given threshold, i.e., $P\left(\sum_{i=1}^{N} X_i \le \gamma\right)$, via importance sampling (IS). We are particularly interested in the rare event regime when $N$ is large and/or $\gamma$ is small. The exponential twisting is a popular technique for similar problems that, in most cases, compares favorably to other estimators. However, it has some limitations: (i) It assumes the knowledge of the moment-generating function of $X_i$ and (ii) sampling under the new IS PDF is not straightforward and might be expensive. The aim of this work is to propose an alternative IS PDF that approximately yields, for certain classes of distributions and in the rare event regime, at least the same performance as the exponential twisting technique and, at the same time, does not introduce serious limitations. The first class includes distributions whose probability density functions (PDFs) are asymptotically equivalent, as $x \to 0$, to $b x^{p}$, for $p > -1$ and $b > 0$. For this class of distributions, the Gamma IS PDF with appropriately chosen parameters retrieves approximately, in the rare event regime corresponding to small values of $\gamma$ and/or large values of $N$, the same performance as the estimator based on the use of the exponential twisting technique. In the second class, we consider the Log-normal setting, whose PDF at zero vanishes faster than any polynomial, and we show numerically that a Gamma IS PDF with optimized parameters clearly outperforms the exponential twisting IS PDF. Numerical experiments validate the efficiency of the proposed estimator in delivering a highly accurate estimate in the regime of large $N$ and/or small $\gamma$.
  • Graph Models for Biological Pathway Visualization: State of the Art and Future Challenges

    Wu, Hsiang-Yun; Nöllenburg, Martin; Viola, Ivan (arXiv, 2021-10-10) [Preprint]
    The concept of multilayer networks has recently been integrated into complex systems modeling since it encapsulates a very general concept of complex relationships. Biological pathways are an example of complex real-world networks, where vertices represent biological entities, and edges indicate the underlying connectivity. For this reason, using multilayer networks to model biological knowledge allows us to formally cover essential properties and theories in the field, which also raises challenges in visualization. This is because, in the early days of pathway visualization research, only restricted types of graphs, such as simple graphs, clustered graphs, and others were adopted. In this paper, we revisit a heterogeneous definition of biological networks and aim to provide an overview to see the gaps between data modeling and visual representation. The contribution will, therefore, lie in providing guidelines and challenges of using multilayer networks as a unified data structure for biological pathway visualization.
  • Run Time Assurance for Safety-Critical Systems: An Introduction to Safety Filtering Approaches for Complex Control Systems

    Hobbs, Kerianne; Mote, Mark; Abate, Matthew; Coogan, Samuel; Feron, Eric (arXiv, 2021-10-07) [Preprint]
    Run Time Assurance (RTA) Systems are online verification mechanisms that filter an unverified primary controller output to ensure system safety. The primary control may come from a human operator, an advanced control approach, or an autonomous control approach that cannot be verified to the same level as simpler control systems designs. The critical feature of RTA systems is their ability to alter unsafe control inputs explicitly to assure safety. In many cases, RTA systems can functionally be described as containing a monitor that watches the state of the system and output of a primary controller, and a backup controller that replaces or modifies control input when necessary to assure safety. An important quality of an RTA system is that the assurance mechanism is constructed in a way that is entirely agnostic to the underlying structure of the primary controller. By effectively decoupling the enforcement of safety constraints from performance-related objectives, RTA offers a number of useful advantages over traditional (offline) verification. This article provides a tutorial on developing RTA systems.
  • BlockPolish: accurate polishing of long-read assembly via block divide-and-conquer.

    Huang, Neng; Nie, Fan; Ni, Peng; Gao, Xin; Luo, Feng; Wang, Jianxin (Briefings in bioinformatics, Oxford University Press (OUP), 2021-10-07) [Article]
    Long-read sequencing technology enables significant progress in de novo genome assembly. However, the high error rate and the wide error distribution of raw reads result in a large number of errors in the assembly. Polishing is a procedure to fix errors in the draft assembly and improve the reliability of genomic analysis. However, existing methods treat all the regions of the assembly equally while there are fundamental differences between the error distributions of these regions. How to achieve very high accuracy in genome assembly is still a challenging problem. Motivated by the uneven errors in different regions of the assembly, we propose a novel polishing workflow named BlockPolish. In this method, we divide contigs into blocks with low complexity and high complexity according to statistics of aligned nucleotide bases. Multiple sequence alignment is applied to realign raw reads in complex blocks and optimize the alignment result. Due to the different distributions of error rates in trivial and complex blocks, two multitask bidirectional long short-term memory (LSTM) networks are proposed to predict the consensus sequences. In the whole-genome assemblies of NA12878 assembled by Wtdbg2 and Flye using Nanopore data, BlockPolish has a higher polishing accuracy than other state-of-the-art methods including Racon, Medaka and MarginPolish & HELEN. In all assemblies, errors are predominantly indels and BlockPolish performs well in correcting them. In addition to the Nanopore assemblies, we further demonstrate that BlockPolish can also reduce the errors in the PacBio assemblies. The source code of BlockPolish is freely available on GitHub (
  • Colorful surfaces for radiative cooling

    Zhou, Lyu; Rada, Jacob; Song, Haomin; Ooi, Boon S.; Yu, Zongfu; Gan, Qiaoqiang (Journal of Photonics for Energy, SPIE-Intl Soc Optical Eng, 2021-10-07) [Article]
    Daytime radiative cooling has attracted extensive research interest due to its potential impact for energy sustainability. To achieve subambient radiative cooling during the daytime, a white surface that strongly scatters incident solar light is normally desired. However, in many practical applications (e.g., roofing materials and car coatings), colored surfaces are more popular. Because of this, there is a strong desire to develop colorful surfaces for radiative cooling. We summarize the general design criteria of radiative cooling materials with different colors and discuss the limitations in cooling performance. Major efforts on this specific topic are reviewed with some suggested topics for future investigation.
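
A minimal sketch of the idea behind the importance-sampling entry listed above (Ben Rached et al.), using only the Python standard library. It estimates P(X_1 + ... + X_N <= gamma) for i.i.d. Exp(1) variables by sampling each term from a Gamma proposal and reweighting. The Exp(1) target, the proposal shape of 1, the proposal scale gamma/N, and all function names are illustrative assumptions made here, not the parameter choices or code of the paper.

```python
import math
import random


def exact_probability(n_terms: int, gamma: float) -> float:
    """Exact P(S <= gamma) for S a sum of n_terms i.i.d. Exp(1) variables.

    S is Erlang(n_terms, 1), so the CDF equals the Poisson tail
    exp(-gamma) * sum_{k >= n_terms} gamma^k / k!  (truncated after 60 terms,
    which is ample for the small gamma values assumed in this sketch).
    """
    tail = sum(gamma ** k / math.factorial(k)
               for k in range(n_terms, n_terms + 60))
    return math.exp(-gamma) * tail


def importance_sampling_estimate(n_terms: int, gamma: float,
                                 n_samples: int, seed: int = 0) -> float:
    """Importance-sampling estimate of P(S <= gamma) with a Gamma proposal.

    Assumed proposal: each X_i ~ Gamma(shape=1, scale=gamma / n_terms), so the
    proposal mean of the sum equals gamma and the event is no longer rare.
    """
    rng = random.Random(seed)
    scale = gamma / n_terms
    # Likelihood ratio per term: exp(-x) / ((1/scale) * exp(-x/scale)).
    # The product over all terms depends on the samples only through their sum.
    log_const = n_terms * math.log(scale)
    rate_shift = 1.0 - 1.0 / scale
    total = 0.0
    for _ in range(n_samples):
        s = sum(rng.gammavariate(1.0, scale) for _ in range(n_terms))
        if s <= gamma:
            total += math.exp(log_const - rate_shift * s)
    return total / n_samples


if __name__ == "__main__":
    n_terms, gamma = 10, 1.0
    print("exact:", exact_probability(n_terms, gamma))
    print("IS   :", importance_sampling_estimate(n_terms, gamma, 20000, seed=1))
```

With these settings the target probability is on the order of 1e-7, so a plain Monte Carlo run of the same size would almost surely return zero, whereas the reweighted estimator has a bounded likelihood ratio on the event and can be compared against the closed-form value.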
