Control of the CDC48A segregase by the plant UBX-containing (PUX) protein family(2021-05) [Thesis]
Advisor: Arold, Stefan T.
Committee members: Blilou, Ikram; Jaremko, Lukasz
In plants, the AAA-adenosine triphosphatase (ATPase) Cell Division Control Protein 48 (CDC48) uses the force generated through ATP hydrolysis to pull, extract, and unfold ubiquitylated or sumoylated proteins from the membrane, chromatin, or protein complexes. The resulting changes in protein or RNA content are an important means for plants to control protein homeostasis and thereby adapt to shifting environmental conditions. The activity and targeting of CDC48 are controlled by adaptor proteins, of which the plant ubiquitin regulatory X (UBX) domain-containing (PUX) proteins constitute the largest and most versatile family. However, few PUX proteins have been structurally or functionally characterized, and how they participate in the substrate processing of CDC48A is not fully understood. Here, we first performed a comparative bioinformatic analysis, in which we found that the PUX proteins can be functionally divided into six types. We used this classification as a guide for our experimental efforts to elucidate how PUX proteins mediate client recognition and delivery for CDC48A-mediated unfolding. As a first step in this experimental analysis, we cloned and expressed a number of PUX protein constructs, assessed their interaction features, and obtained crystals for several PUX domains. These bioinformatic and experimental results provide a basis for the in-depth structural and functional analysis of how PUX proteins control the CDC48A segregase.
Performance comparison of Hierarchical Non-Terrestrial Networks for 6G(2021-04) [Thesis]
Advisor: Alouini, Mohamed-Slim
Committee members: Fariborzi, Hossein; Shihada, Basem; Park, Kihong
This thesis investigates the fundamental performance of Hierarchical Non-Terrestrial Networks for the 6th generation (6G). 6G communication research is currently focusing on non-terrestrial networks (NTNs) to promote ubiquitous and ultra-high-capacity global connectivity. Specifically, multi-layered hierarchical networks, i.e., the orchestration among different aerial/space platforms, including Unmanned Aerial Vehicles (UAVs), High Altitude Platforms (HAPs), and satellites co-operating at different altitudes, currently represent one of the most attractive technological options to solve coverage and latency constraints associated with the NTN paradigm. However, there are still several issues to be resolved for proper network design. In this thesis, we investigate the propagation model in air/space links, evaluate the performance of different multi-layered non-terrestrial configurations, and provide guidelines on the optimal working point(s) for which it is possible to achieve a good compromise between improved system flexibility and network performance, with respect to a baseline standalone deployment.
Frequency-modulation Stimulated Raman Scattering microscopy with an Acousto-Optic Tunable Filter(2021-04) [Thesis]
Advisor: Liberale, Carlo
Committee members: Habuchi, Satoshi; Hauser, Charlotte
Stimulated Raman Scattering (SRS) is a Coherent Raman microscopy method that has been increasingly employed in recent years for highly specific, label-free, and high-speed bioimaging. Compared to a similar Coherent Raman method, the so-called Coherent Anti-Stokes Raman Scattering (CARS) microscopy, it exhibits advantages such as the absence of nonresonant background (NRB) and the linear dependence of the signal intensity on the concentration of the molecules of interest. However, SRS can be affected by unwanted background signals that hinder the acquisition of accurate Raman information. These unwanted signals are generated by parasitic effects that are difficult to suppress in standard SRS setups. Here, I present a frequency-modulation (FM) SRS technique via an Acousto-Optic Tunable Filter (AOTF), describing its implementation on the Vibra Lab setup and assessing its efficiency with imaging results. The FM technique cancels the unwanted background signals while keeping the SRS information intact. It is based on the weak spectral dependence of the parasitic effects as compared to the high spectral specificity of the SRS signal. The proposed scheme presents a few advantages when compared with other solutions presented in the literature. In particular, it does not require a complex setup configuration, and it can be used seamlessly in a very broad range of the vibrational spectrum.
Role of CD26/DPPIV in the Homing and Engraftment of Long-Term CD34- Negative Hematopoietic Stem Cells(2021-04) [Thesis]
Advisor: Merzaban, Jasmeen
Committee members: Blilou, Ikram; Pain, Arnab
CD26/DPPIV is a dipeptidyl peptidase that cleaves and destroys a variety of substrates such as SDF-1α, a chemokine expressed along bone marrow endothelium, which is essential for the recruitment of hematopoietic stem cells (HSCs) to the bone marrow via binding with its receptor CXCR4. Thus, CD26 is thought to interfere with the second step, chemokine/chemokine receptor interactions, of the cellular migration paradigm. To further study the role of CD26 in the migration of HSCs, we screened several human leukemic cell lines to find a model cell line that expresses active CD26 and discovered that the pro-monocytic cell line U937 was optimal for this purpose. U937 cells were used to optimize a variety of assays, including a CD26 activity assay and a transwell migration assay with and without the use of a CD26 inhibitor, Diprotin A. Then, we isolated short-term and long-term HSCs from the bone marrow of C57BL/6N mice using a combination of surface markers and a fluorescence-activated cell sorter. The expression levels of the Step 2 homing molecules were measured by FACS in both fractions of HSCs. Interestingly, we detected differences in the expression of CD26 between these two populations that may help explain the inability of long-term HSCs to migrate to the bone marrow. Thus, through the use of a CD26 inhibitor, the migration of long-term HSCs to the bone marrow could be enhanced, leading to prolonged and efficient stem cell engraftment activity. Such studies could help develop protocols to improve stem cell engraftment for patients suffering from hematological diseases such as leukemia.
Kinetic modeling and packed bed membrane reactor scale-up for ammonia decomposition(2021-04) [Thesis]
Advisor: Castaño, Pedro
Committee members: Gascon, Jorge; Roberts, William L.
The hydrogen economy is driving the decarbonization of the transport and industrial sectors. Ammonia is an attractive intermediate to store and transport hydrogen, due to its low production cost, well-developed storage and transportation infrastructure, high hydrogen density in its liquefied form (for transportation), and its potential production from renewable energy sources. Although there have been significant advancements in catalyst development for ammonia decomposition, the potential of this technology cannot be fully exploited until significant process development is made. In this sense, catalytic membrane reactors show promising features and performance. In this work, ammonia decomposition has been studied using the following approach: (1) Catalytic Packed Bed Reactor (CPBR) and kinetic modeling, (2) Catalytic Packed Bed Membrane Reactor (CPBMR) modeling, and (3) CPBMR scale-up. Stage (1) was performed using Ru-K/CaO and Co-Ce catalysts over a wide range of experimental conditions (including pressures up to 16 bar). Stage (2) includes 1-D and 2-D models that were further validated experimentally. Different software was also used to tackle stage (3), which aims to give the optimized geometry and properties of a CPBMR for a production of 5 N m³ h⁻¹ of high-purity H2. The results presented in this thesis made it possible to: (1) obtain a reliable kinetic model capable of describing ammonia decomposition under a wide range of operating conditions, using Ru-K/CaO and Co-Ce catalysts; (2) identify a range of operating conditions where the CPBMR performs better than the CPBR in terms of NH3 conversion, H2 recovery, and H2 purity, namely reaction temperatures between 250 °C and 500 °C, reaction pressures between 1 and 16 bar, space times between 1 and 15 g_cat h mol⁻¹, and H2 permeate pressures higher than atmospheric pressure (up to 5 bar); and (3) scale up the CPBMR for ammonia decomposition at pilot scale, finding that a pilot plant for a production of 5 N m³ h⁻¹ of pure H2 (>99.99%) could be obtained with a relatively small multitubular arrangement that might be even smaller than that needed for the same production using other technologies.
High-throughput Assay for Quantifying Transgenerational Epigenetic Inheritance in C. elegans(2021-04) [Thesis]
Advisor: Frøkjær-Jensen, Christian
Committee members: Mahfouz, Magdy M.; Aranda, Manuel
This thesis describes my work to develop methods and assays to study transgenerational inheritance in the widely used genetic model organism Caenorhabditis elegans (C. elegans). In the first chapter, I describe a novel method that uses an exogenous histamine-selective chloride channel (HisCl1) for negative selection in transgenesis. C. elegans transgenesis is a core technique used by most laboratories and often requires distinguishing rare animals with a single-copy transgene inserted into the genome from more frequent animals that carry multiple copies of the transgene in extra-chromosomal arrays. I demonstrate that histamine selection induces rapid and irreversible paralysis only in array animals, thus allowing quick identification of the desired transgenic animals. In the second chapter, I develop a high-throughput assay for quantifying transgenerational epigenetic inheritance of endogenous gene silencing. Small RNA-mediated gene silencing leads to an increased incidence of males in the population, which can be inherited for four to six generations. I identify a fluorescent marker that specifically fluoresces in males and show that I can use a large-particle sorter to quantify the frequency of males in a population. This automated system will allow me to follow inheritance patterns over at least ten generations in various mutant backgrounds in parallel to determine the genetic basis and the rules of epigenetic inheritance.
Optimization of UV and bacteriophages as an alternative chemical-free approach for membrane cleaning(2021-03) [Thesis]
Advisor: Hong, Pei-Ying
Committee members: Nunes, Suzana Pereira; Daffonchio, Daniele
Anaerobic membrane bioreactors (AnMBR) have been established as an efficient method of wastewater treatment to obtain high-quality effluent with low energy consumption. However, membrane fouling leading to flux reduction and an increase in operational costs can negate potential benefits associated with AnMBR. Today’s conventional membrane cleaning process includes physical and chemical approaches, both of which have their own drawbacks. For this reason, the biological approach was proposed as an alternative to dangerous, energy-consuming, and environmentally unsafe treatment techniques. The combination of UV-C and bacteriophage offers an alternative chemical-free approach for biofouling control. This dissertation aims to test different orders of applying UV-C and bacteriophage to clean anaerobic membranes. This dissertation also demonstrates a proof-of-concept to achieve semi-online cleaning using UV-C and bacteriophage, thus increasing the feasibility of the described technology. As a result of this work, it was shown that preliminary UV exposure enhances bacteriophage propagation into thick biofilms, and that the bacteriophages are able to affect total cell number and extracellular polymeric substances (EPS) compared to the control. Compared to the control, the semi-online cleaning strategy also resulted in a membrane that took a longer time for the transmembrane pressure to increase in the next operation cycle after cleaning.
An Empirical Study of the Distributed Ellipsoidal Trust Region Method for Large Batch Training(2021-02-10) [Thesis]
Advisor: Keyes, David E.
Committee members: Wonka, Peter; Zhang, Xiangliang
Neural network optimizers are dominated by first-order methods, due to their inexpensive computational cost per iteration. However, it has been shown that first-order optimization is prone to reaching sharp minima when trained with large batch sizes. As the batch size increases, the statistical stability of the problem increases, a regime that is well suited for second-order optimization methods. In this thesis, we study a distributed ellipsoidal trust region model for neural networks. We use a block diagonal approximation of the Hessian, assigning consecutive layers of the network to each process. We solve in parallel for the update direction of each subset of the parameters. We show that our optimizer is fit for large batch training as well as an increasing number of processes.
Assembly of Two CCDD Rice Genomes, Oryza grandiglumis and Oryza latifolia, and the Study of Their Evolutionary Changes(2021-01) [Thesis]
Advisor: Wing, Rod Anthony
Committee members: Gojobori, Takashi; Zuccolo, Andrea
Every day more than half of the world consumes rice as a primary dietary resource. Thus, rice is one of the most important food crops in the world. Rice and its wild relatives are part of the genus Oryza. Studying the genome structure, function, and evolution of Oryza species in a comparative genomics framework is a useful approach to provide a wealth of knowledge that can significantly improve valuable agronomic traits. The Oryza genus includes 27 species, with 11 different genome types as identified by genetic and cytogenetic analyses. Six genome types, including that of domesticated rice (O. sativa and O. glaberrima), are diploid, and the remaining 5 are tetraploid. Three of the tetraploid species contain the CCDD genome type (O. grandiglumis, O. latifolia, and O. alta), which arose less than 2 million years ago. Polyploidization is one of the major contributors to evolutionary divergence and can thereby lead to adaptation to new environmental niches. An important first step in the characterization of the polyploid Oryza species is the generation of a high-quality reference genome sequence. Unfortunately, up until recently, the generation of such an important and fundamental resource from polyploid species has been challenging, primarily due to their genome complexity and repetitive sequence content. In this project, I assembled two high-quality genome assemblies for O. grandiglumis and O. latifolia using PacBio long-read sequencing technology and an assembly pipeline that employed 3 genome assemblers (i.e., Canu/2.0, Mecat2, and Flye/2.5) and multiple rounds of sequence polishing with both Arrow and Pilon/1.23. After the primary assembly, sequence contigs were arranged into pseudomolecules, and homoeologous chromosomes were assigned to their respective genome types (i.e., CC or DD). Finally, the assemblies were extensively edited manually to close as many gaps as possible.
Both assemblies were then analyzed for transposable element and structural variant content between species and homoeologous chromosomes. This enabled us to study the evolutionary divergence of those two genomes, and to explore the possibility of neo-domesticating either species in future research for my PhD dissertation.
Oxygen Modulation of thermal tolerance in the branching coral Stylophora pistillata(2021-01) [Thesis]
Advisor: Duarte, Carlos M.
Committee members: Aranda, Manuel; Daffonchio, Daniele
Coral reef ecosystems are under increasing threat from ocean warming and deoxygenation. Mass coral bleaching events in recent years have been linked to marine heatwaves, but reports of hypoxia-induced bleaching have also been increasing. Oxygen availability in coral reefs is driven by community metabolism, and reefs experience a dynamic range of oxygen concentrations throughout diel cycles, with hyperoxia during the day and hypoxia during the night. It has been suggested that the highest oxygen concentrations coincide with the hottest part of the day and that this may protect marine taxa from high temperatures. We evaluated experimentally whether excess oxygen availability would increase the thermal threshold of the branching coral Stylophora pistillata, from the Southern Red Sea. We did this by exposing coral fragments of this species to varying dissolved oxygen concentrations (hypoxia, normoxia and hyperoxia) and a short-term temperature ramping regime (1˚C h⁻¹). Hyperoxia did extend the thermal tolerance of S. pistillata fragments, with an LT50 of 39.1˚C as opposed to 39.0˚C for the normoxic treatment and 38.7˚C for the hypoxic treatment. Hyperoxia also increased respiration and gross photosynthesis and had a negative effect on photochemical efficiency at high temperatures. Net photosynthesis, P:R ratio and symbiont density were not significantly affected by oxygen concentration. Corals in this experiment displayed exceedingly high thermal thresholds, which were at least 2˚C higher than previously reported for the same species in the Central Red Sea. The corals used in the experiment had previously survived mass bleaching events in 2015 and hence we may have selected for individuals adapted to thermal stress.
This is the first study to investigate the role of oxygen in the thermal tolerance of hermatypic corals and the first assessment of thermal thresholds from corals in the Southern Red Sea, where previously thermal thresholds have been based on a 1-2˚C increase in maximum mean monthly temperatures and visual bleaching observations. This highlights the need for increased experimental assessments of thermal thresholds in the Southern regions of the Red Sea and the important role of oxygen in moderating thermal stress.
Modeling Human Learning in Games(2020-12) [Thesis]
Advisor: Shamma, Jeff S.
Committee members: Feron, Eric; Laleg-Kirati, Taous-Meriem
Human-robot interaction is an important and broad area of study. To achieve successful interaction, we have to study human decision-making rules. This work investigates human learning rules in games in the presence of intelligent decision makers. In particular, we analyze human behavior in a congestion game. The game models traffic in a simple scenario where multiple vehicles share two roads. The human player controls ten vehicles and decides how to distribute them across the two roads. There are one hundred simulated players, each controlling one vehicle. The game is repeated for many rounds, allowing the players to adapt and formulate a strategy, and after each round, the cost of the roads and visual assistance are shown to the human player. The goal of all players is to minimize the total congestion experienced by the vehicles they control. In order to demonstrate our results, we first built a human player simulator using the Fictitious Play and Regret Matching algorithms. Then, we showed the passivity property of these algorithms after adjusting the passivity condition to suit a discrete-time formulation. Next, we conducted the experiment online to allow players to participate. A similar analysis was done on the data collected, to study the passivity of the human decision-making rule. We observe different performances with different types of virtual players. However, in all cases, the human decision rule satisfied the passivity condition. This result implies that human behavior can be modeled as passive, and systems can be designed to use these results to influence human behavior and reach desirable outcomes.
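The regret-matching rule named in the abstract above can be illustrated on the two-road game it describes. This is only a minimal sketch, not the thesis's simulator. It assumes a linear road cost (each road costs its vehicle count) and a fixed split of the simulated players; the abstract specifies neither. The names `total_cost`, `regret_matching`, and `others_on_a` are hypothetical.

```python
import random

N_OWN = 10      # vehicles controlled by the modeled player (from the abstract)
N_OTHERS = 100  # simulated single-vehicle players (from the abstract)


def total_cost(k, others_on_a):
    """Total congestion felt by the player's vehicles when k of them use road A.

    Assumption: the cost of a road equals the number of vehicles on it.
    """
    load_a = others_on_a + k
    load_b = (N_OTHERS - others_on_a) + (N_OWN - k)
    return k * load_a + (N_OWN - k) * load_b


def regret_matching(rounds, others_on_a, seed=0):
    """Play the repeated game with regret matching; return play counts per action.

    Action k means "send k vehicles down road A". The simulated players are
    held at a fixed split (an assumption made to keep the sketch short).
    """
    rng = random.Random(seed)
    actions = list(range(N_OWN + 1))
    regrets = [0.0] * len(actions)
    counts = [0] * len(actions)
    for _ in range(rounds):
        positive = [max(r, 0.0) for r in regrets]
        if sum(positive) > 0:
            # Choose an action with probability proportional to positive regret.
            k = rng.choices(actions, weights=positive)[0]
        else:
            # No action is regretted yet: explore uniformly.
            k = rng.choice(actions)
        counts[k] += 1
        played = total_cost(k, others_on_a)
        # Regret of action a: how much cheaper a would have been this round.
        for a in actions:
            regrets[a] += played - total_cost(a, others_on_a)
    return counts


counts = regret_matching(rounds=2000, others_on_a=50)
most_played = counts.index(max(counts))
```

Under these assumptions, with the simulated players split evenly, the player's cost is 2k² − 20k + 600, which is lowest at k = 5, so play concentrates on the even split.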
On Using D2D Collaboration and a DF-CF Relaying Scheme to Mitigate Channel Interference(2020-12) [Thesis]
Advisor: Alouini, Mohamed-Slim
Committee members: Shihada, Basem; Park, Kihong
The number of devices connected to the network is increasing exponentially, which will lead to a larger number of installed cellular towers and base stations in closer proximity to one another than in the current cellular network setup. Together with the increasing demand for higher data rates by end users, this makes it essential to investigate new methods that more effectively mitigate the larger interference introduced by the denser cellular grid and that result in higher data rates. This paper investigates using Device-to-Device communication, where neighboring users can cooperate to mitigate the correlated interference they both receive, with one user acting as a relay and the other as the intended destination of a broadcast message sent by the source base station. The setup studied utilizes a non-orthogonal multiple access (NOMA) scheme and a combined decode-forward and compress-forward relaying scheme. We show that this combined scheme outperforms the individual schemes for some channels and network setups, or reduces to either scheme when the combination does not offer any achievable rate gains. The performance of each scheme is measured with respect to the locations of the base station and the two devices, and to the capacity of the digital link between the users.
Effect of Anthropogenic Noise on Reef Macroinvertebrates (Lambis lambis and Tridacna maxima)(2020-12) [Thesis]
Advisor: Duarte, Carlos M.
Committee members: Benzoni, Francesca; Daffonchio, Daniele
Boat noise has been shown to distract and cause harm to many marine organisms. Most of the study effort has been focused on fish and marine mammals, even though invertebrates represent 99% of all living organisms, both terrestrial and aquatic. The few studies conducted on invertebrates have demonstrated a clear negative correlation between anthropogenic noise pollution and invertebrate well-being. The small giant clam Tridacna maxima and the spider conch Lambis lambis are two invertebrate species that play key roles in coral reef ecosystems and are little studied for the effects of noise disturbance. If their behaviour is significantly affected by boat noise, this could cause a knock-on effect on the rest of the reef ecosystem. T. maxima functions as prey to many fish species, contributing up to 9% of the reef’s calcium carbonate budget, as well as playing a role in nutrient cycling. L. lambis is essential prey for molluscivore snails and certain elasmobranchs, and through its herbivorous feeding, helps to keep filamentous algae levels low. Moreover, coral reefs are reliant on the biological sound primarily created by invertebrates, such as snapping shrimp, to attract larvae from the pelagic environment for settlement. During experiments conducted in February 2020 in the Red Sea, Daily Diary Monitoring (DDMT) smart tags were used to measure the reactions of T. maxima and L. lambis during underwater playback of boat noise compared with ambient reef sound. It was found that both T. maxima and L. lambis exhibited behavioral changes during the boat noise treatment. The implication of this study is that individuals of L. lambis and T. maxima may spend energy averting the invisible "threat" of boat noise, rather than feeding and/or photosynthesizing, in the case of T. maxima.
If this hinders the acquisition of food/production, the whole reef ecosystem may be compromised in a knock-on effect from these producers and primary consumers. As boat noise is prevalent on inshore Red Sea reefs, shown by a 24-hour acoustic analysis of a patch reef sound pressure levels (SPL), invertebrates might be affected on a large scale in the Red Sea.
BICNet: A Bayesian Approach for Estimating Task Effects on Intrinsic Connectivity Networks in fMRI Data(2020-11-25) [Thesis]
Advisor: Ombao, Hernando
Committee members: Sun, Ying; Laleg-Kirati, Taous-Meriem; Ting, Chee-Ming
Intrinsic connectivity networks (ICNs) refer to brain functional networks that are consistently found under various conditions, during tasks or at rest. Some studies demonstrated that while some stimuli do not impact intrinsic connectivity, other stimuli actually activate intrinsic connectivity through suppression, excitation, moderation or modification. Most analyses of functional magnetic resonance imaging (fMRI) data use ad-hoc methods to estimate the latent structure of ICNs. Modeling the effects on ICNs has also not been fully investigated. We propose the Bayesian Intrinsic Connectivity Network (BICNet) model, an extended Bayesian dynamic sparse latent factor model, to identify the ICNs and quantify task-related effects on the ICNs. BICNet has the following advantages: (1) It simultaneously identifies the individual and group-level ICNs; (2) It robustly identifies ICNs by jointly modeling resting-state fMRI (rfMRI) and task-related fMRI (tfMRI); (3) Compared to independent component analysis (ICA)-based methods, it can quantify the differences in ICN amplitudes across different states; (4) The sparsity of ICNs automatically performs feature selection, instead of ad-hoc thresholding. We apply BICNet to the rfMRI and language tfMRI data from the Human Connectome Project (HCP) and identify several ICNs related to distinct language processing functions.
Deep GCNs with Random Partition and Generalized Aggregator(2020-11-25) [Thesis]
Advisor: Ghanem, Bernard
Committee members: Thabet, Ali Kassem; Zhang, Xiangliang
Graph Convolutional Networks (GCNs) have drawn significant attention due to their power of representation learning on graphs. Recent works developed frameworks to train deep GCNs. Such works show impressive results in tasks like point cloud classification and segmentation, and protein interaction prediction. For large-scale graphs, however, full-batch training of GCNs remains challenging, especially as GCNs go deeper. By fully analyzing the clustering-based mini-batch training algorithm ClusterGCN, we propose random partition, which is a more efficient and effective method to implement mini-batch training. In addition, selecting different permutation-invariant functions (such as max, mean or add) for aggregating neighbors’ information leads to very different results. Therefore, we propose to alleviate this sensitivity by introducing a novel Generalized Aggregation Function. In this thesis, I analyze the drawbacks of ClusterGCN and discuss its limits. I further compare the performance of ClusterGCN with random partition, and the final experimental results show that simple random partition outperforms ClusterGCN by a clear margin on the node property prediction task. Among the techniques commonly used to make GCNs go deeper, I demonstrate a better way of applying residual connections (pre-activation) to stack more layers for GCNs. Last, I show the complete work of training deeper GCNs with generalized aggregators and display the promising results over several datasets from the Open Graph Benchmark (OGB).
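The abstract above does not define its Generalized Aggregation Function, so the following is only one possible illustration of the idea. It is a softmax-weighted aggregator that moves between the mean and the max of the neighbor features as an inverse-temperature parameter changes. The function name `softmax_aggregate` and the parameter `beta` are assumptions made for this sketch and are not taken from the thesis.

```python
import math


def softmax_aggregate(values, beta):
    """Permutation-invariant aggregation of scalar neighbor features.

    Each value is weighted by softmax(beta * value). With beta = 0 every
    weight is equal and the result is the mean; as beta grows the largest
    value dominates and the result approaches the max.
    """
    if not values:
        raise ValueError("need at least one neighbor value")
    scores = [beta * v for v in values]
    shift = max(scores)  # subtract the largest score for numerical stability
    weights = [math.exp(s - shift) for s in scores]
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total


neighbors = [1.0, 2.0, 3.0]
as_mean = softmax_aggregate(neighbors, beta=0.0)   # behaves like mean
near_max = softmax_aggregate(neighbors, beta=50.0)  # behaves like max
```

In a learned setting, `beta` could be treated as a trainable parameter so that the network chooses its own point between mean-like and max-like aggregation; whether the thesis does this is not stated in the abstract.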
Imitation Learning based on Generative Adversarial Networks for Robot Path Planning(2020-11-24) [Thesis]
Advisor: Michels, Dominik L.
Committee members: Wonka, Peter; Moshkov, Mikhail
Robot path planning and dynamic obstacle avoidance are defined as the problem in which a robot plans a feasible path from a given starting point to a destination point in a nonlinear dynamic environment and safely bypasses dynamic obstacles on the way to the destination with minimal deviation from the trajectory. Path planning is a typical sequential decision-making problem. A dynamic, locally observable environment requires real-time and adaptive decision-making systems. It is an innovation for the robot to learn the policy directly from demonstration trajectories so that it can adapt to similar state spaces that may appear in the future. We aim to develop a method for directly learning navigation behavior from demonstration trajectories without defining the environment and attention models, by using the concepts of Generative Adversarial Imitation Learning (GAIL) and Sequence Generative Adversarial Network (SeqGAN). The SeqGAIL model proposed in this thesis allows the robot to reproduce the desired behavior in different situations. In this model, an adversarial net is established, and the reduction of Feature Counts Errors is used as the forcing objective for the Generator. A refinement measure is taken to solve the instability problem. In addition, we propose to use the Rapidly-exploring Random Tree* (RRT*) with pre-trained weights to generate adequate demonstration trajectories in dynamic environments as the training data, an idea that can effectively overcome the difficulty of acquiring large amounts of training data.
Leveraging Graph Convolutional Networks for Point Cloud Upsampling(2020-11-16) [Thesis]
Advisor: Ghanem, Bernard
Committee members: Wonka, Peter; Pottmann, Helmut
Due to hardware limitations, 3D sensors like LiDAR often produce sparse and noisy point clouds. Point cloud upsampling is the task of converting such point clouds into dense and clean ones. This thesis tackles the problem of point cloud upsampling using deep neural networks. The effectiveness of a point cloud upsampling neural network heavily relies on the upsampling module and the feature extractor used therein. In this thesis, I propose a novel point upsampling module, called NodeShuffle. NodeShuffle leverages Graph Convolutional Networks (GCNs) to better encode local point information from point neighborhoods. NodeShuffle is versatile and can be incorporated into any point cloud upsampling pipeline. Extensive experiments show how NodeShuffle consistently improves the performance of previous upsampling methods. I also propose a new GCN-based multi-scale feature extractor, called Inception DenseGCN. By aggregating features at multiple scales, Inception DenseGCN learns a hierarchical feature representation and enables further performance gains. I combine Inception DenseGCN with NodeShuffle into the proposed point cloud upsampling network called PU-GCN. PU-GCN sets new state-of-the-art performance with far fewer parameters and more efficient inference.
Prediction of Novel Virus–Host Protein Protein Interactions From Sequences and Infectious Disease Phenotypes(2020-11-11) [Thesis]
Advisor: Tegner, Jesper
Committee members: Hoehndorf, Robert; Ombao, Hernando
Infectious diseases from novel viruses have become a major public health concern. Rapid identification of virus–host interactions can reveal mechanistic insights into infectious diseases and shed light on potential treatments. Current computational prediction methods for novel viruses are based mainly on protein sequences. However, it is not clear to what extent other important features, such as the symptoms caused by the viruses, could contribute to a predictor. Disease phenotypes (i.e., signs and symptoms) are readily accessible from clinical diagnosis and we hypothesize that they may act as a potential proxy and an additional source of information for the underlying molecular interactions between the pathogens and hosts. We developed DeepViral, a deep learning based method that predicts protein–protein interactions (PPI) between humans and viruses. Motivated by the potential utility of infectious disease phenotypes, we first embedded human proteins and viruses in a shared space using their associated phenotypes and functions, supported by formalized background knowledge from biomedical ontologies. By jointly learning from protein sequences and phenotype features, DeepViral significantly improves over existing sequence-based methods for intra- and inter-species PPI prediction. Lastly, we propose a novel experimental setup to realistically evaluate prediction methods for novel viruses.
Spatio-Temporal Prediction and Stochastic Simulation for Large-Scale Nonstationary Processes(2020-11-04) [Thesis]
Advisor: Sun, Ying
Committee members: McCabe, Matthew; Wikle, Christopher K.; Zhang, Xiangliang
There has been an increasing demand for describing, predicting, and drawing inferences for various environmental processes, such as air pollution and precipitation. Environmental statistics plays an important role in many related applications, such as weather-related risk assessment for urban design and crop growth. However, modeling the spatio-temporal dynamics of environmental data is challenging due to their inherent high variability and nonstationarity. This dissertation is composed of four significant contributions to the modeling, simulation, and prediction of spatiotemporal processes using statistical techniques and machine learning algorithms. This dissertation first focuses on the Gaussian process emulators of the numerical climate models over a large spatial region, where the spatial process exhibits nonstationarity. The proposed method allows for estimating a rich class of nonstationary Matérn covariance functions with spatially varying parameters. The efficient estimation is achieved by local-polynomial fitting of the covariance parameters. To extend the applicability of this method to large-scale computations, the proposed method is implemented by developing software with high-performance computing architectures for nonstationary Gaussian process estimation and simulation. The developed software outperforms existing ones in both computational time and accuracy by a large margin. The method and software are applied to the statistical emulation of high-resolution climate models. The second focus of this dissertation is the development of spatio-temporal stochastic weather generators for non-Gaussian and nonstationary processes. The proposed multi-site generator uses a left-censored non-Gaussian vector autoregression model, where the random error follows a skew-symmetric distribution.
It not only drives the occurrence and intensity simultaneously but also possesses nice interpretations both physically and statistically. The generator is applied to 30-second precipitation data collected at the University of Lausanne. Finally, this dissertation investigates the spatial prediction with scalable deep learning algorithms to overcome the limitations of the classical Kriging predictor in geostatistics. A novel neural network structure is proposed for spatial prediction by adding an embedding layer of spatial coordinates with basis functions. The proposed method, called DeepKriging, has multiple advantages over Kriging and classical neural networks with spatial coordinates as features. The method is applied to the prediction of fine particulate matter (PM2.5) concentrations in the United States.
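To illustrate what "an embedding layer of spatial coordinates with basis functions" could look like, here is a minimal sketch. It is not the dissertation's implementation: the Gaussian radial basis functions, the regular knot grid on the unit square, the bandwidth, and the names `knot_grid` and `rbf_embedding` are all assumptions made for the example. The resulting feature vector would replace the raw coordinates as input to a neural network.

```python
import math


def knot_grid(n_per_axis):
    """Regular grid of knots on the unit square [0, 1]^2 (assumed layout)."""
    step = 1.0 / (n_per_axis - 1)
    return [(i * step, j * step)
            for i in range(n_per_axis)
            for j in range(n_per_axis)]


def rbf_embedding(coord, knots, bandwidth):
    """Map one (x, y) coordinate to a vector of Gaussian basis-function values.

    Each entry measures how close the coordinate is to one knot, so the
    network sees a smooth, localized encoding instead of two raw numbers.
    """
    x, y = coord
    return [
        math.exp(-((x - kx) ** 2 + (y - ky) ** 2) / (2.0 * bandwidth ** 2))
        for kx, ky in knots
    ]


# Hypothetical usage: a 3 x 3 knot grid turns one location into 9 features.
knots = knot_grid(3)
features = rbf_embedding((0.5, 0.5), knots, bandwidth=0.25)
```

Using several grids with different spacings and bandwidths would give a multi-resolution encoding; the single grid here only keeps the sketch short.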
A Closer Look at Neighborhoods in Graph Based Point Cloud Scene Semantic Segmentation Networks(2020-11) [Thesis]
Advisor: Ghanem, Bernard
Committee members: Ghanem, Bernard; Al-Naffouri, Tareq Y.; Wonka, Peter; Thabet, Ali K.
Large-scale semantic segmentation is considered one of the fundamental tasks in 3D scene understanding. Point clouds provide a basic and rich geometric representation of scenes and tangible objects. Convolutional Neural Networks (CNNs) have demonstrated impressive success in processing regular discrete data such as 2D images and 1D audio. However, CNNs do not directly generalize to point cloud processing because point clouds are irregular and unordered. One way to extend CNNs to point cloud understanding is to derive an intermediate Euclidean representation of a point cloud by projecting onto the image domain, voxelizing, or treating points as vertices of an undirected graph. Graph-CNNs (GCNs) have proven to be a very promising solution for deep learning on irregular data such as social networks, biological systems, and recently point clouds. Early work on graph-based point networks relied on constructing dynamic graphs in the node feature space to define a convolution kernel. Later work constructed hierarchical static graphs in 3D space for an encoder-decoder framework inspired by image segmentation. This thesis takes a closer look at both dynamic and static graph neighborhoods of graph-based point networks for the task of semantic segmentation in order to: 1) discuss a potential cause for why going deep in dynamic GCNs does not necessarily lead to improved performance, and 2) propose a new approach to treating points in a static graph neighborhood for improved information aggregation. The proposed method leads to an efficient graph-based 3D semantic segmentation network that is on par with current state-of-the-art methods on both indoor and outdoor scene semantic segmentation benchmarks such as S3DIS and Semantic3D.
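The difference between a static graph (neighbors chosen in 3D space) and a dynamic graph (neighbors chosen in node feature space) can be sketched with a brute-force k-nearest-neighbor search. This is an illustrative toy, not the thesis's network: the helper `knn_graph`, the four points, and their two-dimensional features are hypothetical values chosen so that the two graphs disagree.

```python
def knn_graph(vectors, k):
    """Return, for each vector, the indices of its k nearest neighbors.

    Distances are squared Euclidean; a point is never its own neighbor.
    """
    neighbors = []
    for i, v in enumerate(vectors):
        dists = []
        for j, w in enumerate(vectors):
            if j != i:
                d = sum((a - b) ** 2 for a, b in zip(v, w))
                dists.append((d, j))
        dists.sort()
        neighbors.append([j for _, j in dists[:k]])
    return neighbors


# Hypothetical data: points 0 and 1 are close in space, as are 2 and 3,
# but in feature space point 0 resembles 2 and point 1 resembles 3.
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0), (6.0, 0.0, 0.0)]
features = [(0.0, 1.0), (9.0, 9.0), (0.1, 1.1), (9.1, 9.0)]

static_graph = knn_graph(positions, k=1)   # built once from 3D coordinates
dynamic_graph = knn_graph(features, k=1)   # would be rebuilt as features change
```

In a dynamic GCN the feature-space graph is typically recomputed at each layer, so a point's neighborhood can drift away from its spatial surroundings as depth increases.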