Recent Submissions

  • Gaussian Blue Noise

    Ahmed, Abdalla G.M.; Ren, Jing; Wonka, Peter (ACM Transactions on Graphics, Association for Computing Machinery (ACM), 2022-11-30) [Article]
    Among the various approaches for producing point distributions with blue noise spectrum, we argue for an optimization framework using Gaussian kernels. We show that with a wise selection of optimization parameters, this approach attains unprecedented quality, provably surpassing the current state of the art attained by the optimal transport (BNOT) approach. Further, we show that our algorithm scales smoothly and feasibly to high dimensions while maintaining the same quality, realizing unprecedented high-quality high-dimensional blue noise sets. Finally, we show an extension to adaptive sampling.
  • Detection and Separation of Faults in Permanent Magnet Synchronous Machines using Hybrid Fault-Signatures

    Ullah, Zia; Im, JunHyuk; Ahmed, Shehab (IEEE, 2022-11-30) [Conference Paper]
    As digitalization in electric motors accelerates, online condition monitoring systems are becoming more popular, allowing unplanned downtime to be detected at its initial stage. Individual faults in motors are effectively diagnosed. However, due to identical signatures, fault separation and/or identification remain a challenge. This study presents the detection and separation of inter-turn short, demagnetization, static eccentricity, bearing, and the combination of these faults in permanent magnet synchronous machines. Hybrid fault signatures are used by monitoring the frequency spectrum of stator current, vibration, and induced voltage in the airgap. A planar-shaped airgap search coil (PASC) is employed to extract the induced voltage of each stator tooth. Fault-related anomalies in the induced voltage, vibration, and stator current are monitored. Any deviation in any of these signals is considered a generic fault indicator. Furthermore, specific fault features in all signals are used to classify these faults with improved accuracy. The PASC used in this study can also identify the location of the inter-turn short fault and the severity of the demagnetization fault. The proposed method is verified using finite element method simulation and experiments.
  • Enhanced Selectivity in the Electroproduction of H2O2 via F/S Dual-Doping in Metal-Free Nanofibers

    Xiang, Fei; Zhao, Xuhong; Yang, Jian; Li, Ning; Gong, Wenxiao; Liu, Yizhen; Burguete-Lopez, A.; Li, Yulan; Niu, Xiaobin; Fratalocchi, Andrea (Advanced Materials, Wiley, 2022-11-30) [Article]
    Electrocatalytic two-electron oxygen reduction (2e- ORR) to hydrogen peroxide (H2O2) is attracting broad interest in diversified areas including paper manufacturing, wastewater treatment, production of liquid fuels, and public sanitation. Current efforts focus on researching low-cost, large-scale, and sustainable electrocatalysts with high activity and selectivity. Here we engineer large-scale H2O2 electrocatalysts based on metal-free carbon fibers with a fluorine and sulfur dual-doping strategy. Optimized samples yield a high onset potential of 0.814 V versus reversible hydrogen electrode (RHE) and an almost ideal 2e- pathway selectivity of 99.1%, outperforming most of the recently reported carbon-based or metal-based electrocatalysts. First-principles theoretical computations and experiments demonstrate that the intermolecular charge transfer coupled with electron spin redistribution from fluorine and sulfur dual-doping is the crucial factor contributing to the enhanced performance in 2e- ORR. This work opens the door to the design and implementation of scalable, earth-abundant, highly selective electrocatalysts for H2O2 production and other catalytic fields of industrial interest.
  • Demagnetization Risk Assessment in a Dual Stator Permanent Magnet Vernier Machines

    Ullah, Zia; Siddiqi, Mudassir Raza; Ahmed, Shehab (IEEE, 2022-11-30) [Conference Paper]
    As the topologies of permanent magnet vernier machines (PMVM) become more complex, such as dual-rotor designs and their variants, thermal, mechanical, and especially demagnetization concerns are increasing. In this paper, the demagnetization risk evaluation of three similar topologies of dual stator radial type PMVM is presented. Three recently published topologies are selected: dual winding with rotor-yoke, dual winding without rotor-yoke, and single winding without yoke. These designs greatly improve the torque density and reduce the overall volume. However, the permanent magnets (PMs) in these topologies are at high risk of irreversible demagnetization. Furthermore, the overall performance of PM-type machines cannot be fully understood without a detailed demagnetization analysis. Therefore, a comprehensive mechanical, thermal, and demagnetization analysis considering various operating points and temperatures is conducted to evaluate the risk of demagnetization in these topologies. Finally, some modifications are made to optimize these designs. All analyses are carried out using finite element analysis and co-simulation in ANSYS Maxwell and Mechanical.
  • Ultraviolet micro light-emitting diode and color-conversion for white-light communication

    Lu, Hang (2022-11-29) [Thesis]
    Advisor: Ooi, Boon S.
    Committee members: Ohkawa, Kazuhiro; Ng, Tien Khee; Mohammed, Omar F.
    Visible-light communication (VLC) has several advantages over the commonly used radio frequency (RF) spectrum, including high bandwidth and low crosstalk. These features have become of more significance, especially as the proliferation of wireless devices increases and causes spectrum crowding. The white light in VLC systems is typically obtained from blue/violet light-emitting diodes (LEDs) and phosphors partially converting blue light into longer wavelength colors spanning the visible-light band. One phosphor that is frequently used is cerium-doped yttrium aluminum garnet (YAG). However, YAG suffers from a low color-rendering index (CRI) and high correlated color temperature (CCT). Lead halide perovskites provide an alternative to YAG and have been extensively utilized for optoelectronic devices owing to their tunable bandgap and high photoluminescence quantum yield (PLQY). However, their drawbacks, e.g., lead toxicity and instability, hinder their widespread application. Herein, in order to take advantage of a high-performance lead-free tin-based halide perovskite phosphor that has a high absolute PLQY of near unity and a wide spectral emission ranging from 500 to 700 nm, we fabricated ultraviolet (UV) micro light-emitting diodes (micro-LEDs) with a peak wavelength at 365 nm to match the peak of the photoluminescence excitation (PLE) spectra of the material to obtain strong yellow-spectrum emission. Together with a blue LED, white light was obtained with a CRI of 84.9 and 4115-K CCT. Despite the long PL lifetime of the perovskite material, which is in the order of μs, a net data rate of 1.5 Mb/s was achieved using orthogonal frequency-division multiplexing (OFDM) with adaptive bit and power loading to take advantage of the exceptionally high PLQY of the phosphor to improve the data throughput of the VLC system using higher modulation orders. 
    Furthermore, through improvements to the nanostructure of lead-free tin-based halide perovskite phosphor and the use of excitation sources with a higher power, the data rate is expected to be even higher. The lead-free nature of this material, along with its wide spectrum and high conversion efficiency, makes it a promising alternative to conventional toxic perovskite-based phosphors. As the first demonstration of VLC links using lead-free perovskite, this study paves the way for safer, more sustainable VLC systems.
  • CDAnet: A Physics-Informed Deep Neural Network for Downscaling Fluid Flows

    Hammoud, Mohamad Abed ElRahman; Titi, Edriss S.; Hoteit, Ibrahim; Knio, Omar (Journal of Advances in Modeling Earth Systems, American Geophysical Union (AGU), 2022-11-29) [Article]
    Generating high-resolution flow fields is of paramount importance for various applications in engineering and climate sciences. This is typically achieved by solving the governing dynamical equations on high-resolution meshes, suitably nudged towards available coarse-scale data. To alleviate the computational cost of such a downscaling process, we develop a physics-informed deep neural network (PI-DNN) that mimics the mapping of coarse-scale information to its fine-scale counterpart in continuous data assimilation (CDA). Specifically, the PI-DNN is trained within the theoretical framework described by Foias et al. (2014) to generate a surrogate of the theorized determining form map from the coarse-resolution data to the fine-resolution solution. We demonstrate the PI-DNN methodology through application to 2D Rayleigh-Bénard convection, and assess its performance by contrasting its predictions against those obtained by dynamical downscaling using CDA. The analysis suggests that the surrogate is constrained by similar conditions, in terms of spatio-temporal resolution of the input, as the ones required by the theoretical determining form map. The numerical results also suggest that the surrogate’s downscaled fields are of comparable accuracy to those obtained by dynamically downscaling using CDA. Consistent with the analysis of Farhat, Jolly, and Titi (2015), temperature observations are not needed for the PI-DNN to predict the fine-scale velocity, pressure and temperature fields.
  • mpi4py.futures: MPI-based asynchronous task execution for Python

    Rogowski, Marcin; Aseeri, Samar A.; Keyes, David E.; Dalcin, Lisandro (IEEE Transactions on Parallel and Distributed Systems, IEEE, 2022-11-29) [Article]
    We present mpi4py.futures, a lightweight, asynchronous task execution framework targeting the Python programming language and using the Message Passing Interface (MPI) for interprocess communication. mpi4py.futures follows the interface of the concurrent.futures package from the Python standard library and can be used as its drop-in replacement, while allowing applications to scale over multiple compute nodes. We discuss the design, implementation, and feature set of mpi4py.futures and compare its performance to other solutions on both shared and distributed memory architectures. On a shared-memory system, we show mpi4py.futures to consistently outperform Python's concurrent.futures with speedup ratios between 1.4X and 3.7X in throughput (tasks per second) and between 1.9X and 2.9X in bandwidth. On a Cray XC40 system, we compare mpi4py.futures to Dask – a well-known Python parallel computing package. Although we note more varied results, we show mpi4py.futures to outperform Dask in most scenarios.
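    The drop-in relationship with concurrent.futures described above can be sketched as follows. This is a minimal illustration, not code from the article: the helper names (square, run_squares, run_with_stdlib, run_with_mpi) are hypothetical, and the MPI variant assumes mpi4py and an MPI runtime are installed. The same submit/result code path is shared by both executors.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    """Toy task (hypothetical); any picklable top-level function would do."""
    return x * x


def run_squares(pool, values):
    """Submit tasks through the concurrent.futures-style interface."""
    futures = [pool.submit(square, v) for v in values]
    return [f.result() for f in futures]


def run_with_stdlib(values, workers=4):
    """Run the tasks on a standard-library executor."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return run_squares(pool, values)


def run_with_mpi(values, workers=4):
    """Same tasks on MPI worker processes; only the executor class changes.

    Assumes mpi4py is installed, so the import is kept local to this function.
    """
    from mpi4py.futures import MPIPoolExecutor

    with MPIPoolExecutor(max_workers=workers) as pool:
        return run_squares(pool, values)
```

    A script that calls run_with_mpi would typically be launched through an MPI launcher, for example `mpiexec -n 4 python -m mpi4py.futures script.py`, where script.py is a placeholder name.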
  • Remote Monitoring of Skin Temperature through a Wristband Employing a Printed VO2 Sensor

    Fatani, Firas; Vaseem, M.; Akhter, Zubair; Bilal, Rana Muhammad; Shamim, Atif (IEEE Sensors Journal, IEEE, 2022-11-29) [Article]
    The need for highly sensitive, environmentally stable, mechanically flexible, and low-cost temperature sensors for on-body measurements has been increasing with the wide adoption of personal Healthcare-Internet-of-Things (H-IoT) devices. Printed electronics (PE) is a good platform for such sensors because it enables the realization of flexible devices through simple and rapid methods at a relatively low cost. However, previously reported printed temperature sensors suffer from poor sensitivity and/or environmental instability. In this paper, we report a custom Tungsten (W)-doped Vanadium Dioxide (VO2) ink-based screen-printed temperature sensor having the highest Temperature-Coefficient-of-Resistance (TCR) of 2.78%·°C⁻¹ with a resolution of 0.1°C between 30°C and 40°C. To protect it from environmental effects, a fluoropolymer-based passivation layer is added for accurate temperature readings even in 90% relative humidity. The sensor is printed on a flexible substrate and shows minimal deterioration in performance over 1000 bending cycles. For wearability and remote monitoring, the sensor is integrated with a custom Bluetooth Low Energy (BLE) wireless readout in the form of a wristband. The BLE readout comprises an ultra-thin and flexible patch antenna optimized for both BLE bandwidth and human wearability. It demonstrates a minimal SAR value of only 0.068W/kg, making it safe to wear. Despite the antenna’s thin structure (0.004λ), it has a gain of 1.65dBi, enabling an excellent communication range. The proposed wristband is tested on ten volunteers and under daily activities, which shows promising results with a maximum error of 0.16°C with reference to those of a commercial thermometer.
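    To make the quoted TCR figure concrete, the sketch below uses the standard linearized model R = R_ref * (1 + TCR * dT). The linear model, the function names, and the 1000-ohm reference value are assumptions for illustration only; the abstract reports the TCR magnitude, so the sign convention here is also an assumption.

```python
def relative_resistance_change(delta_t_degc, tcr_per_degc):
    """Fractional resistance change for a temperature change, linear model."""
    return tcr_per_degc * delta_t_degc


def temperature_change(r_measured, r_ref, tcr_per_degc):
    """Invert R = R_ref * (1 + TCR * dT) to recover dT in degrees Celsius."""
    return (r_measured / r_ref - 1.0) / tcr_per_degc


TCR = 0.0278  # 2.78 % per degree Celsius, magnitude taken from the abstract

# A 0.1 degC step changes the resistance by about 0.278 % under this model.
step = relative_resistance_change(0.1, TCR)

# Hypothetical reading: a 1000-ohm reference that now reads 1002.78 ohm.
dt = temperature_change(1002.78, 1000.0, TCR)
```

    Under these assumptions, resolving 0.1°C requires the readout to distinguish resistance changes of roughly 0.28%.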
  • Computational Network Analysis of Host Genetic Risk Variants of Severe COVID-19

    Alsaedi, Sakhaa B.; Mineta, Katsuhiko; Gao, Xin; Gojobori, Takashi (Research Square Platform LLC, 2022-11-29) [Preprint]
    Background: Genome-wide association studies have identified numerous human host genetic risk variants that play a substantial role in the host immune response to SARS-CoV-2. Although these genetic risk variants significantly increase the severity of COVID-19, their influence on body systems is poorly understood. Therefore, we aim to interpret the biological mechanisms and pathways associated with the genetic risk factors and immune responses in severe COVID-19. We perform a deep analysis of previously identified risk variants and infer the hidden interactions between their molecular networks through disease mapping and the similarity of the molecular functions between constructed networks. Results: We designed a four-stage computational workflow for systematic genetic analysis of the risk variants. We integrated the molecular profiles of the risk factors with associated diseases, then constructed protein-protein interaction networks. We identified 24 protein-protein interaction networks with 939 interactions derived from 109 filtered risk variants in 60 risk genes and 56 proteins. The majority of molecular functions, interactions and pathways are involved in immune responses; several interactions and pathways are related to the metabolic and cardiovascular systems, which could lead to multi-organ complications and dysfunction. Conclusions: This study highlights the importance of analyzing molecular interactions and pathways to understand the heterogeneous susceptibility of the host immune response to SARS-CoV-2. We propose new insights into pathogenicity analysis of infections by including genetic risk information as essential factors to predict future complications during and after infection. This approach may assist more precise clinical decisions and accurate treatment plans to reduce COVID-19 complications.
  • EEG-Based Mental Tasks Recognition via a Deep Learning-Driven Anomaly Detector

    Dairi, Abdelkader; Zerrouki, Nabil; Harrou, Fouzi; Sun, Ying (Diagnostics, MDPI AG, 2022-11-29) [Article]
    This paper introduces an unsupervised deep learning-driven scheme for mental tasks’ recognition using EEG signals. To this end, the Multichannel Wiener filter was first applied to EEG signals as an artifact removal algorithm to achieve robust recognition. Then, a quadratic time-frequency distribution (QTFD) was applied to extract effective time-frequency signal representation of the EEG signals and catch the EEG signals’ spectral variations over time to improve the recognition of mental tasks. The QTFD time-frequency features are employed as input for the proposed deep belief network (DBN)-driven Isolation Forest (iF) scheme to classify the EEG signals. Indeed, a single DBN-based iF detector is constructed based on each class’s training data, with the class’s samples as inliers and all other samples as anomalies (i.e., one-vs.-rest). The DBN is considered to learn pertinent information without assumptions on the data distribution, and the iF scheme is used for data discrimination. This approach is assessed using experimental data comprising five mental tasks from a publicly available database from the Graz University of Technology. Compared to the DBN-based Elliptical Envelope, Local Outlier Factor, and state-of-the-art EEG-based classification methods, the proposed DBN-based iF detector offers superior discrimination performance of mental tasks.
  • Global Depths for Irregularly Observed Multivariate Functional Data

    Qu, Zhuo; Dai, Wenlin; Genton, Marc G. (arXiv, 2022-11-28) [Preprint]
    Two frameworks for multivariate functional depth based on multivariate depths are introduced in this paper. The first framework is multivariate functional integrated depth, and the second framework involves multivariate functional extremal depth, which is an extension of the extremal depth for univariate functional data. In each framework, global and local multivariate functional depths are proposed. The properties of population multivariate functional depths and consistency of finite sample depths to their population versions are established. In addition, finite sample depths under irregularly observed time grids are estimated. As a by-product, the simplified sparse functional boxplot and simplified intensity sparse functional boxplot are proposed for visualization without data reconstruction. A simulation study demonstrates the advantages of global multivariate functional depths over local multivariate functional depths in outlier detection and running time for big functional data. An application of our frameworks to cyclone tracks data demonstrates the excellent performance of our global multivariate functional depths.
  • Micro-Optics 3D Printed via Multi-Photon Laser Lithography

    Gonzalez-Hernandez, Diana; Varapnickas, Simonas; Bertoncini, Andrea; Liberale, Carlo; Malinauskas, Mangirdas (Advanced Optical Materials, Wiley, 2022-11-27) [Article]
    The field of 3D micro-optics is rapidly expanding, and essential advances in femtosecond laser direct-write 3D multi-photon lithography (MPL, also known as two-photon or multi-photon polymerization) are being made. Micro-optics realized via MPL emerged a decade ago and the field has exploded during the last five years. Impressive findings have revealed its potential for beam shaping, advanced imaging, optical sensing, integrated photonic circuits, and much more. This is supported by a game-changing and increasing industrial interest from key established companies in this field. In this review, the origin and the advancement of micro-optics fabrication with MPL are detailed by describing the chronology, distinguishing discrete application groups, providing generalized technical data on the processes and available materials, and discussing the foreseen near-future advances.
  • Multi-Modal Few-Shot Temporal Action Detection via Vision-Language Meta-Adaptation

    Nag, Sauradip; Xu, Mengmeng; Zhu, Xiatian; Perez-Rua, Juan-Manuel; Ghanem, Bernard; Song, Yi-Zhe; Xiang, Tao (arXiv, 2022-11-27) [Preprint]
    Few-shot (FS) and zero-shot (ZS) learning are two different approaches for scaling temporal action detection (TAD) to new classes. The former adapts a pretrained vision model to a new task represented by as few as a single video per class, whilst the latter requires no training examples by exploiting a semantic description of the new class. In this work, we introduce a new multi-modality few-shot (MMFS) TAD problem, which can be considered as a marriage of FS-TAD and ZS-TAD by leveraging few-shot support videos and new class names jointly. To tackle this problem, we further introduce a novel MUlti-modality PromPt mETa-learning (MUPPET) method. This is enabled by efficiently bridging pretrained vision and language models whilst maximally reusing already learned capacity. Concretely, we construct multi-modal prompts by mapping support videos into the textual token space of a vision-language model using a meta-learned adapter-equipped visual semantics tokenizer. To tackle large intra-class variation, we further design a query feature regulation scheme. Extensive experiments on ActivityNet-v1.3 and THUMOS14 demonstrate that our MUPPET outperforms state-of-the-art alternative methods, often by a large margin. We also show that our MUPPET can be easily extended to tackle the few-shot object detection problem and again achieves state-of-the-art performance on the MS-COCO dataset.
  • Ultraliser: a framework for creating multiscale, high-fidelity and geometrically realistic 3D models for in silico neuroscience

    Abdellah, Marwan; Cantero, Juan José García; Guerrero, Nadir Román; Foni, Alessandro; Coggan, Jay S; Cali, Corrado; Agus, Marco; Zisis, Eleftherios; Keller, Daniel; Hadwiger, Markus; Magistretti, Pierre J.; Markram, Henry; Schürmann, Felix (Briefings in Bioinformatics, Oxford University Press (OUP), 2022-11-26) [Article]
    Ultraliser is a neuroscience-specific software framework capable of creating accurate and biologically realistic 3D models of complex neuroscientific structures at intracellular (e.g. mitochondria and endoplasmic reticula), cellular (e.g. neurons and glia) and even multicellular scales of resolution (e.g. cerebral vasculature and minicolumns). Resulting models are exported as triangulated surface meshes and annotated volumes for multiple applications in in silico neuroscience, allowing scalable supercomputer simulations that can unravel intricate cellular structure–function relationships. Ultraliser implements a high-performance and unconditionally robust voxelization engine adapted to create optimized watertight surface meshes and annotated voxel grids from arbitrary non-watertight triangular soups, digitized morphological skeletons or binary volumetric masks. The framework represents a major leap forward in simulation-based neuroscience, making it possible to employ high-resolution 3D structural models for quantification of surface areas and volumes, which are of the utmost importance for cellular and system simulations. The power of Ultraliser is demonstrated with several use cases in which hundreds of models are created for potential application in diverse types of simulations. Ultraliser is publicly released under the GNU GPL3 license on GitHub (BlueBrain/Ultraliser). Significance: There is crystal clear evidence on the impact of cell shape on its signaling mechanisms. Structural models can therefore be insightful to realize the function; the more realistic the structure can be, the further we get insights into the function. Creating realistic structural models from existing ones is challenging, particularly when needed for detailed subcellular simulations. We present Ultraliser, a neuroscience-dedicated framework capable of building these structural models with realistic and detailed cellular geometries that can be used for simulations.
  • Very High-Order A-stable Stiffly Accurate Diagonally Implicit Runge-Kutta Methods

    Alamri, Yousef; Ketcheson, David I. (arXiv, 2022-11-26) [Preprint]
    A numerical search approach is used to design high-order diagonally implicit Runge-Kutta (DIRK) schemes suitable for stiff and oscillatory systems. We present new A-stable schemes of orders six (the highest order of previously designed DIRK schemes) up to eight. For each order, we include one scheme that is only A-stable as well as one that is stiffly accurate and therefore L-stable. The stiffly accurate schemes require more stages but can be expected to give better results for highly stiff problems and differential-algebraic equations. The development of eighth-order schemes requires the highly accurate numerical solution of a system of 200 equations in over 100 variables, which is accomplished via a combination of global and local optimization. The accuracy and stability of the schemes are analyzed and tested on diverse problems.
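    For readers unfamiliar with the terminology, the sketch below shows what a stiffly accurate DIRK step looks like. It uses a classical two-stage, second-order scheme, not any of the new schemes in the preprint, and applies it only to the linear test equation y' = lam * y so that each implicit stage can be solved in closed form. The function names are hypothetical.

```python
import math

# Classical two-stage, second-order SDIRK tableau (not from the preprint).
GAMMA = 1.0 - math.sqrt(2.0) / 2.0
A = [[GAMMA, 0.0],
     [1.0 - GAMMA, GAMMA]]
B = [1.0 - GAMMA, GAMMA]  # equals the last row of A, i.e. stiffly accurate


def dirk_step_linear(lam, y, h):
    """Advance y' = lam * y by one step of size h.

    Each stage satisfies k_i = lam * (y + h * sum_j a_ij * k_j). Because A is
    lower triangular, only k_i itself is implicit, so for this linear problem
    it can be isolated by dividing by (1 - h * lam * a_ii).
    """
    k = []
    for i in range(len(B)):
        known = y + h * sum(A[i][j] * k[j] for j in range(i))
        k.append(lam * known / (1.0 - h * lam * A[i][i]))
    return y + h * sum(b * ki for b, ki in zip(B, k))


def integrate(lam, y0, h, n_steps):
    """Take n_steps steps of size h starting from y0."""
    y = y0
    for _ in range(n_steps):
        y = dirk_step_linear(lam, y, h)
    return y
```

    With lam = -1000 and h = 0.1 the numerical solution still decays, which is the behaviour A-stability and L-stability are meant to guarantee for stiff problems. For a nonlinear system each stage would instead require a nonlinear solve such as Newton's method.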
  • Look Around and Refer: 2D Synthetic Semantics Knowledge Distillation for 3D Visual Grounding

    Bakr, Eslam Mohamed; Alsaedy, Yasmeen; Elhoseiny, Mohamed (arXiv, 2022-11-25) [Preprint]
    The 3D visual grounding task has been explored with visual and language streams comprehending referential language to identify target objects in 3D scenes. However, most existing methods devote the visual stream to capturing the 3D visual clues using off-the-shelf point cloud encoders. The main question we address in this paper is "can we consolidate the 3D visual stream by 2D clues synthesized from point clouds and efficiently utilize them in training and testing?". The main idea is to assist the 3D encoder by incorporating rich 2D object representations without requiring extra 2D inputs. To this end, we leverage 2D clues, synthetically generated from 3D point clouds, and empirically show their aptitude to boost the quality of the learned visual representations. We validate our approach through comprehensive experiments on Nr3D, Sr3D, and ScanRefer datasets and show consistent performance gains compared to existing methods. Our proposed module, dubbed Look Around and Refer (LAR), significantly outperforms the state-of-the-art 3D visual grounding techniques on three benchmarks, i.e., Nr3D, Sr3D, and ScanRefer.
  • Re^2TAL: Rewiring Pretrained Video Backbones for Reversible Temporal Action Localization

    Zhao, Chen; Liu, Shuming; Mangalam, Karttikeya; Ghanem, Bernard (arXiv, 2022-11-25) [Preprint]
    Temporal action localization (TAL) requires long-form reasoning to predict actions of various lengths and complex content. Given limited GPU memory, training TAL end-to-end on such long-form videos (i.e., from videos to predictions) is a significant challenge. Most methods can only train on pre-extracted features without optimizing them for the localization problem, consequently limiting localization performance. In this work, to extend the potential in TAL networks, we propose a novel end-to-end method Re2TAL, which rewires pretrained video backbones for reversible TAL. Re2TAL builds a backbone with reversible modules, where the input can be recovered from the output such that the bulky intermediate activations can be cleared from memory during training. Instead of designing one single type of reversible module, we propose a network rewiring mechanism, to transform any module with a residual connection to a reversible module without changing any parameters. This provides two benefits: (1) a large variety of reversible networks are easily obtained from existing and even future model designs, and (2) the reversible models require much less training effort as they reuse the pre-trained parameters of their original non-reversible versions. Re2TAL reaches 37.01% average mAP, a new state-of-the-art record on ActivityNet-v1.3, and mAP 64.9% at tIoU=0.5 on THUMOS-14 without using optical flow.
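    The idea that a reversible module lets the input be recovered from the output can be illustrated with a generic two-stream additive coupling. This is a sketch of the general reversible-network technique under stated assumptions, not the rewiring mechanism proposed in the preprint; f and g are hypothetical stand-ins for arbitrary sub-networks.

```python
import math


def f(x):
    """Hypothetical stand-in for a sub-network; it need not be invertible."""
    return [math.tanh(2.0 * v) for v in x]


def g(x):
    """Second hypothetical sub-network."""
    return [0.5 * v * v for v in x]


def forward(x1, x2):
    """Additive coupling: each stream is updated using only the other one."""
    y1 = [a + b for a, b in zip(x1, f(x2))]
    y2 = [a + b for a, b in zip(x2, g(y1))]
    return y1, y2


def inverse(y1, y2):
    """Recover the inputs from the outputs by undoing the updates in reverse."""
    x2 = [a - b for a, b in zip(y2, g(y1))]
    x1 = [a - b for a, b in zip(y1, f(x2))]
    return x1, x2
```

    Because the inputs can be recomputed from the outputs, a training framework built this way does not need to keep the intermediate activations in memory for the backward pass.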
  • Incorporating Pd into Cu-Coordinated Metal-Organic Frameworks to Promote N2 Electrochemical Reduction into Ammonia

    Huang, Hao; Liu, Qiaoxi; Cheng, Qingpeng; Zhang, Maolin; Liu, Jialei (ChemCatChem, Wiley, 2022-11-24) [Article]
    Nitrogen fixation using renewable energy has attracted much attention recently. However, the performance is limited by the competing hydrogen evolution reaction (HER) and the difficulty in activating N2. Here, an attractive strategy was proposed to enhance the electrochemical nitrogen reduction reaction (NRR) activity by introducing Pd into the HKUST-1 framework. After thermal treatment, the obtained Pd/HKUST-1 (250 °C) catalyst exhibited an ammonia production rate of 42.0 mg/gcat ⋅ h at −0.4 V vs. RHE with a Faradaic efficiency of 4.6 %. The ammonia production reached as high as 415 mg/gcat in a 10-hour stability test. With the assistance of density functional theory (DFT) calculations, the incorporated Pd was revealed to have the unique property of reacting with adsorbed H (Had) atoms from HER to form α-PdH species. Compared with other metals such as Ag, Au, and Pt, the in situ formed α-PdH species could reduce the energy barrier of the rate-limiting *N2H step, resulting in an enhanced NRR activity.
  • On Narrative Information and the Distillation of Stories

    Ashley, Dylan R.; Herrmann, Vincent; Friggstad, Zachary; Schmidhuber, Juergen (arXiv, 2022-11-22) [Preprint]
    The act of telling stories is a fundamental part of what it means to be human. This work introduces the concept of narrative information, which we define to be the overlap in information space between a story and the items that compose the story. Using contrastive learning methods, we show how modern artificial neural networks can be leveraged to distill stories and extract a representation of the narrative information. We then demonstrate how evolutionary algorithms can leverage this to extract a set of narrative templates and how these templates -- in tandem with a novel curve-fitting algorithm we introduce -- can reorder music albums to automatically induce stories in them. In the process of doing so, we give strong statistical evidence that these narrative information templates are present in existing albums. While we experiment only with music albums here, the premises of our work extend to any form of (largely) independent media.
  • A neuron-wise subspace correction method for the finite neuron method

    Park, Jongho; Xu, Jinchao; Xu, Xiaofeng (arXiv, 2022-11-22) [Preprint]
    In this paper, we propose a novel algorithm called Neuron-wise Parallel Subspace Correction Method (NPSC) for training ReLU neural networks for the numerical solution of partial differential equations (PDEs). Despite extensive research activity in applying neural networks to numerical PDEs, there is still a serious lack of training algorithms that can be used to obtain approximations with adequate accuracy. Based on recent results on the spectral properties of linear layers and landscape analysis for single neuron problems, we develop a special type of subspace correction method that deals with the linear layer and each neuron in the nonlinear layer separately. An optimal preconditioner that resolves the ill-conditioning of the linear layer is presented, so that the linear layer is trained in a uniform number of iterations with respect to the number of neurons. In each single neuron problem, a good local minimum is found by a superlinearly convergent algorithm, avoiding regions where the loss function is flat. Performance of the proposed method is demonstrated through numerical experiments for function approximation problems and PDEs.
