Studying Perturbations on the Input of Two-Layer Neural Networks with ReLU Activation
Permanent link to this record: http://hdl.handle.net/10754/655886
Abstract

Neural networks have been shown to be highly susceptible to small, imperceptible perturbations of their input. In this thesis, we study perturbations on two-layer piecewise linear networks. Such studies are essential for training neural networks that are robust to noisy input.

The first type of perturbation we consider is ℓ∞ norm bounded perturbations. Training Deep Neural Networks (DNNs) that are robust to norm bounded perturbations, or adversarial attacks, remains an elusive problem. While verification-based methods are generally too expensive for robustly training large networks, it has been demonstrated that bounded input intervals can be inexpensively propagated through large networks layer by layer. This interval bound propagation (IBP) approach led to high robustness and was the first to be employed on large networks. However, because IBP bounds are very loose, particularly for large networks, the required training procedure is complex and involved. In this work, we closely examine the bounds of a block of layers composed of an affine layer, followed by a ReLU nonlinearity, followed by another affine layer. In doing so, we propose probabilistic bounds, i.e. bounds that hold with overwhelming probability, that are provably tighter than IBP bounds in expectation. We then extend this result to deeper networks through blockwise propagation and show that we can achieve bounds that are orders of magnitude tighter than IBP. With such tight bounds, we demonstrate that a simple standard training procedure can achieve the best robustness-accuracy tradeoff across several architectures on both MNIST and CIFAR10.

We also consider Gaussian perturbations, building on previous work that derives the first and second output moments of a two-layer piecewise linear network under a zero-mean Gaussian input. We derive an exact expression for the second moment that drops the zero-mean assumption made in that work.
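To make the IBP mechanism referenced above concrete, here is a minimal sketch that propagates elementwise lower and upper bounds through an affine layer (in center/radius form) and a ReLU (which is elementwise monotone), for an ℓ∞ input box of radius eps. The weights and dimensions are arbitrary placeholders; this illustrates plain IBP only, not the tighter probabilistic bounds proposed in the thesis.

```python
# Minimal IBP sketch: affine -> ReLU -> affine with placeholder weights.
import numpy as np

def ibp_affine(lower, upper, W, b):
    """Propagate elementwise bounds through x -> W @ x + b."""
    center = (upper + lower) / 2.0
    radius = (upper - lower) / 2.0
    new_center = W @ center + b
    new_radius = np.abs(W) @ radius  # worst case over the input box
    return new_center - new_radius, new_center + new_radius

def ibp_relu(lower, upper):
    """ReLU is monotone, so bounds pass through elementwise."""
    return np.maximum(lower, 0.0), np.maximum(upper, 0.0)

rng = np.random.default_rng(0)
W1, b1 = 0.05 * rng.standard_normal((64, 784)), np.zeros(64)
W2, b2 = 0.05 * rng.standard_normal((10, 64)), np.zeros(10)

x = rng.random(784)            # nominal input
eps = 0.1                      # l_inf perturbation budget
l, u = x - eps, x + eps        # input interval

l, u = ibp_affine(l, u, W1, b1)
l, u = ibp_relu(l, u)
l, u = ibp_affine(l, u, W2, b2)
print("mean output interval width:", (u - l).mean())
```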
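For the Gaussian part, the thesis derives an exact closed-form second output moment; that expression is not reproduced here. As a hedged illustration of the setting only, the following sketch estimates the first and second output moments of a two-layer ReLU network under a Gaussian input with non-zero mean by Monte Carlo sampling; all dimensions, weights, and the noise scale are assumptions.

```python
# Monte Carlo check of output moments for g(x) = W2 @ relu(W1 @ x + b1) + b2
# with x ~ N(mu, sigma^2 I); the non-zero mean is the case handled exactly
# in the thesis. This is a sampling illustration, not the closed form.
import numpy as np

rng = np.random.default_rng(1)
d_in, d_hid, d_out = 16, 32, 4
W1, b1 = rng.standard_normal((d_hid, d_in)), rng.standard_normal(d_hid)
W2, b2 = rng.standard_normal((d_out, d_hid)), rng.standard_normal(d_out)

mu = rng.standard_normal(d_in)   # non-zero input mean
sigma = 0.5
n = 200_000
x = mu + sigma * rng.standard_normal((n, d_in))

y = np.maximum(x @ W1.T + b1, 0.0) @ W2.T + b2
print("first moment  E[y]   ~", y.mean(axis=0))
print("second moment E[y^2] ~", (y**2).mean(axis=0))
```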
Related items (by title, author, creator and subject):
DANNP: an efficient artificial neural network pruning tool
AlShahrani, Mona; Soufan, Othman; Magana-Mora, Arturo; Bajic, Vladimir B. (PeerJ Computer Science, PeerJ, 2017-11-06) [Article]
Background: Artificial neural networks (ANNs) are a robust class of machine learning models and a frequent choice for solving classification problems. However, determining the structure of an ANN is not trivial, as a large number of weights (connection links) may lead to overfitting the training data. Although several ANN pruning algorithms have been proposed for the simplification of ANNs, these algorithms cannot efficiently cope with the intricate ANN structures required for complex classification problems.
Methods: We developed DANNP, a web-based tool that implements parallelized versions of several ANN pruning algorithms. The DANNP tool uses a modified version of the Fast Compressed Neural Network software, implemented in C++, to considerably reduce the running time of the ANN pruning algorithms we implemented. In addition to evaluating the performance of the pruned ANNs, we systematically compared the set of features that remained in the pruned ANN with those obtained by different state-of-the-art feature selection (FS) methods.
Results: Although the ANN pruning algorithms are not entirely parallelizable, DANNP was able to speed up ANN pruning by up to eight times on a 32-core machine, compared to the serial implementations. To assess the impact of pruning by the DANNP tool, we used 16 datasets from different domains. In eight of the 16 datasets, DANNP significantly reduced the number of weights by 70%–99% while maintaining competitive or better model performance compared to the unpruned ANN. Finally, we used a naïve Bayes classifier built on the features selected as a byproduct of the ANN pruning and demonstrated that its accuracy is comparable to that of classifiers trained with features selected by several state-of-the-art FS methods. The FS ranking methodology proposed in this study allows users to identify the most discriminant features of the problem at hand. To the best of our knowledge, DANNP (publicly available at www.cbrc.kaust.edu.sa/dannp) is the only online-accessible tool that provides multiple parallelized ANN pruning options. Datasets and DANNP code can be obtained at www.cbrc.kaust.edu.sa/dannp/data.php and https://doi.org/10.5281/zenodo.1001086.
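The abstract does not spell out the individual pruning algorithms, so the following is only a generic sketch of one classical idea, magnitude-based weight pruning, to illustrate what removing 70%–99% of weights looks like in practice; it is not DANNP's implementation, and the pruning fraction is an arbitrary assumption.

```python
# Generic magnitude-based pruning sketch (not DANNP's actual algorithms).
import numpy as np

def prune_by_magnitude(W, fraction):
    """Zero out the given fraction of smallest-magnitude weights."""
    threshold = np.quantile(np.abs(W), fraction)
    mask = np.abs(W) >= threshold
    return W * mask, mask

rng = np.random.default_rng(2)
W = rng.standard_normal((100, 50))
W_pruned, mask = prune_by_magnitude(W, 0.9)  # remove ~90% of weights
print("fraction of weights kept:", mask.mean())
```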
A robust neural network-based approach for microseismic event detection
Akram, Jubran; Ovcharenko, Oleg; Peter, Daniel (SEG Technical Program Expanded Abstracts 2017, Society of Exploration Geophysicists, 2017-08-17) [Conference Paper]
We present an artificial neural network based approach for robust event detection from low-S/N waveforms. We use a feed-forward network with a single hidden layer that is tuned on a training dataset and later applied to the entire example dataset for event detection. The input features are the average of absolute amplitudes, the variance, the energy ratio, and the polarization rectilinearity, each calculated in a moving window of the same length over the entire waveform. The output is set to a user-specified relative probability curve, which provides a robust way of distinguishing between weak and strong events. An optimal network is selected by studying the weight-based saliency and the effect of the number of neurons on the predicted results. Using synthetic data examples, we demonstrate that this approach is effective in detecting weaker events and reduces the number of false positives.
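As a rough illustration of the feature extraction described above, this sketch computes three of the four moving-window features (average absolute amplitude, variance, and a simple two-half energy ratio) on a synthetic single-component trace. The window length, the exact energy-ratio definition, and the synthetic event are assumptions; polarization rectilinearity is omitted since it requires three-component records.

```python
# Moving-window waveform features for event detection (illustrative only).
import numpy as np

def window_features(waveform, win=200):
    n = len(waveform) - win + 1
    feats = np.zeros((n, 3))
    for i in range(n):
        w = waveform[i:i + win]
        feats[i, 0] = np.mean(np.abs(w))  # average absolute amplitude
        feats[i, 1] = np.var(w)           # variance
        half = win // 2                   # energy ratio: later half vs earlier half
        feats[i, 2] = (np.sum(w[half:] ** 2) + 1e-12) / (np.sum(w[:half] ** 2) + 1e-12)
    return feats

rng = np.random.default_rng(3)
trace = 0.05 * rng.standard_normal(5000)                      # background noise
trace[2400:2600] += np.sin(np.linspace(0, 20 * np.pi, 200))   # synthetic "event"
feats = window_features(trace)
print(feats.shape, "energy ratio peaks near sample", feats[:, 2].argmax())
```

In a full pipeline, these per-window feature vectors would form the inputs to the single-hidden-layer feed-forward network, with the target set to the relative probability curve described in the abstract.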