Saudi Summer Internship Program (SSI) 2021
Recent Submissions
-
Porting ExaGeoStat Software From C To C++ (2021-08-19) [Poster]
Motivation and Goals
• ExaGeoStat is a parallel high-performance unified framework for computational geostatistics on many-core systems, targeting geospatial statistics in climate and environment modeling.
• ExaGeoStat employs a statistical model based on the evaluation of the Gaussian log-likelihood function, which operates on a large dense covariance matrix generated by the parametrizable Matérn covariance function.
• ExaGeoStat was developed in C, which has several practical limitations and difficulties. For instance, C follows the procedural programming approach with no support for object-oriented programming (OOP), which unnecessarily complicates the coding and debugging processes whenever ExaGeoStat is extended.
• C++ is an OOP language better suited to large applications: its multi-paradigm structure allows splitting the software into several components and provides the flexibility to maintain each component separately through a set of predefined abstraction levels.
• In this work, we propose porting the existing C version of ExaGeoStat to C++ and supporting multi-precision computation in ExaGeoStat through C++ templates, which generate optimized code for different numeric precisions at compile time.
• We also use C++ lambda expressions to provide an easier way to add new model kernels, whether through the C++ version of the software or through the R version of ExaGeoStat (i.e., ExaGeoStatR).
• We assessed the accuracy and performance of the new C++ code through a set of experiments on different parallel computing hardware, validated its correctness from the statistical perspective, and demonstrated its high performance.
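For reference, the Gaussian log-likelihood and Matérn covariance the abstract refers to have the following standard forms (the notation below is a common convention; the poster itself does not spell the formulas out):

\ell(\boldsymbol{\theta}) = -\frac{n}{2}\log(2\pi) - \frac{1}{2}\log\lvert\Sigma(\boldsymbol{\theta})\rvert - \frac{1}{2}\,\mathbf{z}^{\top}\Sigma(\boldsymbol{\theta})^{-1}\mathbf{z},
\qquad
\Sigma(\boldsymbol{\theta})_{ij} = C\big(\lVert \mathbf{s}_i - \mathbf{s}_j \rVert;\ \boldsymbol{\theta}\big),

C(r;\boldsymbol{\theta}) = \frac{\sigma^2}{2^{\nu-1}\Gamma(\nu)}\left(\frac{r}{\beta}\right)^{\nu} K_{\nu}\!\left(\frac{r}{\beta}\right),
\qquad
\boldsymbol{\theta} = (\sigma^2, \beta, \nu),

where \mathbf{z} holds the n observations at locations \mathbf{s}_i and K_{\nu} is the modified Bessel function of the second kind. Evaluating \ell requires factorizing the large dense matrix \Sigma(\boldsymbol{\theta}), which is the computational bottleneck the framework parallelizes.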
-
Efficient Forecasting Of Wind Power Using Machine Learning Methods (2021-08-19) [Poster]
I. Introduction
Wind power, one of the most promising green energies, has developed rapidly worldwide, and its proportion in the power grid has been increasing. The main crucial and challenging issue in wind power production is its intermittent volatility due to weather conditions, which makes it difficult to integrate into the power grid. Precise forecasting of wind power generation is crucial to mitigate the challenges of balancing supply and demand in the smart grid. This study investigates the feasibility of different machine learning methods to predict and forecast wind power. Actual measurements recorded every 10 minutes from three real wind turbines are used to demonstrate the prediction precision of the investigated techniques.
II. Methodology
Datasets were divided into two subsets: a training set and a testing set. We trained the investigated machine learning models on the training set using 5-fold cross-validation. 23 machine learning methods are investigated to forecast and predict wind power, including SVR, GPR, bagged trees, boosted trees, and random forests. Bayesian optimization is employed to determine the values of the hyperparameters of the considered models (see the sketch after this entry).
III. Datasets
Three wind power datasets are used in this study: the France, Turkey, and Kaggle datasets. The France dataset contains 16 features collected in 2017, with 21524 training records and 433 test records. The Turkey dataset comprises two features, wind speed and wind direction, with 21495 training records and 433 test records collected in 2018. The Kaggle dataset contains 20 features, 12333 training records, and 451 test records collected in 2021.
IV. Conclusion
Results indicate the promising performance of the ensemble tree-based models. We plan to apply deep learning methods to increase prediction accuracy. The investigated models can also serve as a helpful tool for model-based anomaly detection in wind turbines.
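A minimal sketch of the evaluation protocol described above, using scikit-learn on synthetic data. The model subset and feature names are assumptions, and grid search stands in for the Bayesian optimization the poster uses (scikit-optimize's BayesSearchCV would be a drop-in alternative):

```python
# Sketch: 5-fold CV comparison of a few of the model families named in
# the poster, with hyperparameter search nested inside the CV loop.
import numpy as np
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.svm import SVR
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))                        # e.g., wind speed, direction
y = X[:, 0] ** 3 + rng.normal(scale=0.1, size=500)   # synthetic power target

cv = KFold(n_splits=5, shuffle=True, random_state=0)
models = {
    "SVR": GridSearchCV(SVR(), {"C": [1, 10, 100], "gamma": ["scale", 0.1]}, cv=cv),
    "GPR": GaussianProcessRegressor(),
    "Random forest": RandomForestRegressor(n_estimators=200, random_state=0),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=cv, scoring="r2")
    print(f"{name}: mean R^2 = {scores.mean():.3f}")
```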
-
Differential Privacy In Generalized Eigenvalue Problem (GEP) (2021-08-19) [Poster]
With the acceleration of technological development, the significance of data and its analysis increases. Protecting the privacy of data providers is crucial, yet organizations rely heavily on data analysis using artificial intelligence algorithms in order to benefit from the data economically, politically, and socially. In big data analysis, one of the most important tasks is dimensionality reduction, but how can we provide a new concept of privacy in this area? Differential privacy is a mathematical framework that anonymizes data and operates in a privacy-preserving manner, with a large scope for future development.
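The abstract stops at the question, but one standard way differential privacy enters dimensionality reduction is to perturb a covariance or scatter matrix with calibrated Gaussian noise before eigendecomposition. The sketch below illustrates that idea only; it is not necessarily the poster's approach, and the noise scale is a hypothetical placeholder:

```python
# Illustrative sketch: Gaussian-mechanism perturbation of a scatter
# matrix before a PCA-style eigendecomposition.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))     # rows assumed normalized (bounded sensitivity)
A = X.T @ X                        # scatter matrix to be released privately

sigma = 1.0                        # placeholder; calibrated from (epsilon, delta)
noise = rng.normal(scale=sigma, size=A.shape)
noise = np.triu(noise) + np.triu(noise, 1).T   # keep the perturbation symmetric

eigvals, eigvecs = np.linalg.eigh(A + noise)
top_k = eigvecs[:, -2:]            # private estimate of the top-2 subspace
print(top_k.shape)
```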
-
Neural Network Pruning Vs Neural Architectural Search For Finding Size Constrained Models (2021-08-19) [Poster]
Current neural networks (NNs) are large and over-parameterized, which leads to the following issues:
• They require resources, money, and time to train and store.
• They have a huge carbon footprint.
• They are hard to use in constrained environments.
• They perform many computations at inference time, which takes time.
Neural network pruning is the process of zeroing out (pruning) a portion of a trained neural network's weights using some algorithm. This reduces the storage requirements of the network and the number of computations needed to run it (a magnitude-pruning sketch follows this entry).
Neural architecture search is the process of finding a neural network architecture without a human designer. It can find architectures that perform well with a small number of parameters, which makes them inherently efficient.
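A minimal sketch of one common pruning criterion, global magnitude pruning; the poster does not specify which algorithm it used, so this is an illustrative assumption written with NumPy for brevity:

```python
# Sketch: zero out the smallest-magnitude fraction of a layer's weights.
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with smallest |w|."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    threshold = np.partition(flat, k)[k]        # k-th smallest magnitude
    mask = np.abs(weights) >= threshold
    return weights * mask, mask

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64))                   # stands in for trained weights
W_pruned, mask = magnitude_prune(W, sparsity=0.9)
print(f"kept {mask.mean():.1%} of weights")     # ~10% remain nonzero
```

The resulting sparse matrix needs less storage and, with sparse kernels, fewer multiply-adds at inference time, which is the trade-off the poster compares against architecture search.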
-
Evaluating Different Network Parameters And Features Mapping In Multilayer Perceptrons For Computed Tomography (2021-08-19) [Poster]
Introduction and Motivation
• Deep learning models have been increasingly utilized to reconstruct computed tomography (CT) projections.
• At the heart of some of these models is a basic multilayer perceptron network (MLP).
• Basic MLPs, however, struggle to learn high frequencies in low-dimensional input problems.
• The resulting pictures lack detail and are very blurry (Figure 1a).
• We evaluated different approaches to make those high frequencies learnable.
Method
• We set up a baseline MLP and chose the values of its parameters.
• The model was trained repeatedly while tweaking one parameter in each run.
• In total, more than 9 parameters were experimented with.
• The loss function used for training was the least-squares error (L2).
• The peak signal-to-noise ratio (PSNR) was used to quantify the predicted picture quality.
• Other input images (bird, cat, shapes) were also used to evaluate any potential differences in prediction between a monochromatic CT scan image and a colored picture of an object.
Model Architecture
The baseline network was configured as follows:
• Input: X and Y pixel coordinates.
• Features mapping layer (to be included or excluded later, Table 1); see the sketch after this entry.
• Hidden layers: 3.
• Output: RGB values for each pixel.
Findings and Observations
• The MLP with a Fourier features mapping layer [1] outperformed the simple MLP, both quantitatively and qualitatively (Figure 1 and Table 1).
• Small learning rates produced a lot of noise, and therefore low PSNR (Figure 3).
• A higher number of hidden layers (12) produced slightly higher PSNR, but almost no observable visual difference in the CT scan picture.
• Using a uniform distribution in the features mapping layer produced significantly lower PSNR and hazy pictures.
Conclusion
• While MLPs do struggle to learn high frequencies, implementing a features mapping layer can rectify that and make those frequencies learnable.
• Different parameters can yield different results if the input image is different; what works for monochromatic CT images may not work as effectively with other inputs.
• An additional quantitative method, such as the structural similarity index measure (SSIM), should be considered for future work.
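A minimal sketch of the Fourier features mapping evaluated above: project the 2-D pixel coordinates through random Gaussian frequencies and feed the cosines and sines to the MLP. The mapping size and the scale of B are illustrative assumptions:

```python
# Sketch: map coordinates v to [cos(2*pi*Bv), sin(2*pi*Bv)] before the MLP.
import numpy as np

rng = np.random.default_rng(0)
mapping_size, scale = 256, 10.0
B = rng.normal(scale=scale, size=(mapping_size, 2))   # random frequencies

def fourier_features(coords):
    """coords: (N, 2) pixel positions normalized to [0, 1]."""
    proj = 2.0 * np.pi * coords @ B.T                 # (N, mapping_size)
    return np.concatenate([np.cos(proj), np.sin(proj)], axis=-1)

coords = np.stack(np.meshgrid(np.linspace(0, 1, 64),
                              np.linspace(0, 1, 64)), -1).reshape(-1, 2)
features = fourier_features(coords)                   # MLP input: (4096, 512)
print(features.shape)
```

Replacing the Gaussian draw of B with a uniform distribution is the variant the poster found to give significantly lower PSNR.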
-
Trace Compression (2021-08-19) [Poster]
Tariq Hommadi; supervised by Elmootazbellah N. Elnozahy. Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia; King Fahd University of Petroleum & Minerals (KFUPM), Dhahran, Saudi Arabia.
Objective: To study and analyze the performance of a CPU properly, long memory address traces are needed to give the most accurate results; however, longer traces require more time and memory space to analyze. We aim to enhance the performance of the applied methodology and to reduce the size of the trace compared with the traditional gzip tool.
Methodology: Loop Detection and Reduction (LDR) is applied to traces in the Dinero format, a popular format for memory address traces. A loop such as for (i = 0; i < n; i++) { a[i] = 100; b[i] = c[i] * 4; } generates long, regular address patterns that can be detected and reduced before compression; a sketch of the idea follows this entry. [Figure 1: categories of loop addresses. Figure 2: activity diagram of the program.]
Experimental evaluation: On three traces of 5K, 100K, and 300K addresses, the study measured a compression ratio of up to 22 between plain gzip and gzip applied to the LDR output.
Conclusion: The technique yields a higher compression ratio when applied to a trace that contains a considerable number of loops.
Further research: Since the size of the output trace depends on its content, the recorded fetch and data addresses need not be written in full; a smaller data set could instead reference the information needed to decompress the output trace file.
References:
• Elnozahy, E. Address Trace Compression Through Loop Detection and Reduction.
• Stack Overflow. https://stackoverflow.com/
• Java67. https://www.java67.com/2015/01/how-to-sorthashmap-in-java-based-on.html
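The sketch below illustrates the general idea of loop reduction only; it is not Elnozahy's actual LDR algorithm. It replaces maximal fixed-stride runs in an address trace with (base, stride, count) triples so a loop's regular accesses compress to a single record:

```python
# Illustrative sketch: collapse fixed-stride address runs before gzip.
def reduce_trace(addresses):
    out, i = [], 0
    while i < len(addresses):
        j = i + 1
        if j < len(addresses):
            stride = addresses[j] - addresses[i]
            while j + 1 < len(addresses) and addresses[j + 1] - addresses[j] == stride:
                j += 1
        count = j - i + 1
        if count >= 3:                           # long enough to encode as a run
            out.append(("run", addresses[i], stride, count))
            i = j + 1
        else:
            out.append(("addr", addresses[i]))   # pass short spans through
            i += 1
    return out

# A loop touching a[0..7] at stride 4 collapses to one record:
trace = [0x1000 + 4 * k for k in range(8)] + [0x2000]
print(reduce_trace(trace))   # [('run', 4096, 4, 8), ('addr', 8192)]
```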
-
Nonlinear Integer Programming Formulation Of The Inverse Photonic Design Problem (2021-08-19) [Poster]
-
Importance Sampling In Option Pricing (2021-08-19) [Poster]
In this poster we compare importance sampling algorithms with standard Monte Carlo simulation for pricing financial options. We show how importance sampling can reduce the variance of the estimator used. For simplicity, we consider only the one-dimensional case.
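A minimal sketch of the comparison in a one-dimensional Black-Scholes setting (all parameter values are illustrative): price a deep out-of-the-money European call by standard Monte Carlo and by importance sampling that shifts the sampling mean of Z and corrects with the likelihood ratio exp(-mu*Z + mu^2/2):

```python
# Sketch: plain Monte Carlo vs. mean-shifted importance sampling.
import numpy as np

S0, K, r, sigma, T, n = 100.0, 150.0, 0.05, 0.2, 1.0, 100_000
rng = np.random.default_rng(0)

def payoff(z):
    ST = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * z)
    return np.exp(-r * T) * np.maximum(ST - K, 0.0)

# Standard Monte Carlo: Z ~ N(0, 1); few paths end in the money.
z = rng.standard_normal(n)
plain = payoff(z)

# Importance sampling: Z ~ N(mu, 1), reweighted back to N(0, 1).
# mu is chosen so the shifted paths land near the strike.
mu = (np.log(K / S0) - (r - 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
z_is = rng.standard_normal(n) + mu
weights = np.exp(-mu * z_is + 0.5 * mu**2)      # likelihood ratio
shifted = payoff(z_is) * weights

for name, est in [("plain MC", plain), ("importance sampling", shifted)]:
    print(f"{name}: {est.mean():.4f} +/- {est.std(ddof=1) / np.sqrt(n):.4f}")
```

Because the shifted measure concentrates samples where the payoff is nonzero, the importance-sampling estimator attains a visibly smaller standard error for the same number of paths.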
-
Simultaneous Multi-Camera Localization And Mapping With AprilTags (TagSLAM) And Trajectory Planning For Underwater ROV (2021-08-19) [Poster]
Over 70% of the Earth's surface is covered by oceans, yet less than 5% has been explored, owing to the dangerous and inaccessible marine environment. Remotely operated vehicles (ROVs) allow marine scientists to explore the ocean without having to be in it. The goal of this project is to allow the robot to localize itself within its environment, map its surroundings, and follow a desired path (trajectory planning).
Vision System — to process images and record captures:
1. An image of Ubuntu MATE 18.04 was flashed to the ROV's Raspberry Pi 3.
2. An Ethernet connection between the topside computer (Ubuntu 18.04) and the Raspberry Pi was established.
3. ROS and the usb_cam package were installed.
TagSLAM — to map the robot's surrounding environment and determine its position:
1. The tagslam_root packages were installed.
2. Extrinsic camera calibration was performed.
3. The april_tag_detector and tagslam nodes were launched.
4. AprilTags of tag family 36h11 were printed.
5. Odometry and camera images were published into TagSLAM.
ROV Thrusters — to control the robot under the water and test TagSLAM (see the sketch after this entry):
1. A connection between the autopilot (Pixhawk) and the companion computer (Raspberry Pi 3) was established.
2. MAVROS was installed and run on the companion computer.
3. The vehicle's mode was set to MANUAL and failsafes in QGC were disabled.
4. Parameters were sent to actuate the thrusters using the overrideRCIn topic.
Pose Estimation — to get the distance between the camera frame and an AprilTag:
1. A Python script that reads sensor data was written.
2. The node was run to output the ROV's position and orientation in x, y, z.
Results: To install ROS on the companion computer, different versions of Ubuntu were flashed with the Pi image; Ubuntu MATE 18.04 was found to be the most compatible. The CSI camera on the ROV was replaced with a fisheye USB camera, which was successfully calibrated. Captured images and recorded videos are saved in a rosbag for later use. AprilTags were detected by the ROV and mapped in Rviz as a body_rig, and localization was achieved using the camera perception frame view. The ROV's six thrusters were successfully controlled using MAVROS nodes and topics, and the ROV is able to move and dive beneath the water surface.
Future Work: Implement a PID controller for the ROV to keep a desired distance and overcome pose offset error. Apply a fractional-order control algorithm to run the robot more robustly. Automate the ROV to perform trajectory planning.
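A minimal sketch of actuating the thrusters over the overrideRCIn topic mentioned above, using rospy and mavros_msgs. The channel index, PWM values, and publish rate are illustrative assumptions, and the length of the channels array depends on the MAVROS version:

```python
#!/usr/bin/env python
# Sketch: continuously publish RC override values to drive the thrusters.
import rospy
from mavros_msgs.msg import OverrideRCIn

rospy.init_node("thruster_test")
pub = rospy.Publisher("/mavros/rc/override", OverrideRCIn, queue_size=1)

rate = rospy.Rate(10)                    # RC override must be sent continuously
msg = OverrideRCIn()
channels = [1500] * len(msg.channels)    # 1500 us PWM is typically neutral
channels[4] = 1600                       # hypothetical: gentle forward on channel 5
msg.channels = channels

while not rospy.is_shutdown():
    pub.publish(msg)
    rate.sleep()
```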
-
Wireless Underwater Monitoring System For Coral Reefs (2021-08-19) [Poster]
The Internet of Underwater Things (IoUT) is a network of smart, interconnected technologies for observing underwater activities. A self-powered underwater monitoring system is designed to collect data on the quality and the temperature of the water around coral reefs.
-
Streaming OpenGL On HoloLens 2 (2021-08-19) [Poster]
-
Automated Landform Detection On Mars Using Convolutional Neural Networks (2021-08-19) [Poster]
As Earth's neighbor in the Solar System, Mars has been recognized as an important reference for investigating the evolution, history, and future of Earth: a similar rocky structure, water, a thin atmosphere, Earth-like elements, and small-molecule organic matter have all been detected on Mars, making it a prime candidate for our first interplanetary settlement. Utilizing the large volume of public, high-resolution images of its surface, we develop fully automated deep learning algorithms that serve exploration of the planet. Convolutional neural networks (ConvNets) are inspired by biological neural systems; through training, a ConvNet operates as a feature extractor that detects and investigates the various geological landforms on the surface of Mars, its history, and its resource reservoirs.
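A minimal sketch of a ConvNet landform classifier in PyTorch; the architecture, input size, and class count (e.g., craters, dunes, ridges) are illustrative assumptions, not the poster's actual network:

```python
# Sketch: stacked conv blocks act as the learned feature extractor,
# followed by a linear classifier over landform classes.
import torch
import torch.nn as nn

class LandformNet(nn.Module):
    def __init__(self, num_classes=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1),
            nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1),
            nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 16 * 16, num_classes)

    def forward(self, x):                       # x: (N, 1, 64, 64) grayscale tiles
        h = self.features(x)
        return self.classifier(h.flatten(1))

model = LandformNet()
logits = model(torch.randn(4, 1, 64, 64))       # 4 random 64x64 image tiles
print(logits.shape)                             # torch.Size([4, 3])
```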
-
Air To Air Drone Detection System Using Modular Radar (2021-08-19) [Poster]
-
Forecasting COVID-19 Time-Series Data Using Machine Learning-Driven Methods (2021-08-19) [Poster]
Yasminah Alali, Fouzi Harrou, Ying Sun
The World Health Organization has reported around 200 million confirmed COVID-19 cases and 4 million deaths worldwide. Accurate forecasting of the numbers of newly infected and recovered cases is crucial for optimizing the available resources and for arresting or controlling the progression of such diseases. We investigate fifteen powerful machine learning methods, including SVR with different kernels, GPR with different kernels, boosted trees, and bagged trees, to forecast confirmed and recovered COVID-19 cases from India and Brazil datasets. Results demonstrate the promise of machine learning models in forecasting COVID-19 cases and highlight the performance of the optimized GPR compared with the other algorithms. This study also reveals the importance of incorporating information from past observations to improve forecasting accuracy (a lagged-features sketch follows this entry).
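A minimal sketch of "incorporating information from past observations": build lagged features from a daily case series and fit a GPR, one of the model families named above. The lag depth, kernel, and synthetic series are illustrative assumptions:

```python
# Sketch: predict today's cases from the previous week's cases with GPR.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
t = np.arange(200)
cases = 1000 / (1 + np.exp(-(t - 100) / 15)) + rng.normal(0, 10, t.size)

lags = 7                                   # use the previous week as features
X = np.column_stack([cases[i:i + t.size - lags] for i in range(lags)])
y = cases[lags:]

gpr = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gpr.fit(X[:-20], y[:-20])                  # hold out the last 20 days
pred, std = gpr.predict(X[-20:], return_std=True)
print(f"mean abs error on holdout: {np.abs(pred - y[-20:]).mean():.1f} cases")
```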
-
Handover Management Of An Aerial User In Integrated Vertical Heterogeneous Networks (2021-08-19) [Poster]