An introduction to distributed training of deep neural networks for segmentation tasks with large seismic datasets
Name:
An introduction_geo2021-0130.1.pdf
Size:
4.622Mb
Format:
PDF
Description:
Accepted manuscript
Type
Article
KAUST Department
Ali I. Al-Naimi Petroleum Engineering Research Center (ANPERC)
Formerly Equinor ASA, Sandslivegen 90, Sandsli 5254, Norway; presently King Abdullah University of Science and Technology, Thuwal 7252, Saudi Arabia.
Physical Science and Engineering (PSE) Division
Date
2021-08-24
Preprint Posting Date
2021-02-25
Submitted Date
2021-02-25
Permanent link to this record
http://hdl.handle.net/10754/667824
Abstract
Deep learning applications are progressing rapidly in seismic processing and interpretation tasks. However, the majority of approaches subsample data volumes and restrict model sizes to minimise computational requirements. Subsampling the data risks losing vital spatio-temporal information which could aid training, whilst restricting model sizes can impact model performance or, in some extreme cases, render more complicated tasks such as segmentation impossible. This paper illustrates how to tackle the two main issues in training large neural networks: memory limitations and impracticably long training times. Typically, training data is preloaded into memory prior to training, a particular challenge for seismic applications where data is typically four times larger than that used for standard image processing tasks (float32 vs. uint8). Using a microseismic use case, we illustrate how over 750 GB of data can be used to train a model by using a data generator approach which only stores in memory the data required for the current training batch. Furthermore, efficient training of large models is illustrated through the training of a 7-layer UNet with input data dimensions of 4096×4096 (approximately 7.8 M parameters). Through a batch-splitting distributed training approach, training times are reduced by a factor of four. The combination of data generators and distributed training removes any need for data subsampling or restriction of neural network sizes, offering the opportunity to use larger networks, higher-resolution input data, or to move from 2D to 3D problem spaces.
Citation
Birnie, C., Jarraya, H., & Hansteen, F. (2021). An introduction to distributed training of deep neural networks for segmentation tasks with large seismic datasets. GEOPHYSICS, 1–41. doi:10.1190/geo2021-0130.1
Sponsors
The authors would like to thank the Grane license partners Equinor Energy AS, Petoro AS, Vår Energi AS, and ConocoPhillips Skandinavia AS for allowing this work to be presented. The views and opinions expressed in this abstract are those of the Operator and are not necessarily shared by the license partners. The authors would also like to thank Ahmed Khamassi and Florian Schuchert for their invaluable support on the data science elements of this project, as well as Marianne Houbiers for her insightful discussions on the application of DL for passive monitoring.
Publisher
Society of Exploration Geophysicists
Journal
GEOPHYSICS
arXiv
2102.13003
Additional Links
https://library.seg.org/doi/10.1190/geo2021-0130.1
DOI
10.1190/geo2021-0130.1
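Illustrative sketch
The data generator idea described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names `batch_generator` and `demo`, the one-sample-per-`.npy`-file layout, and the small array shapes are all assumptions made for the example. It shows that only the files belonging to the current batch are read into memory, and it checks the float32 vs. uint8 size ratio mentioned in the abstract.

```python
import os
import tempfile

import numpy as np


def batch_generator(file_paths, batch_size, dtype=np.float32):
    """Yield one batch at a time, reading only that batch's files from disk.

    Hypothetical helper: assumes each file holds a single sample saved
    with np.save and that all samples share the same shape.
    """
    for start in range(0, len(file_paths), batch_size):
        paths = file_paths[start:start + batch_size]
        # Only the samples of the current batch are held in memory here.
        yield np.stack([np.load(p).astype(dtype, copy=False) for p in paths])


def demo(n_samples=10, shape=(64, 64), batch_size=4):
    """Write small synthetic samples to a temporary directory and stream them back."""
    batch_shapes = []
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for i in range(n_samples):
            path = os.path.join(tmp, f"sample_{i:03d}.npy")
            np.save(path, np.random.rand(*shape).astype(np.float32))
            paths.append(path)
        for batch in batch_generator(paths, batch_size):
            batch_shapes.append(batch.shape)
    return batch_shapes


# float32 uses 4 bytes per value and uint8 uses 1, hence the factor of four.
bytes_ratio = np.dtype(np.float32).itemsize // np.dtype(np.uint8).itemsize
batch_shapes = demo()
```

In a real training loop, the generator would typically be wrapped by the deep learning framework's own data-loading utilities, and the final, smaller batch may need to be dropped or padded depending on the model.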