An inverse problem is the process of inferring causal factors from the outcome of measurements, given a partial description of a system governed by physical laws. Inverse theory is widely applied in science and engineering disciplines, such as geophysics and computer vision (Kim and Nakata, 2018). Inversion of magnetotelluric (MT) data is one such application: it uses surface measurements of the earth’s natural electric and magnetic fields as inputs to retrieve the subsurface resistivity structure for the interpretation of subsurface geology. The forward modeling procedure calculates electromagnetic (EM) fields from a subsurface geo-electric model using numerical solutions of Maxwell’s equations (e.g. Ward and Hohmann, 1987; Zhdanov, 2002; Craven et al., 2006). Different inversion optimization methods, such as nonlinear conjugate gradient and Gauss-Newton, have been studied previously (e.g. Constable et al., 1987; Zhdanov, 2002; Egbert and Kelbert, 2012). The main principle of these approaches is the use of an iterative optimization technique to minimize a parametric functional that combines a data misfit term and a stabilizing term. Several software packages, such as Occam and ModEM, have been developed for use by the MT community in academia and industry (Constable et al., 1987; Egbert and Kelbert, 2012; Kelbert et al., 2014). Although traditional inversion methods have inherent advantages, such as delivering details of the electrical structure of the subsurface, they still face several challenges. The first issue is that a suitable initial model is needed to keep the search from becoming trapped in a local minimum of the objective function. Additionally, forward modeling, especially for 3D geo-electric structures, is numerically challenging and, despite recent advances (Ansari et al., 2020), requires significant computational resources to avoid lengthy computation times. 
This issue is exacerbated in iterative inversion schemes, which creep toward a final model through many forward computations. It must also be borne in mind that non-linear inversion of band-limited, imprecise data is unstable and non-unique (Zhdanov, 2002; Zhdanov et al., 2011). Although regularization methods and physical constraints may be employed to stabilize the inverse solution, an alternative technique such as deep learning (DL), which leverages modern GPU hardware and algorithms, should be investigated to assess its ability to replace traditional inversion techniques.
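To make the forward problem and the parametric functional described above concrete, the following is a minimal sketch for the simplest case, a 1D layered earth, using the standard impedance recursion for a plane-wave source. It is an illustration only, not the 3D forward solver or the inversion codes cited above. The function names `mt1d_forward` and `parametric_functional`, the log-resistivity parameterization, and the first-difference roughness stabilizer are assumptions made for this example.

```python
import numpy as np

MU0 = 4e-7 * np.pi  # magnetic permeability of free space (H/m)


def mt1d_forward(resistivities, thicknesses, frequencies):
    """Apparent resistivity (ohm-m) and phase (degrees) at the surface
    of a layered half-space (hypothetical helper for illustration).

    resistivities: one value per layer, the last being the basement.
    thicknesses:   one value per layer above the basement (metres).
    """
    rho = np.asarray(resistivities, dtype=float)
    h = np.asarray(thicknesses, dtype=float)
    omega = 2.0 * np.pi * np.asarray(frequencies, dtype=float)

    # Start with the intrinsic impedance of the basement half-space.
    z = np.sqrt(1j * omega * MU0 * rho[-1])

    # Propagate the impedance upward through each overlying layer.
    for j in range(len(h) - 1, -1, -1):
        k = np.sqrt(1j * omega * MU0 / rho[j])   # propagation constant
        w = 1j * omega * MU0 / k                 # intrinsic impedance
        e = np.exp(-2.0 * k * h[j])              # decays, so no overflow
        t = (1.0 - e) / (1.0 + e)                # tanh(k * h)
        z = w * (z + w * t) / (w + z * t)

    rho_a = np.abs(z) ** 2 / (omega * MU0)
    phase = np.degrees(np.angle(z))
    return rho_a, phase


def parametric_functional(log_rho, thicknesses, frequencies,
                          obs_log_rho_a, alpha):
    """Data misfit plus alpha times a stabilizer (assumed form).

    The stabilizer here is the squared first difference of the
    log10 resistivities, one common smoothness choice.
    """
    log_rho = np.asarray(log_rho, dtype=float)
    rho_a, _ = mt1d_forward(10.0 ** log_rho, thicknesses, frequencies)
    misfit = np.sum((np.log10(rho_a) - obs_log_rho_a) ** 2)
    stabilizer = np.sum(np.diff(log_rho) ** 2)
    return misfit + alpha * stabilizer


if __name__ == "__main__":
    freqs = np.logspace(-2, 2, 9)
    rho_a, phase = mt1d_forward([100.0, 10.0], [1000.0], freqs)
    for f, r, p in zip(freqs, rho_a, phase):
        print(f"{f:10.3f} Hz  rho_a = {r:8.2f} ohm-m  phase = {p:6.2f} deg")
```

An iterative scheme such as Gauss-Newton would call a forward routine like this at every iteration, which is why forward cost dominates in 3D. For a uniform half-space the loop is skipped and the routine returns the true resistivity with a 45 degree phase, which is a convenient sanity check.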
DL methods have matured over the past few decades (Fukushima, 1980; Hochreiter, 1998) and have more recently gained popularity in different disciplines (Chen et al., 2020; Long et al., 2014; Ronneberger et al., 2015), along with leaps in computer hardware development ideally suited to the underlying DL algorithms. Artificial neural networks (NN), an increasingly popular method for data-driven inference in geoscience disciplines, have been applied successfully in a number of areas. For example, similar to image segmentation in self-driving cars (Paszke et al., 2016) and medical image analysis (Ronneberger et al., 2015), an NN framework has been used to aid in the extraction of clay mineralogy and fracture mapping from scanning electron microscope images with U-net and DeepLab architectures (Chen et al., 2020). For optimization problems in geophysics, previous studies have shown that DL and machine learning can be used to solve inverse problems. Kim and Nakata (2018) and Russell (2019) compared machine learning and geophysical inversion, and showed that machine learning yields results with high spatial resolution. Das et al. (2019) deployed convolutional neural networks (CNN) for seismic impedance inversion and demonstrated their promise for more accurate and faster seismic reservoir characterization in comparison to traditional inversion techniques. Once the network has been trained, the DL inversion procedure is computationally almost instantaneous, which can improve inversion efficiency. To overcome the drawbacks of traditional EM inversion methods, it is worth investigating a DL architecture to retrieve the unknown geo-electric structure of the subsurface. Moghadas (2020), Puzyrev (2019), and Puzyrev and Swidinsky (2021) presented 1D inversion of EM induction data, controlled-source EM data, and time-domain EM (TEM) data using CNNs. 
These studies displayed impressive inversion results using a CNN architecture, suggesting the potential of CNN inversion for more widespread use in EM exploration. Liu et al. (2020) presented a sample-compressed neural network algorithm for MT inversion and an adaptive-clustering analysis algorithm for resistivity boundary demarcation, and Guo et al. (2020) employed a supervised descent method (SDM) for 2D MT data inversion to reduce uncertainty. Conway et al. (2019) applied a non-convolutional neural network model to approximate the 3D MT forward response instead of evaluating the expensive forward functions, speeding up the inversion procedure and producing reasonable resistivity models. In spite of these advances, MT inversion using CNNs has yet to be studied further.
Training a CNN can be challenging because the network may fail to respond optimally to the full range of scales inherent in the training data. In order to reduce the number of datasets required for CNN training and to improve predictive capabilities and flexibility, Long et al. (2014) and Roy et al. (2018) developed fully convolutional networks by re-architecting and fine-tuning classification nets for direct, dense prediction in semantic segmentation. Ronneberger et al. (2015) modified the fully convolutional network by adding more feature channels in the upsampling part, and applied the resulting U-net architecture to medical image analysis. Building on previous studies of CNN architectures for semantic image segmentation, Chen et al. (2017) applied the DeepLabv3+ architecture, which combines an atrous convolution operator with an encoder-decoder network structure and decoder modules, to the segmentation task.
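The atrous (dilated) convolution mentioned above can be illustrated with a short NumPy sketch. It is a one-dimensional, single-channel toy and not the DeepLabv3+ implementation. The function name `atrous_conv1d` and the parameter name `rate` are chosen here for illustration. As in most DL libraries, the operation is written as a cross-correlation without kernel flipping.

```python
import numpy as np


def atrous_conv1d(signal, kernel, rate=1):
    """'Valid' 1D atrous convolution (hypothetical helper).

    The kernel taps are spaced `rate` samples apart, so the same
    number of weights covers a wider span of the input.
    """
    x = np.asarray(signal, dtype=float)
    w = np.asarray(kernel, dtype=float)
    span = (len(w) - 1) * rate + 1       # receptive field in samples
    n_out = len(x) - span + 1
    if n_out <= 0:
        raise ValueError("signal is shorter than the dilated kernel")
    out = np.zeros(n_out)
    # Accumulate one shifted, weighted copy of the input per tap.
    for i, wi in enumerate(w):
        out += wi * x[i * rate: i * rate + n_out]
    return out


if __name__ == "__main__":
    x = np.arange(10.0)
    kernel = [1.0, 0.0, -1.0]
    print(atrous_conv1d(x, kernel, rate=1))  # taps span 3 samples
    print(atrous_conv1d(x, kernel, rate=3))  # taps span 7 samples
```

With three weights, a rate of 1 covers 3 input samples while a rate of 3 covers 7, so the receptive field grows without adding parameters. This is the property that makes atrous convolution attractive for multi-scale problems.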
In this study, we implement a multi-head CNN with residual blocks, which permits the use of measured MT resistivity and phase data and facilitates working in a multi-scale context. We introduce a workflow of dataset generation, model training, and validation, and then assess this CNN inversion method with a study of synthetic and real data from the Athabasca Basin. We also test the sensitivity of our method to data noise and compare the results with those of a traditional inversion method to show the advantages of the proposed CNN inversion method.
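As a rough illustration of what a multi-head network with a residual block can look like, the sketch below runs one forward pass in NumPy with untrained random weights. It is one plausible reading of the idea and not the architecture used in this study, whose details are not given in this section. The two heads (one for log apparent resistivity, one for phase), the layer sizes, and all function and parameter names are assumptions.

```python
import numpy as np


def conv1d_same(x, w):
    """Zero-padded 1D convolution. x: (c_in, n); w: (c_out, c_in, k), k odd."""
    c_out, _, k = w.shape
    n = x.shape[1]
    xp = np.pad(x, ((0, 0), (k // 2, k // 2)))
    out = np.zeros((c_out, n))
    for j in range(k):
        out += w[:, :, j] @ xp[:, j:j + n]
    return out


def relu(x):
    return np.maximum(x, 0.0)


def residual_block(x, w1, w2):
    """Two convolutions plus an identity skip connection."""
    y = conv1d_same(relu(conv1d_same(x, w1)), w2)
    return relu(x + y)


def init_params(n_freq, n_layers, c=4, k=3, seed=0):
    """Random, untrained weights (hypothetical sizes)."""
    rng = np.random.default_rng(seed)
    s = 0.1
    return {
        "head_rho": s * rng.standard_normal((c, 1, k)),
        "head_phase": s * rng.standard_normal((c, 1, k)),
        "res1": s * rng.standard_normal((2 * c, 2 * c, k)),
        "res2": s * rng.standard_normal((2 * c, 2 * c, k)),
        "dense": s * rng.standard_normal((n_layers, 2 * c * n_freq)),
    }


def multi_head_forward(log_rho_a, phase, params):
    """Map two data curves of length n_freq to n_layers model values."""
    # Each head processes one data type with its own filters.
    f_rho = relu(conv1d_same(log_rho_a[None, :], params["head_rho"]))
    f_phs = relu(conv1d_same(phase[None, :], params["head_phase"]))
    # Merge the head outputs along the channel axis.
    feats = np.concatenate([f_rho, f_phs], axis=0)
    feats = residual_block(feats, params["res1"], params["res2"])
    # Dense output layer: one value (e.g. log resistivity) per layer.
    return params["dense"] @ feats.ravel()


if __name__ == "__main__":
    n_freq, n_layers = 16, 5
    params = init_params(n_freq, n_layers)
    log_rho_a = np.full(n_freq, 2.0)        # 100 ohm-m at every frequency
    phase = np.full(n_freq, 45.0) / 90.0    # phase scaled to [0, 1]
    print(multi_head_forward(log_rho_a, phase, params).shape)
```

Separate heads let each data type be filtered with its own weights before the features are merged, and the skip connection in the residual block passes the merged features through unchanged when the convolution weights are zero. In practice such a network would be built and trained in a DL framework rather than in NumPy.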