Study on the Interpretation Method of Low Resistivity Contrast Oil Reservoir Based on Support Vector Machine—Taking the Chang 8 Tight Sandstone Reservoir of Yanchang Formation in Huanxian Area, Ordos Basin as an Example


Low resistivity contrast oil reservoirs are subtle reservoirs that show no obvious difference in physical and electrical properties from water layers, so they are difficult to identify from the characteristics of the geophysical well logging response. The log interpretation of low resistivity contrast oil layers is even less reliable in tight sandstone reservoirs with low porosity and low permeability. In recent years, data mining technology has been increasingly applied in oil exploration and development, especially for complex reservoirs with unclear logging response characteristics, and using it to effectively solve such complex problems is of great significance in oilfields. Therefore, support vector machine (SVM) technology was applied to interpret the low resistivity contrast oil layer in this paper. First, the input logging curves were selected by analyzing the relationship between reservoir fluid types and logging data. Then, an SVM classification model for fluid identification and a support vector regression (SVR) model for reservoir parameter prediction were constructed. Finally, the two models were applied to the logging interpretation of the Chang 8 tight sandstone reservoir of the Yanchang Formation in the Huanxian area, Ordos Basin. The application results show that the fluid recognition accuracy of the SVM classification model is higher than that of the logging cross plot method and the BP neural network method. The permeability and water saturation predicted by the SVR regression model are more accurate than those based on the experimental fitting model, which indicates that it is feasible to carry out logging interpretation and evaluation of the low resistivity contrast oil layer by the SVM method. The research results not only provide an important reference and basis for the review of old wells but also provide technical support for the exploration and development of new strata.

In recent years, data mining technology has been increasingly applied in oil exploration and development, especially for unconventional reservoirs with unclear logging response characteristics, and how to use data mining technology to effectively solve the complex problems that arise in the actual production of oil fields is of great significance (Kadhim et al., 2017; Shahriari et al., 2020). Some classical data mining algorithms, such as the neural network method, support vector machine and fuzzy clustering method, provide a new technology for the identification of low resistivity contrast oil reservoirs (Wang et al., 2015; Liu et al., 2017). Guo et al. (2015) predicted the water saturation at the lower limit of three water models by using the generalized regression neural network (GRNN) and particle swarm optimization support vector machine (PSO-SVM), and the results are in good agreement with the core analysis results in the Sulige tight sandstone reservoir. Chen and Peng (2020) used a BP neural network to learn the mathematical characteristics of the logging curves of low resistivity oil reservoirs, which improved the accuracy of fluid identification and reservoir parameter prediction.
The Huanxian area is located in the Ordos Basin, and the regional geological structure crosses the Tianhuan Depression and the Yishan Slope from west to east (Figure 1a).
The tight sandstone reservoir of the Chang 8 member of the Yanchang Formation developed in the Huanxian area has a large sedimentary thickness, and its oil mainly comes from the overlying Chang 7 high-quality source rock, which gives it great exploration and development potential (Zhao et al., 2013; Chai et al., 2020). With the deepening of oil and gas exploration and development in this area, the problem of logging interpretation and evaluation of low resistivity contrast reservoirs has become increasingly prominent. Figure 1(b) shows the relationship between the resistivity and density logging responses of different pore fluids, established according to the oil test data in the study area. Both the density and the resistivity of the low resistivity contrast oil layers and the water layers are low, so it is difficult to identify and evaluate low resistivity contrast oil layers by using conventional logging data, which seriously restricts the exploration progress and development of oil resources in this area. It is therefore important to develop more effective methods to provide new logging technical support for the exploration and development of low resistivity contrast oil layers in unconventional reservoirs.
Take the sample set $\{(x_i, y_i)\},\ i = 1, 2, \ldots, n$, as the input data, where $x_i$ is the logging data related to the predicted parameters, and $y_i$ is the core analysis data, that is, the target value.
Suppose that in high-dimensional space, the hyperplane that can separate the two types of samples satisfies:
$$w_{ij}^{T}\,\phi(x) + b_{ij} = 0 \qquad (1)$$
where $w_{ij}$ is the weight vector representing high-dimensional unknown coefficients, $b_{ij}$ is a constant term, and $\phi(x)$ is the image of the input $x$ in the high-dimensional space. To use function (1) to distinguish all input data samples without error, every sample must satisfy $y_k \left( w_{ij}^{T}\,\phi(x_k) + b_{ij} \right) \ge 1$, and $\frac{1}{2} w_{ij}^{T} w_{ij}$ should be minimum. In this way, the problem of solving the optimal hyperplane in high-dimensional space is transformed into the minimum value problem of the following convex programming function:
$$\min_{w_{ij},\, b_{ij},\, \xi} \ \frac{1}{2}\, w_{ij}^{T} w_{ij} + C \sum_{k} \xi_k \qquad (2)$$
which satisfies the following constraint condition:
$$y_k \left( w_{ij}^{T}\,\phi(x_k) + b_{ij} \right) \ge 1 - \xi_k, \qquad \xi_k \ge 0$$
where $y_k \in \{-1, +1\}$ is the class label of sample $x_k$; $\xi_k$ is a nonnegative relaxation (slack) variable introduced when the sample data are linearly inseparable; and C is a penalty parameter: the greater its value is, the heavier the penalty for misclassification. The first term in the objective function (2) enlarges the classification interval (margin), which effectively controls the generalization ability of the model. The second term is the training error, which limits the empirical risk.
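The trade-off between the two terms of objective function (2) can be illustrated with a small numerical sketch. It assumes the simplest case, in which the mapping to the high-dimensional space is the identity (a linear SVM), and it evaluates the objective for a given hyperplane instead of solving the optimization. The function name `soft_margin_objective` and the toy samples are hypothetical and are not taken from the study data.

```python
def soft_margin_objective(w, b, samples, labels, C):
    """Evaluate 0.5*||w||^2 + C*sum(slack) for a linear hyperplane (w, b).

    Assumption: linear case, i.e. the feature map is the identity.
    The slack of a sample is the smallest nonnegative value that
    satisfies y * (w.x + b) >= 1 - slack.
    """
    slacks = []
    for x, y in zip(samples, labels):
        functional_margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
        slacks.append(max(0.0, 1.0 - functional_margin))
    margin_term = 0.5 * sum(wi * wi for wi in w)
    return margin_term + C * sum(slacks), slacks


# Hypothetical toy samples: two lie exactly on the margin, one violates it.
samples = [(2.0, 0.0), (0.0, 0.0), (0.5, 0.0)]
labels = [1, -1, 1]
objective, slacks = soft_margin_objective((1.0, 0.0), -1.0, samples, labels, C=10.0)
print(objective, slacks)
```

With a larger C the violating sample dominates the objective, which is the sense in which C penalizes misclassification more heavily.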
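To map the training data set to the high-dimensional space, a kernel function needs to be introduced; that is, the convex programming problem of equation (2) is transformed into a quadratic programming problem. The expected weight vector can be written as
$$w_{ij} = \sum_{k} \alpha_k\, y_k\, \phi(x_k)$$
where $\alpha_k$ are the Lagrange multipliers obtained by solving the quadratic programming problem and $\phi(x_k)$ is the image of sample $x_k$ in the high-dimensional space, so that the inner products $\phi(x_k)^{T}\phi(x)$ can be evaluated through the kernel function.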

The construction of SVM classification model
Fluid identification using SVM is a multiclassification problem, but the SVM method was originally designed for two-class (binary) problems. Therefore, it is necessary to extend the SVM and construct a reasonable multiclassification coding scheme. At present, there are four main methods to construct SVM multiclassifiers: "one against one", "one against rest", "SVM decision tree" and "one-time solution method". When solving practical multiclassification problems, the "one against one" method has a better effect than the other methods (Hsu et al., 2002; Peng and Zhang, 2007). Therefore, this method is selected to construct the SVM multiclassifier in this paper. The basic idea is that, if there are k classes of data, a classifier is constructed for every pair of classes I and J with I < J, so k(k-1)/2 classifiers need to be trained. Each classifier solves a two-class problem for class I and class J, and the voting method is used to combine them: if a classifier judges that a sample belongs to class I, the number of votes of class I is increased by 1; otherwise, the number of votes of class J is increased by 1. The final output is the class with the largest number of votes.
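The voting scheme described above can be sketched in a few lines. This is only an illustration of the "one against one" bookkeeping: the pairwise decision rule `nearest_centre_decide` is a hypothetical stand-in for a trained binary SVM, and the class names and centre values are made up for the example.

```python
from collections import Counter
from itertools import combinations


def one_against_one_predict(sample, classes, pairwise_decide):
    """Combine k(k-1)/2 pairwise decisions by majority voting.

    pairwise_decide(i, j, sample) must return either i or j.
    Ties are resolved in favour of the class listed first in `classes`.
    """
    votes = Counter({c: 0 for c in classes})
    for i, j in combinations(classes, 2):  # every pair, each visited once
        votes[pairwise_decide(i, j, sample)] += 1
    winner = max(classes, key=lambda c: votes[c])
    return winner, dict(votes)


# Hypothetical stand-in for a trained binary SVM: pick the nearer class centre.
centres = {"oil": 3.0, "oil-water": 2.0, "water": 1.0}

def nearest_centre_decide(i, j, sample):
    return i if abs(sample - centres[i]) <= abs(sample - centres[j]) else j


classes = ["oil", "oil-water", "water"]
label, votes = one_against_one_predict(2.8, classes, nearest_centre_decide)
print(label, votes)
```

For three fluid classes, three pairwise classifiers are consulted, which matches k(k-1)/2 with k = 3.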
To build the SVM classification model for fluid identification, we must first determine the input logging data or parameters sensitive to the pore fluid. The logging data in the study area are mainly conventional logging curves; nuclear magnetic resonance logs and array acoustic logs are not widely available across the area. Therefore, according to the characteristics of the logging curves, the fluid identification factors sensitive to fluid type are selected as the input data, including the comprehensive physical property index, QT, Rt, ΔSP, $R_D$ and $R_{wa}$. The comprehensive physical property index is calculated from K, the permeability, and φ, the porosity of the reservoir. QT is the total hydrocarbon logging value; the greater the value, the greater the probability of oil and gas. Rt is the resistivity logging value. The other parameters are defined as follows. ΔSP is the relative amplitude of the spontaneous potential, calculated from the spontaneous potential logging value SP and the spontaneous potential values of pure mudstone, $SP_{shale}$, and pure sandstone, $SP_{sand}$. When the salinity difference of formation water is small, the higher the oil saturation of the reservoir is, the smaller the ΔSP value.
$R_D$ is the resistivity difference parameter, calculated from AT10, AT20, AT30, AT60 and AT90, the resistivity logs with radial depths of investigation of 10 in, 20 in, 30 in, 60 in and 90 in from the wellbore, respectively. The value of $R_D$ is large for the oil layer and small for the water layer.
The apparent formation water resistivity is calculated as
$$R_{wa} = \frac{R_t\, \phi^{m}}{a\, b}$$
where $R_{wa}$ is the apparent formation water resistivity calculated by the Archie formula when the reservoir water saturation is assumed to be 100%, m is the cementation index, and a and b are the lithology coefficients. The input sample set data have different physical meanings, dimensions and orders of magnitude, so it is necessary to normalize the original data before learning and training. The normalization method selected in this paper is the mapminmax function, and its normalization formula is:
$$x^{*} = \frac{2\,(x - x_{\min})}{x_{\max} - x_{\min}} - 1$$
where $x^{*}$ is the normalized data, x is the input data, $x_{\max}$ and $x_{\min}$ are the maximum and minimum values of the input data, and the normalized data range between -1 and 1.
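A minimal sketch of two of the preprocessing steps above may help to make them concrete. It assumes the Archie relation with water saturation set to 100%, which gives $R_{wa} = R_t \phi^{m} / (a b)$, and it re-implements the min-max scaling to the range [-1, 1] in plain Python instead of calling the MATLAB mapminmax function. The default values m = 2 and a = b = 1, as well as the sample readings, are placeholders and not the parameters of the study area.

```python
def apparent_water_resistivity(rt, porosity, m=2.0, a=1.0, b=1.0):
    """Archie formula solved for water resistivity with Sw = 100%.

    rt: resistivity log value (ohm.m); porosity: fraction (0..1).
    m, a, b are placeholder defaults, not calibrated values.
    """
    return rt * porosity ** m / (a * b)


def min_max_scale(values, lo=-1.0, hi=1.0):
    """Linearly map values so that min -> lo and max -> hi."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        raise ValueError("constant input cannot be scaled")
    return [(hi - lo) * (v - vmin) / (vmax - vmin) + lo for v in values]


# Hypothetical readings, for illustration only.
rwa = apparent_water_resistivity(rt=20.0, porosity=0.1)
scaled = min_max_scale([10.0, 20.0, 30.0])
print(rwa, scaled)
```

Each input curve would be scaled separately, so that curves with large numerical ranges, such as resistivity, do not dominate curves with small ranges.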
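The libsvm toolbox in MATLAB software is used for SVM classification model learning and training, and the radial basis function is selected as the kernel function, that is, $K(x_i, x_j) = \exp\left(-\gamma \lVert x_i - x_j \rVert^{2}\right)$, where $\gamma > 0$ is the kernel parameter.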
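As an illustration of how a radial basis function kernel enters the classifier, the sketch below evaluates the kernel in the form $\exp(-\gamma \lVert x - z \rVert^{2})$ and the resulting decision value $\sum_k c_k K(x_k, x) + b$. It is written in plain Python and does not call libsvm; the support vectors, coefficients, bias and gamma are made-up numbers, not a trained model.

```python
import math


def rbf_kernel(x, z, gamma):
    """Radial basis function kernel exp(-gamma * ||x - z||^2)."""
    squared_distance = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * squared_distance)


def decision_value(x, support_vectors, coefficients, bias, gamma):
    """Kernel expansion sum_k c_k * K(sv_k, x) + bias.

    Each coefficient c_k stands for (multiplier * class label) of a
    support vector; the sign of the result gives the predicted class.
    """
    return sum(
        c * rbf_kernel(sv, x, gamma)
        for sv, c in zip(support_vectors, coefficients)
    ) + bias


# Hypothetical two-support-vector model, for illustration only.
support_vectors = [(1.0, 1.0), (-1.0, -1.0)]
coefficients = [1.0, -1.0]
value = decision_value((0.9, 1.2), support_vectors, coefficients, bias=0.0, gamma=0.5)
print(value)
```

A sample close to the first support vector yields a positive decision value and a sample close to the second yields a negative one; gamma controls how quickly the influence of a support vector decays with distance.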

SVR regression model
The permeability and saturation of unconventional reservoirs are seriously affected by pore structure, and it is difficult to obtain these two parameters based on conventional logging curves. Therefore, the support vector regression (SVR) method is considered to construct the prediction model of reservoir permeability and saturation.
Building a reservoir parameter prediction model with SVR follows the same basic process as the SVM classification model: first, the input dataset with the highest correlation to the prediction target value is selected. The relationships between permeability, saturation and the logging curves are very complex. To determine the appropriate input training set, different combinations of logging data were used as the input training data, and the optimal input data set was selected by comparing the errors of the resulting prediction models. The combinations of input logging data sets are shown in Table 2, which illustrates that adding the porosity data calculated by conventional methods improves the accuracy, but not markedly, which also shows that the reservoir saturation is mainly related to the electrical and comprehensive physical properties of the reservoir. Therefore, combination 4 is finally selected as the optimal input training data set for the SVR permeability model, and combination 5 for the SVR saturation model. Permeability and water saturation were then calculated by the SVR regression model and by the conventional method, respectively, for comparison.
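The selection of the input combination by comparing prediction errors can be sketched as a simple loop over candidate column sets. To keep the example self-contained, a nearest-neighbour regressor is used as a stand-in for the SVR model (it is not SVR), the error measure is assumed to be the root mean square error on held-out samples, and the candidate names and toy numbers are hypothetical.

```python
import math


def rmse(predicted, observed):
    """Root mean square error between two equally long sequences."""
    return math.sqrt(
        sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)
    )


def nearest_neighbour_predict(train_x, train_y, query):
    """Stand-in regressor (NOT SVR): target of the closest training sample."""
    closest = min(
        range(len(train_x)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(train_x[k], query)),
    )
    return train_y[closest]


def select_combination(candidates, train_rows, train_y, valid_rows, valid_y,
                       predict=nearest_neighbour_predict):
    """Return the candidate column set with the lowest validation RMSE."""
    errors = {}
    for name, columns in candidates.items():
        train_x = [[row[c] for c in columns] for row in train_rows]
        valid_x = [[row[c] for c in columns] for row in valid_rows]
        predictions = [predict(train_x, train_y, q) for q in valid_x]
        errors[name] = rmse(predictions, valid_y)
    return min(errors, key=errors.get), errors


# Hypothetical toy data: column 0 is informative, column 1 is noise.
train_rows = [[1.0, 5.0], [2.0, 1.0], [3.0, 9.0]]
train_y = [10.0, 20.0, 30.0]
valid_rows = [[1.1, 9.0], [2.9, 5.0]]
valid_y = [10.0, 30.0]
candidates = {"A": [0], "B": [1], "C": [0, 1]}
best, errors = select_combination(candidates, train_rows, train_y, valid_rows, valid_y)
print(best, errors)
```

The `predict` argument is the place where a trained SVR model would be plugged in; the surrounding loop stays the same.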

Conclusions
(1) There is no obvious difference in physical and electrical properties between the low resistivity contrast oil layer and water layer in the tight sandstone reservoir of the Chang 8 member in the Huanxian area, Ordos Basin. It is difficult to effectively identify and evaluate low resistivity contrast oil layers by using conventional logging data, which seriously restricts the exploration progress and development benefits of oil and gas resources in this area.
(2) This study analyzed the relationship between the logging response and pore fluid to optimize the input training dataset. The SVM learning method was used to construct the SVM classification model and SVR regression model for fluid identification and reservoir parameter prediction.
(3) The application results show that the SVM classification model has higher fluid identification accuracy than the BP neural network method and conventional fluid recognition method (cross plot of porosity and resistivity log). The reservoir permeability and saturation predicted by the SVR regression model are more consistent with the core analysis results, which proves that the SVM method is effective and feasible for low resistivity contrast oil reservoir interpretation.