Insights into the Prediction of Condensate Viscosity Near the Wellbore by ELM and ANFIS-PSO Strategies

Abstract: When production lowers the pressure below the dew point in gas condensate (GC) reservoirs, liquid droplets form in the near-wellbore zone. Accurate production prediction and optimization in these reservoirs require specific properties such as liquid viscosity. Empirical models have already been developed to predict this parameter, but because of the peculiar behavior of fluids below the dew point pressure (DPP), their viscosity predictions carry considerable error. With the development of machine learning (ML) approaches, studies on fluid properties, as in other sciences, have entered a new phase. In this study, the extreme learning machine (ELM) and the adaptive neuro-fuzzy inference system with particle swarm optimization (ANFIS-PSO) were applied to this end. A large data bank of reservoir and fluid properties was used, including reservoir temperature and pressure, specific gravity (SG) of gas, API gravity, and gas to oil ratio (Rs). The results showed that R-squared and RMSE are 0.755 and 0.15 for ANFIS-PSO and 0.889 and 0.06 for ELM, which shows that the ELM performs better in estimating the output values. In addition, the range of reliable data was determined, and a sensitivity analysis showed that the SG of gas has the greatest impact on viscosity and API gravity the least. This model can be used as a reference for calculating condensate viscosity and, by expanding the range of the datasets, can be applied in commercial software.


Introduction:
As the pressure of a GC reservoir drops below the DPP, liquid droplets condense out of the gas and create a two-phase flow in the near-wellbore zone. The accumulation of liquid in this area grows over time and is generally greatest in reservoirs with rich condensate content. This effect is referred to as "liquid banking" and may lead to a drastic decrease in productivity [1]. To understand such complicated behavior and to predict and optimize future production, it is necessary to determine the viscosity of the condensate liquid at pressures below the DPP [2,3]. An erroneous value of this parameter leads to misjudging the behavior of these reservoirs. It has been shown that even a small percentage error in liquid viscosity produces an error of the same magnitude in the calculated cumulative reservoir production, which is significant [4,5,6].
Experimental measurement of liquid viscosity in these reservoirs is difficult because samples are hard to obtain, high-pressure high-temperature (HPHT) conditions are hard to establish in laboratory equipment, the measuring cell volume is small, and laboratory tests are time consuming and expensive. This encourages the use of theory-based correlations [4,5,7]. Based on their input parameters, these correlations fall into two groups. The first contains semi-empirical models that use reservoir fluid properties such as acentric factor, critical temperature, pour point temperature, fluid composition, boiling point, and molecular weight. The second contains correlations that take SG, reservoir pressure and temperature, and Rs as inputs [8,9]. These correlations were developed for three states of oil: dead, saturated, and undersaturated. In the near-wellbore zone, the viscosity of liquid condensate is as low as 0.1 to 1 cP [5,10]. The gas to oil ratio (GOR) in condensate reservoirs ranges from 3000 to 150000 scf/STB, and the API gravity ranges from 40 to 60 [11,12]. These conditions limit the applicability of the correlations.
Condensate viscosity can be estimated from fluid composition with the correlation of Lohrenz et al. [13]. This model is one of the most popular for predicting viscosity and is used in some commercial simulators. Also known as the LBC model, it was originally designed to determine the viscosity of heavy gas mixtures, building on the study of Jossi and his colleagues [14]. The LBC model predicts gas-phase viscosity well enough in condensate reservoirs, but its prediction for the liquid phase is very poor [6]. It is therefore essential to tune the LBC model by adjusting its coefficients to fit laboratory data. This approach is chosen because the model accounts for compositional variations through the reduced density, which is a feature of GC reservoirs below the DPP [15,16].
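For reference, the LBC correlation is usually written as a fourth-degree polynomial in reduced density. The form below uses the coefficients generally quoted for the original correlation. The symbols are not defined in this paper and are introduced here only for illustration: $\mu^{*}$ is the low-pressure mixture viscosity, $\xi$ the mixture viscosity-reducing parameter built from pseudo-critical properties, and $\rho_r$ the reduced density.

```latex
\left[(\mu - \mu^{*})\,\xi + 10^{-4}\right]^{1/4}
  = a_1 + a_2\rho_r + a_3\rho_r^{2} + a_4\rho_r^{3} + a_5\rho_r^{4},
\qquad
\rho_r = \frac{\rho}{\rho_c},
\qquad
\xi = \frac{T_c^{1/6}}{M^{1/2}\,P_c^{2/3}}

a_1 = 0.1023,\quad a_2 = 0.023364,\quad a_3 = 0.058533,\quad
a_4 = -0.040758,\quad a_5 = 0.0093324
```

Tuning the model to laboratory data, as described above, amounts to regressing $a_1,\ldots,a_5$ (and often the critical volumes that enter $\rho_c$) against measured viscosities. Because the right-hand side is raised to the fourth power, small errors in $\rho_r$ are amplified in the liquid phase, where $\rho_r$ is large.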
Correlations for the viscosity of gas-saturated (live) oil have also been suggested. Yang et al. (2007) developed one such correlation, which predicts fluid viscosity as a function of Rs, reservoir temperature, reservoir pressure, and the SG of the liquid and of the gas [6]. Other proposed correlations each cover a specific range of Rs, temperature, SG, and liquid viscosity [17,18,19,20,21]. All of these correlations are derived from crude oil, whose composition differs from that of condensate liquid. They are also a direct function of the crude oil viscosity, and because crude oil components strongly affect its viscosity, the values predicted by these correlations are very unreliable [12,22]. Because condensate viscosity changes significantly, empirical and semi-empirical correlations cannot fully represent the change in viscosity with pressure in GC reservoirs [4,12,15].
In recent years, ML methods have entered the oil industry because of their growing capability in solving complicated problems [7,23,24,25]. In GC reservoirs, these techniques have been used to estimate the condensate to gas ratio, DPP, and gas compressibility factor [26,27,28,29,30]. The main use of ML methods in GC reservoirs has been to calculate the DPP: Jalali and his colleagues [31], Ahmadi and Ebadi [26], Majidi et al. [32], and Zhong et al. [33] used ML methods to estimate this parameter. Zendehboudi et al. [29] used a PSO-ANN method to build a model for predicting the condensate to gas ratio in GC reservoirs. Ghiasi et al. [28] used an LSSVM method to predict the gas compressibility factor in these reservoirs. Recently, Mahdaviara et al. [34] employed ML methods to create reliable models for predicting the relative permeability of GC reservoirs.
Investigating and predicting condensate viscosity is of great importance, yet it has received very little attention. This study models the viscosity of condensate liquid with the ELM and ANFIS-PSO methods and compares the results with LSSVM and ANN models. For this purpose, over 280 experimental data points were used. These data points include pressure, API gravity, temperature, SG of gas, Rs, and viscosity. To evaluate the efficiency of the models, several statistical parameters were calculated. A sensitivity analysis and an investigation of suspicious data were also performed.

Data gathering
Generating an accurate model for predicting liquid condensate viscosity is very difficult when sufficient and detailed data are not available for the GC reservoir. The current study therefore introduces a new model for predicting liquid viscosity in GC reservoirs based on a set of experimental measurements. The model takes into account the factors that influence liquid condensate viscosity. To train and verify the proposed model, 283 experimental data points were used. The variables considered were reservoir temperature, reservoir pressure, API gravity, SG of gas, GOR, and the liquid viscosity of the GC reservoir. 75% of the data were used for the training phase and the remaining 25% for testing the model. The target variable was the condensate liquid viscosity, and the other parameters listed above were the inputs. PVT reports of GC fluids and experimental analyses of liquid condensate form the basis of the database.
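The 75/25 partition described above can be sketched as follows. This is a minimal illustration, assuming a simple random shuffle with a fixed seed. The paper does not state how the split was drawn, and the function name `split_train_test` and the placeholder records are hypothetical.

```python
import random


def split_train_test(records, train_fraction=0.75, seed=0):
    """Shuffle a copy of the records and split it into training and testing subsets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = int(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder indices standing in for the 283 experimental records.
records = list(range(283))
train, test = split_train_test(records)
print(len(train), len(test))  # 212 71
```

With 283 records, truncating 0.75 x 283 gives 212 training points and leaves 71 for testing. A different rounding rule would shift one point between the subsets.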
Different methods have been employed to measure the viscosity of the liquid phase, including rolling ball, capillary, and electromagnetic pulse technology viscometers. The data were collected from several articles [1,2,3,4,6,35,36,37,38,39,40,41,42] and are included in the Supplementary data of this paper.

ELM
Huang developed a computational training approach based on a single-layer feedforward neural network (SLFFNN) [43,44]. This approach is referred to as ELM. One of the main problems with training an SLFFNN by gradient-based algorithms is the slow training rate, which results from iteratively adjusting the parameters of the network; the ELM avoids this. In this method, $N$ training samples are assumed in the form $(x_i, t_i) \in \mathbb{R}^n \times \mathbb{R}^m$, where $n$ and $m$ are the dimensions of the input and target vectors. The method selects the hidden neurons randomly, while the output weights of the SLFFNN are set analytically using the Moore-Penrose generalized inverse of the hidden layer output matrix. For $L$ hidden neurons, the network output is:
$$\sum_{i=1}^{L} \beta_i \, g(w_i \cdot x_j + b_i) = t_j, \qquad j = 1, \ldots, N$$
In this equation, $g$ signifies the activation function, while $\beta_i = [\beta_{i1}, \ldots, \beta_{im}]^T$ and $w_i = [w_{i1}, \ldots, w_{in}]^T$ denote the output and input weight vectors, respectively. Moreover, the bias is denoted by $b_i$. In compact form:
$$H\beta = T$$
in which the hidden layer output is denoted by $H(x) = [h_1(x), \ldots, h_L(x)]$. The initial step in this algorithm involves randomly determining the input weights and hidden layer biases.
Afterward, using the input variables, it becomes possible to determine the hidden layer matrix.
Next, training the SLFFNN reduces to a least squares problem. The cost function of the ELM algorithm can be expressed as follows [45,46,47]:
$$\min_{\beta} \lVert H\beta - T \rVert^2, \qquad \hat{\beta} = H^{\dagger} T$$
where $H^{\dagger}$ is the Moore-Penrose generalized inverse of $H$.
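The three ELM steps described above (random hidden parameters, hidden layer matrix, least squares output weights) can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' implementation: the activation is assumed to be a sigmoid, the output weights come from the normal equations with a very small ridge term added for numerical stability (equivalent to the Moore-Penrose solution when $H$ has full column rank and the ridge tends to zero), and the data are a synthetic curve rather than the viscosity data bank. All function names are hypothetical.

```python
import math
import random


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def hidden_layer(X, W, b):
    """Row j is h(x_j) = [g(w_1 . x_j + b_1), ..., g(w_L . x_j + b_L)], sigmoid g."""
    return [
        [1.0 / (1.0 + math.exp(-(sum(wk * xk for wk, xk in zip(w, x)) + bi)))
         for w, bi in zip(W, b)]
        for x in X
    ]


def train_elm(X, T, n_hidden, seed=0, ridge=1e-10):
    rng = random.Random(seed)
    n_in = len(X[0])
    # Step 1: input weights and hidden biases are drawn at random and never trained.
    W = [[rng.uniform(-2.0, 2.0) for _ in range(n_in)] for _ in range(n_hidden)]
    b = [rng.uniform(-2.0, 2.0) for _ in range(n_hidden)]
    # Step 2: hidden layer output matrix H.
    H = hidden_layer(X, W, b)
    # Step 3: output weights from (H^T H + ridge * I) beta = H^T T (ridge is an assumption).
    HtH = [[sum(h[i] * h[j] for h in H) + (ridge if i == j else 0.0)
            for j in range(n_hidden)] for i in range(n_hidden)]
    HtT = [sum(h[i] * t for h, t in zip(H, T)) for i in range(n_hidden)]
    beta = solve_linear(HtH, HtT)
    return W, b, beta


def predict_elm(X, W, b, beta):
    return [sum(hk * bk for hk, bk in zip(h, beta)) for h in hidden_layer(X, W, b)]


# Toy demonstration on a synthetic curve (not the viscosity data bank).
X = [[3.0 * k / 39] for k in range(40)]
T = [math.sin(x[0]) for x in X]
W, b, beta = train_elm(X, T, n_hidden=10)
pred = predict_elm(X, W, b, beta)
rmse = math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, T)) / len(T))
print(f"training RMSE = {rmse:.4f}")
```

Because only `beta` is fitted and the fit is a single linear solve, training cost is dominated by forming and solving an $L \times L$ system, which is the source of the speed advantage over iterative gradient-based training.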

ANFIS-PSO
Reference [49] briefly describes the ANFIS-PSO hybrid model. The PSO algorithm was used to identify the optimized parameters of the Gaussian membership functions (GMF) of the presented ANFIS model. The ANFIS approach was developed by Jang [50,51] as an adaptable and highly intelligent hybrid system. It may be considered the result of fully integrating a fuzzy system (FS) with neural computing [52].
ANFIS combines fuzzy logic and neural networks to take advantage of the strengths of both. To establish the basics of the FS, ANFIS applies back-propagation to the collected data. The framework is based on a set of fuzzy if-then rules that can learn to approximate nonlinear functions. The fundamental principles of ANFIS are similar to those of the FS proposed by Takagi, Sugeno, and Kang [53,54]. The back-propagation learning of ANFIS is based on computing the derivatives of the squared errors backward, from the output nodes to the input nodes. Using this feature, ANFIS applies a robust hybrid learning approach that combines gradient descent with the least squares method: the least squares method identifies the consequent parameters in the forward pass.
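To make the rule-based structure concrete, the sketch below evaluates a first-order Sugeno system with two inputs and two if-then rules, layer by layer. It assumes Gaussian membership functions and a product T-norm for the rule firing strength. The parameter values are arbitrary illustrations, not fitted values from this study, and the function names are hypothetical.

```python
import math


def gaussian_mf(value, center, sigma):
    """Gaussian membership grade of `value` in a fuzzy set (center, sigma)."""
    return math.exp(-0.5 * ((value - center) / sigma) ** 2)


def anfis_forward(x, y, premise, consequent):
    """Forward pass of a first-order Sugeno system.

    premise:    one ((center_A, sigma_A), (center_B, sigma_B)) pair per rule
    consequent: one (p, q, r) triple per rule, giving f = p*x + q*y + r
    """
    # Layers 1 and 2: membership grades, multiplied to give each rule's firing strength.
    w = [gaussian_mf(x, cA, sA) * gaussian_mf(y, cB, sB)
         for (cA, sA), (cB, sB) in premise]
    # Layer 3: normalized firing strengths.
    total = sum(w)
    w_bar = [wi / total for wi in w]
    # Layer 4: each rule's linear consequent, weighted by its normalized strength.
    rule_out = [wb * (p * x + q * y + r)
                for wb, (p, q, r) in zip(w_bar, consequent)]
    # Layer 5: overall output is the sum of the weighted rule outputs.
    return sum(rule_out), w_bar


# Two illustrative rules; the point (1, 1) lies midway between their centers.
premise = [((0.0, 1.0), (0.0, 1.0)), ((2.0, 1.0), (2.0, 1.0))]
consequent = [(1.0, 1.0, 0.0), (0.0, 0.0, 5.0)]
output, w_bar = anfis_forward(1.0, 1.0, premise, consequent)
print(output, w_bar)
```

At the midpoint both rules fire equally, so the output is the plain average of the two consequents (2 and 5). In hybrid learning, the `(p, q, r)` triples are the quantities found by least squares, and the `(center, sigma)` pairs are the ones adjusted by the gradient step or, in ANFIS-PSO, by the swarm.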
Afterward, gradient descent updates the premise parameters in the backward pass [55]. The adaptive network consists of five distinct layers, which are presented in Figure 2. Figure 2 also depicts the nodes of these layers and their connections, assuming two inputs to the fuzzy inference system, denoted by $x$ and $y$, and a single output denoted by $f$. To illustrate the ANFIS configuration, consider two fuzzy if-then rules that follow the Sugeno fuzzy inference system (FIS):
Rule 1: if $x$ is $A_1$ and $y$ is $B_1$, then $f_1 = p_1 x + q_1 y + r_1$
Rule 2: if $x$ is $A_2$ and $y$ is $B_2$, then $f_2 = p_2 x + q_2 y + r_2$
Figure 2. ANFIS method structure [56]
The first layer in this structure is the fuzzification layer, which generates the membership grades of the individual variables. The node functions of this layer can be expressed as:
$$O_{1,i} = \mu_{A_i}(x), \quad i = 1, 2; \qquad O_{1,i} = \mu_{B_{i-2}}(y), \quad i = 3, 4$$
The fuzzy sets are denoted by $(A_i, B_i)$, while the value obtained from the $i$-th node of the first layer is denoted by $O_{1,i}$. The nodes of the second layer multiply the incoming signals to give the firing strength of each rule, $w_i = \mu_{A_i}(x)\,\mu_{B_i}(y)$, while the nodes in the third layer calculate the following parameter:
$$\bar{w}_i = \frac{w_i}{w_1 + w_2}, \qquad i = 1, 2$$
In this equation, $w_i$ denotes the firing strength of rule $i$, and $\bar{w}_i$ is its normalized firing strength. The results from the fourth layer can be expressed as:
$$O_{4,i} = \bar{w}_i f_i = \bar{w}_i (p_i x + q_i y + r_i)$$
In this notation, $p_i$, $q_i$, and $r_i$ are referred to as consequent parameters. Finally, the node in the fifth layer computes the overall output, which can be expressed as follows:
$$O_5 = \sum_i \bar{w}_i f_i = \frac{\sum_i w_i f_i}{\sum_i w_i}$$
ANFIS has provided promising results in many prediction modeling applications [57,58,59,60]. Nonetheless, the quality and accuracy of the modeling can still be significantly improved by optimizing the parameters of the model [49]. Therefore, many optimization methods, including PSO, have been proposed to improve the parameters and solutions of the ANFIS system [61].
In comparison to other optimization methods, the PSO method provides remarkable results, which is why the current study employs it. PSO is based on the behavior of birds searching for food [62,63]. In this algorithm, each particle uses its own information as well as the information of the other particles to update its position and trajectory; the particles are therefore said to have a memory function. Competition and collaboration among the particles form the basis of the optimization process. When PSO is applied to an optimization problem, the state of each particle is determined by its position and velocity. Three vectors describe a particle: $x_i$ (the current position of the particle), $v_i$ (its current velocity), and $p_i$ (the best position found so far by the particle), while $p_g$ signifies the best solution found by the entire swarm. The following formula is used to update the position of the individual particles at each iteration:
$$x_i(t+1) = x_i(t) + v_i(t+1)$$
In this formula, $v_i(t)$ and $v_i(t+1)$ signify the velocity of the particle in the $t$-th and $(t+1)$-th iterations, and $x_i(t)$ signifies the position of the particle. The velocity update is expressed in Formula (12), in which $c_1$ and $c_2$ are learning constants with $c_1, c_2 > 0$ and $rand()$ is a random number in the span of $[0, 1]$. This formula is based on the previous velocity of the particle as well as the personal and global best positions [64].
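The update rules above can be sketched as a small optimizer. The velocity rule used here is the common inertia-weight form, v(t+1) = w v(t) + c1 rand() (p_i - x_i(t)) + c2 rand() (p_g - x_i(t)). The inertia weight `w` and the numerical values of `w`, `c1`, and `c2` are assumptions, since the text only states that c1, c2 > 0. The objective is a toy sphere function, whereas in ANFIS-PSO it would be the model RMSE as a function of the membership function parameters. The paper reports a swarm size of 80 and 1000 iterations; the demonstration uses smaller values to keep it fast.

```python
import random


def pso_minimize(objective, dim, n_particles=20, n_iter=200,
                 w=0.729, c1=1.49445, c2=1.49445,
                 bounds=(-5.0, 5.0), seed=0):
    """Minimize `objective` over `dim` variables with a basic particle swarm."""
    rng = random.Random(seed)
    lo, hi = bounds
    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    p_best = [xi[:] for xi in x]                 # personal best positions p_i
    p_best_val = [objective(xi) for xi in x]
    g = min(range(n_particles), key=lambda i: p_best_val[i])
    g_best, g_best_val = p_best[g][:], p_best_val[g]   # global best p_g

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: previous velocity plus pulls toward personal and global bests.
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (p_best[i][d] - x[i][d])
                           + c2 * r2 * (g_best[d] - x[i][d]))
                # Position: x(t+1) = x(t) + v(t+1).
                x[i][d] += v[i][d]
            val = objective(x[i])
            if val < p_best_val[i]:
                p_best[i], p_best_val[i] = x[i][:], val
                if val < g_best_val:
                    g_best, g_best_val = x[i][:], val
    return g_best, g_best_val


def sphere(z):
    """Toy objective with its minimum of 0 at the origin."""
    return sum(c * c for c in z)


best_position, best_value = pso_minimize(sphere, dim=2)
print(best_position, best_value)
```

The memory function mentioned in the text corresponds to `p_best` (each particle's own record) and `g_best` (the record shared by the swarm).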

Results and Discussion:
As mentioned before, the liquid viscosity in GC reservoirs is a function of fluid properties and reservoir conditions. The objective of this work is to model viscosity in terms of these variables. For this purpose, the ELM and ANFIS-PSO approaches, explained in the previous section, were used. More details of these models are given in Table 1.

Swarm size (Table 1): 80
The performance and accuracy of the models are evaluated with a series of statistical parameters.
The parameters used to compare the ELM and ANFIS-PSO models include R², MRE, MSE, and RMSE. Their values for both models are reported in Table 2. An overview of this table shows that the ELM is more accurate than the ANFIS-PSO: R² is 0.755 for ANFIS-PSO and 0.889 for ELM, and the errors of the ELM model are far lower than those of the ANFIS-PSO. According to the results of Faraji et al. [65], the coefficient of determination and other statistical characteristics of ML methods were superior to those of all the investigated correlations, so these methods are more accurate than the earlier correlations. R² is 0.889 and 0.755 for the ELM and ANFIS-PSO models (our work), respectively, while it was 0.774 and 0.842 for the LSSVM and ANN models. Likewise, RMSE is 0.06 and 0.15 for ELM and ANFIS-PSO, and it was 0.12 and 0.11 for the LSSVM and ANN models. It can therefore be concluded that the ELM outperforms the models of Faraji et al., while the ANFIS-PSO model is not superior to them.
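The statistical parameters named above can be computed as in the following sketch. The definitions are the usual ones and are assumptions here, because the paper's own formulas are not reproduced in this text. In particular, MRE is taken as the mean absolute relative error in percent. The sample values are invented for illustration and are not taken from the data bank.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, MSE, RMSE and MRE (percent) for paired measured/predicted values."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))   # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)           # total sum of squares
    mse = ss_res / n
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        # Assumed definition: mean absolute relative error, in percent.
        "MRE": 100.0 / n * sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)),
    }


# Invented viscosities in cP, within the 0.1 to 1 cP range quoted for condensate.
measured = [0.2, 0.4, 0.6, 0.8]
predicted = [0.25, 0.35, 0.65, 0.75]
metrics = regression_metrics(measured, predicted)
print(metrics)
```

For these four points every prediction is off by 0.05 cP, so RMSE is 0.05 and R² is 0.95. MRE is larger for the low-viscosity points, which is why relative and absolute errors are usually reported together for a target that spans an order of magnitude.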
The membership function (MF) parameters are plotted in Figure 3. Figure 4 shows that 1000 iterations are required to minimize the RMSE in the ANFIS-PSO method.

The estimated liquid viscosity matches its experimental values well in both models. To evaluate the accuracy of the two strategies, the overlap between the experimental data and the predicted viscosity values can be examined. Figure 5 shows the estimated viscosity versus laboratory data. According to this plot, the points cluster more tightly along the line of equality for the ELM than for the ANFIS-PSO, which indicates the better performance of the ELM model. The accuracy of the experimental data is important for building a reliable model. It is therefore necessary to apply a method that determines whether the data are acceptable for model building.
One approach with this capability is the Leverage method, which is formulated through the hat matrix $H$ below [66,67]:
$$H = U (U^T U)^{-1} U^T$$
In this definition, $U$ is a matrix whose dimensions, $i$ and $j$, are the number of model parameters and the number of training data points, respectively (one row per data point).
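The diagonal of the hat matrix gives one leverage value per data point, and it can be computed without forming the full matrix. The sketch below is illustrative only: the small matrix `U` is invented, the helper names are hypothetical, and the warning threshold 3(p + 1)/n used in `critical_leverage` is the convention commonly applied with this method, assumed here rather than taken from the paper.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def leverages(U):
    """Diagonal of H = U (U^T U)^-1 U^T, where each row of U is one data point."""
    k = len(U[0])
    UtU = [[sum(row[a] * row[c] for row in U) for c in range(k)] for a in range(k)]
    h = []
    for row in U:
        z = solve_linear(UtU, row)                  # z = (U^T U)^-1 u_i
        h.append(sum(r * zi for r, zi in zip(row, z)))  # h_ii = u_i^T z
    return h


def critical_leverage(n_params, n_points):
    """Assumed convention for the warning leverage: 3 (p + 1) / n."""
    return 3.0 * (n_params + 1) / n_points


# Invented example: a constant column plus one input variable, four data points.
U = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
h = leverages(U)
print(h)
# Five inputs and 212 training points (about 75% of 283) are assumptions.
print(critical_leverage(5, 212))
```

The leverages sum to the number of columns of `U`, and the points at the edges of the input range have the largest values, which is what makes the hat diagonal useful for spotting data that lie outside the region where the model is well supported.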
The critical leverage limit is another significant parameter in this approach; it is conventionally defined as:
$$H^* = \frac{3(i + 1)}{j}$$
with $i$ the number of model parameters and $j$ the number of training data points. Standardized residuals, with limits set to +3 and -3, are used to distinguish the acceptable data. Figure 7 shows the distribution of acceptable and suspected data. As shown in this figure, for both strategies the number of suspected data points is low and the majority of the data are reliable, which indicates that both models are valid.
In Eq. (21), which is used for the sensitivity analysis, X and Y represent the input and output of the model. A positive sensitivity parameter for an input indicates a direct relationship with the output, and a negative one indicates an inverse effect. The larger the absolute value of the sensitivity parameter, the stronger the effect of the input. According to Figure 8, pressure, GOR, and API gravity have a positive relationship with viscosity, while the SG of gas and reservoir temperature have a reducing effect.
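Sensitivity analyses of this kind are commonly based on a Pearson-type relevancy factor between each input X and the output Y. Assuming that is the form of Eq. (21), which is not reproduced in this text, the sign and magnitude described above can be obtained as in this sketch. The function name and the sample values are hypothetical.

```python
import math


def relevancy_factor(x, y):
    """Pearson-type relevancy factor between one input x and the output y, in [-1, 1]."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Invented values: the output falls as the first input rises and grows with the second.
inverse_input = [1.0, 2.0, 3.0, 4.0]
direct_input = [10.0, 20.0, 30.0, 45.0]
output = [8.0, 6.0, 4.0, 2.0]

r_inverse = relevancy_factor(inverse_input, output)
r_direct = relevancy_factor(direct_input, [-v for v in output])
print(r_inverse, r_direct)
```

A value near -1 or +1 marks a strong inverse or direct influence and a value near 0 a weak one, so ranking the inputs by the absolute value of this factor reproduces the kind of ordering reported for SG of gas and API gravity.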

Conclusion
Accurate estimation of properties such as liquid viscosity in GC reservoirs plays a significant role in predicting production and in other reservoir and well studies. In this study, this property was modeled using the ELM and ANFIS-PSO methods. The models were built from a large data bank of liquid viscosity together with reservoir temperature, reservoir pressure, API gravity, SG of gas, and Rs. According to the results, both models are reliable, but the ELM performed better, as its R², MRE, MSE, and RMSE values are more favorable than those of the ANFIS-PSO. Both models can stand in for experimental measurement of liquid condensate viscosity, which is very time consuming and costly, and can also be used in commercial software to facilitate the computational process.

Figure 1 depicts the general structure of the ELM algorithm. Moreover, the ELM utilizes the

Figure 3. Membership function parameters in the developed ANFIS model

Figure 4. Number of iterations versus RMSE


Table 1. Detailed information on the developed ANFIS and ELM models

Table 2. Statistical analyses for models.
The models prepared in this study can be compared with the models developed by Faraji et al. [65]. Faraji et al. (2019) modeled condensate liquid viscosity using ML methods, applying LSSVM and ANN methods to forecast condensate liquid viscosity.