## 1.1 Generative Adversarial Network

A generative adversarial network (GAN) consists of a generator model (G) and a discriminator model (D). The generator's role is to capture the data distribution and generate new data. The discriminator's role is to determine whether a given sample is real data or data produced by the generator. The basic structure is shown in Fig. 1:

A data vector \(z\sim p(z)\) is fed into the generator network G, which produces new data G(z). The input to the discriminator network D is either a real data sample or a generated sample G(z). The discriminator is trained to judge whether its input comes from the real data or from samples produced by the generator. The generator is then trained against the already-trained discriminator to generate data that more closely matches the real data distribution, so as to deceive the discriminator. The two models compete with each other and are trained alternately until they reach an optimal equilibrium: at this point the generator produces data closest to the real data, and the discriminator can no longer distinguish real data from generated data. The GAN is trained with a loss function of the following form:

$${\xi _{GAN}}(G,D)={{\rm E}_{x\sim {p_{data}}}}[\log D(x)]+{{\rm E}_{z\sim p(z)}}[\log (1 - D(G(z)))] \tag{1}$$
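As a concrete illustration of Eq. (1), the expectations can be estimated by sample means over the discriminator's outputs. The helper below is a hypothetical numpy sketch (the function name and signature are not from the paper):

```python
import numpy as np

def gan_loss(d_real, d_fake):
    """Sample-mean estimate of the GAN objective in Eq. (1).

    d_real: D(x) evaluated on real samples, probabilities in (0, 1).
    d_fake: D(G(z)) evaluated on generated samples, probabilities in (0, 1).
    """
    # E_x[log D(x)] + E_z[log(1 - D(G(z)))], estimated by averaging
    return np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))

# At the equilibrium described above, D(x) = D(G(z)) = 0.5,
# and the objective takes the value -2 log 2.
value = gan_loss(np.array([0.5, 0.5]), np.array([0.5, 0.5]))
```

The discriminator maximizes this objective while the generator minimizes it, which is the alternating game described above.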

## 1.2 Long Short-Term Memory (LSTM)

Long Short-Term Memory (LSTM) is a modified version of the recurrent neural network (RNN). The original RNN has a hidden layer with only one state, h, which makes it very sensitive to short-term inputs but relatively weak at retaining long-term information. LSTM adds a cell state C to the RNN so that the network keeps a long-term state, forming a long short-term memory network. It is typically used to process time-series datasets and captures long-term dependencies in the data better than a plain RNN. LSTM remembers values over arbitrary time intervals by introducing a memory cell, and uses input, output, and forget gates to regulate the flow of information into and out of that cell. This effectively alleviates the vanishing- and exploding-gradient problems that recurrent neural networks suffer from on long sequences. The structure of the LSTM unit is shown in Fig. 2:

First is the forget gate layer, which determines what information is to be discarded. The forget gate applies a sigmoid to the previous neuron's output and the current input, producing a value between 0 and 1: information close to 0 is forgotten, while information close to 1 is passed on in the cell state. This determines how much information from the previous state \({C_{t - 1}}\) is retained.

$${f_t}=\sigma ({W_f} \cdot [{h_{t - 1}},{x_t}]+{b_f}) \tag{2}$$

The input gate layer updates the old cell state: it decides which new information is added and which is discarded. A candidate memory state is obtained through a tanh layer, and the previous state is updated to \({C_t}\) under the action of the input gate.

$${i_t}=\sigma ({W_i} \cdot [{h_{t - 1}},{x_t}]+{b_i}) \tag{3}$$

$${\tilde {C}_t}=\tanh ({W_c} \cdot [{h_{t - 1}},{x_t}]+{b_c}) \tag{4}$$

$${C_t}={f_t} \cdot {C_{t - 1}}+{\tilde {C}_t} \cdot {i_t} \tag{5}$$

Finally, the output gate determines what value is output. A sigmoid layer decides which parts of the state are needed in the output; the cell state is then passed through a \(\tanh\) layer to obtain a value between \(-1\) and \(1\), which is multiplied by the sigmoid output to give the final output value.

$${o_t}=\sigma ({W_o} \cdot [{h_{t - 1}},{x_t}]+{b_o}) \tag{6}$$

$${h_t}={o_t} * \tanh ({C_t}) \tag{7}$$
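The gate equations (2)–(7) can be sketched as a single forward step in numpy. This is an illustrative implementation, not the paper's code; the parameter names and shapes (each weight matrix acting on the concatenation \([{h_{t-1}},{x_t}]\)) are assumptions consistent with the equations above:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell(x_t, h_prev, c_prev, W_f, b_f, W_i, b_i, W_c, b_c, W_o, b_o):
    """One LSTM step following Eqs. (2)-(7).

    Each W_* has shape (hidden, hidden + input) and multiplies the
    concatenated vector [h_{t-1}, x_t]; each b_* has shape (hidden,).
    """
    z = np.concatenate([h_prev, x_t])      # [h_{t-1}, x_t]
    f_t = sigmoid(W_f @ z + b_f)           # forget gate, Eq. (2)
    i_t = sigmoid(W_i @ z + b_i)           # input gate, Eq. (3)
    c_tilde = np.tanh(W_c @ z + b_c)       # candidate state, Eq. (4)
    c_t = f_t * c_prev + c_tilde * i_t     # cell state update, Eq. (5)
    o_t = sigmoid(W_o @ z + b_o)           # output gate, Eq. (6)
    h_t = o_t * np.tanh(c_t)               # hidden output, Eq. (7)
    return h_t, c_t
```

With all-zero weights and biases, every gate outputs 0.5 and the candidate state is 0, so the cell state simply halves at each step — a quick sanity check that the gating behaves as the equations describe.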

## 1.3 GAN-LSTM converged network

When coiled tubing jams at the bottom of the well, it changes parameters such as bottom-hole pressure, ROP, circulation pressure, and total weight, which in turn change the wellhead pressure and flow rate. These parameters can be measured by sensors at the surface and downhole, and the resulting datasets are used to build deep-learning models that predict drilling parameters such as total weight and ROP.

Drilling history data is typical time-series data, characterized by large volume and strong correlation between successive samples, so a recurrent neural network can be used to predict drilling parameters. The disadvantage of recurrent neural networks is that they are prone to vanishing and exploding gradients, resulting in poor generalization; the properties of the LSTM compensate for these gradient problems. However, when the LSTM outputs multiple variables, its prediction accuracy is significantly lower than when it outputs a single variable: as the dimensionality of the output increases, accuracy decreases, and the model's error rate also grows with the depth of the predicted parameters. To address these two problems, the generative model of a GAN can be used to optimize the LSTM: the low-dimensional data output by the LSTM serves as the input to the GAN's generator. The ultimate goal is to predict multiple variables while avoiding the drop in accuracy that comes with higher-dimensional outputs. The GAN consists of a generative network model and a discriminative network model, but the GAN-LSTM fusion network uses only the generative part, so the GAN and the LSTM are trained separately. The model structure of GAN-LSTM is illustrated in Fig. 3.

Step 1: Divide the original feature variables

Some of the variables are predicted by the LSTM and the others by the GAN. Based on analysis and experiments, the LSTM part of the model predicts total weight and ROP, and the GAN part predicts wellhead pressure and circulation pressure.

Step 2: Train the two models separately

LSTM: Input: well depth, circulation pressure, wellhead pressure, ROP, and total weight. Output: ROP and total weight.

GAN: Input: well depth, circulation pressure, wellhead pressure, ROP, and total weight. Output: wellhead pressure and circulation pressure.
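The variable split in Steps 1–2 can be sketched as a simple column partition of the drilling-history matrix. The column layout and function name below are hypothetical, used only to make the input/output assignment of the two branches concrete:

```python
import numpy as np

# Hypothetical column order for the drilling-history matrix (one row per sample)
FEATURES = ["well_depth", "circulation_pressure", "wellhead_pressure",
            "rop", "total_weight"]
LSTM_TARGETS = ["rop", "total_weight"]                        # LSTM branch outputs
GAN_TARGETS = ["wellhead_pressure", "circulation_pressure"]   # GAN branch outputs

def split_targets(data):
    """Step 1: both branches take all five features as input, but each
    branch is trained against its own target columns."""
    idx = {name: i for i, name in enumerate(FEATURES)}
    x = data                                            # shared input: all features
    y_lstm = data[:, [idx[n] for n in LSTM_TARGETS]]    # targets for the LSTM
    y_gan = data[:, [idx[n] for n in GAN_TARGETS]]      # targets for the GAN
    return x, y_lstm, y_gan
```

Each model is then trained separately on its own (input, target) pair, as Step 2 prescribes.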