Equivalent Circulating Density (ECD) is a critical parameter in drilling operations: if it is not properly monitored, serious problems such as formation fracturing and lost circulation can occur, compromising well control1,2. ECD expresses, as an equivalent mud density, the combined pressure exerted on the formation by the hydrostatic mud column and the annular pressure losses during drilling and circulation. Drilled cuttings can increase the effective drilling fluid density and reduce the annular flow area, both of which raise the ECD. Real-time ECD calculation or prediction is therefore also important for monitoring hole-cleaning dynamics during drilling. ECD management plays a critical role in preventing kicks and fluid losses, especially in deep, high-pressure, high-temperature (HPHT) wells, where temperature fluctuations are significant and the margin between pore pressure and fracture pressure is narrow.
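In oilfield units, the definition above is commonly written as ECD = MW + ΔP_annular / (0.052 × TVD), with ECD and mud weight in lb/gal (ppg), annular pressure loss in psi, and true vertical depth (TVD) in ft. The following is a minimal sketch of that relation; the function name and the sample values are illustrative assumptions and are not taken from this study.

```python
# Conversion constant: hydrostatic gradient in psi per ft per lb/gal (ppg).
PSI_PER_FT_PER_PPG = 0.052


def ecd_ppg(mud_weight_ppg: float, annular_loss_psi: float, tvd_ft: float) -> float:
    """Return ECD in ppg from static mud weight and annular pressure loss.

    Assumes oilfield units: ppg, psi, and ft of true vertical depth.
    """
    if tvd_ft <= 0:
        raise ValueError("tvd_ft must be positive")
    return mud_weight_ppg + annular_loss_psi / (PSI_PER_FT_PER_PPG * tvd_ft)


# Illustrative values only: 10 ppg mud, 260 psi annular loss, 10,000 ft TVD.
example_ecd = ecd_ppg(10.0, 260.0, 10_000.0)
print(f"ECD = {example_ecd:.2f} ppg")
```

In this simple form, cuttings loading would enter through a higher effective mud weight and a larger annular pressure loss rather than through a separate term.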

Apart from annular pressure losses and cuttings concentration in the drilling fluid, other factors affecting ECD during drilling operations are wellbore geometry, mud properties such as density and viscosity, mud pumping rate, and downhole pressure and temperature3–6. Together, these variables determine the ECD encountered while drilling, so they must be carefully monitored and managed to maintain operational efficiency and well integrity and to prevent well-control incidents.

Traditionally, determining ECD requires either mathematical models or expensive downhole tool measurements. Mathematical models are used to build hydraulic programs, which are the standard tools for hydraulic calculations during drilling operations. These programs can use various rheological models, such as Bingham plastic, Power-law, or Herschel–Bulkley, as the basis of the calculations. However, each rheological model requires different input parameters, which are typically obtained through comprehensive laboratory tests. These parameters are then entered into the software to calculate drilling fluid hydraulics and related quantities such as ECD. Although these hydraulic programs are sophisticated, discrepancies have been observed between the computed values and the data recorded in the field. These discrepancies can arise from variations in downhole conditions, inaccuracies in input parameters, uncertainties in the assumptions of the rheological model, and potential errors in the measurement equipment6–8.
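The rheological models named above differ in the parameters they need, which is why each one calls for its own laboratory inputs. As a minimal sketch of their standard constitutive forms (shear stress as a function of shear rate), assuming consistent units such as Pa, Pa·s, and 1/s, and with hypothetical function and parameter names:

```python
def bingham_plastic(shear_rate: float, yield_point: float, plastic_viscosity: float) -> float:
    """Bingham plastic: tau = YP + PV * gamma (two parameters)."""
    return yield_point + plastic_viscosity * shear_rate


def power_law(shear_rate: float, consistency_k: float, flow_index_n: float) -> float:
    """Power-law: tau = K * gamma**n (two parameters)."""
    return consistency_k * shear_rate ** flow_index_n


def herschel_bulkley(shear_rate: float, yield_stress: float,
                     consistency_k: float, flow_index_n: float) -> float:
    """Herschel-Bulkley: tau = tau0 + K * gamma**n (three parameters)."""
    return yield_stress + consistency_k * shear_rate ** flow_index_n


# Illustrative values only, not measured fluid data.
gamma = 100.0  # shear rate, 1/s
print(bingham_plastic(gamma, yield_point=5.0, plastic_viscosity=0.03))
print(power_law(gamma, consistency_k=0.5, flow_index_n=0.7))
print(herschel_bulkley(gamma, yield_stress=2.0, consistency_k=0.5, flow_index_n=0.7))
```

Herschel-Bulkley reduces to the Power-law model when the yield stress is zero and to the Bingham plastic model when the flow index is one, so the choice of model changes both the required inputs and the computed pressure losses.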

Downhole pressure sensors, on the other hand, offer a more accurate way to evaluate ECD than hydraulic models. These sensors carry high-precision pressure gauges designed specifically to measure annular pressure. By providing real-time downhole pressure data, they enable the drilling crew to make prompt, well-informed decisions, thereby improving operational efficiency and reinforcing well control. However, such sensors come at a significant cost and may also be operationally restrictive in some projects5,6,8.

## 1.1 Summary of Previous Machine Learning (ML) Studies Related to ECD Prediction

To minimize the discrepancies of hydraulic models and mathematical correlations, and to predict ECD relatively cheaply from surface drilling data, some researchers have turned to machine-learning approaches. Key studies are highlighted in the following paragraphs.

Alkinani et al.9 developed an Artificial Neural Network (ANN) model with a single hidden layer of 12 neurons to predict ECD. Their inputs were drilling parameters (drill pipe rotation speed in revolutions per minute (RPM) and weight on bit (WOB)) along with hydraulic and mud properties such as mud pumping rate, mud density, plastic viscosity, yield point, and total flow area (TFA) of the bit nozzles. Abdelgawad et al.5 developed a predictive model for ECD using both an ANN and an Adaptive Neuro-Fuzzy Inference System (ANFIS). Their ECD-ANN model featured one hidden layer with 20 neurons, while the ANFIS model used five membership functions, with Gaussian membership functions (gaussmf) for the inputs and a linear output membership function. The features used for prediction in that study were mud weight, drill pipe pressure, and rate of penetration (ROP). Rahmati and Tatar10 used a radial basis function model built from 884 data points obtained from the literature to predict ECD. The inputs to the model were the type of mud, initial density, pressure, and temperature. Ahmadi11,12 also used pressure, temperature, and initial density as model inputs to predict ECD. Alsaihati et al.8 used seven drilling parameters (flow rate, standpipe pressure, hook load, WOB, ROP, torque, and drill string speed (RPM)) as inputs to predict ECD. Gamal et al.6 used six drilling parameters: mud pumping rate (GPM), ROP, drill string speed (RPM), standpipe pressure (SPP), WOB, and drilling torque (T).
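The Gaussian membership function mentioned for the ANFIS inputs has the standard form μ(x) = exp(−(x − c)² / (2σ²)). The sketch below evaluates it for five evenly spaced fuzzy sets over a mud-weight range. The centers, width, and range are hypothetical and are not the parameters fitted in the cited study.

```python
import math


def gaussmf(x: float, center: float, sigma: float) -> float:
    """Gaussian membership: exp(-(x - c)^2 / (2 * sigma^2)), in (0, 1]."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))


# Hypothetical setup: five membership functions spanning 9-13 ppg mud weight.
CENTERS_PPG = [9.0, 10.0, 11.0, 12.0, 13.0]
SIGMA_PPG = 0.5


def memberships(mud_weight_ppg: float) -> list[float]:
    """Degree of membership of one mud-weight value in each of the five sets."""
    return [gaussmf(mud_weight_ppg, c, SIGMA_PPG) for c in CENTERS_PPG]


# A value between two centers belongs partly to both neighbouring sets.
print([round(m, 3) for m in memberships(10.5)])
```

In an ANFIS model, the centers and widths of these functions are tuned during training rather than fixed in advance as they are here.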

Al-Rubali et al.13 used a combination of drilling parameters (mud pumping rate (GPM), drill string rotation (RPM), ROP, and standpipe pressure (SPP)) and mud properties (mud weight (MW), plastic viscosity (PV), yield point (YP), and low shear yield point (LSYP)) as inputs to their ECD prediction model. They also considered borehole angle and azimuth, a modified average cuttings concentration in the annulus, a modified hole geometry factor, and other factors.

## 1.2 Novelty of Study

The literature shows that AI models can improve ECD prediction. However, these models vary significantly in their input parameters, training data, and methodologies. A common limitation is that many studies include downhole pressure and temperature data as model inputs. From an operational perspective, acquiring such data accurately often requires installing and maintaining downhole sensors, which increases operational expenses and data collection time. Therefore, like the more recent studies6,13 on this subject, this study does not use downhole parameters as model inputs.

The novelty of this study lies in the combination of the following:

- Exploration of different surface-based drilling parameters (ROP, WOB, hook load, surface RPM, surface torque, pump flow rate (GPM), standpipe pressure, mud weight in (MW in), and mud weight out (MW out))
- Exploration of the impact of formation gas as recorded by the chromatograph
- Use of a highly efficient tree-based machine learning algorithm, XGBoost
- Exploration of the SHAP (SHapley Additive exPlanations) approach for the interpretability and explainability of model predictions
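
The Shapley-based approach attributes a prediction to its input features using Shapley values from cooperative game theory. As a minimal illustration of the underlying idea only (not the SHAP library or its tree-specific algorithm, and not the model built in this study), the sketch below computes exact Shapley values by averaging each feature's marginal contribution over all feature orderings. The toy model, its coefficients, the feature names, and the baseline values are hypothetical.

```python
from itertools import permutations
from math import factorial


def shapley_values(model, x: dict, baseline: dict) -> dict:
    """Exact Shapley values for one instance x relative to a baseline.

    Features not yet added to the coalition keep their baseline value.
    Cost grows factorially with the number of features, so this is only
    practical for a handful of features.
    """
    names = list(x)
    phi = {name: 0.0 for name in names}
    for order in permutations(names):
        current = dict(baseline)
        previous = model(current)
        for name in order:
            current[name] = x[name]
            value = model(current)
            phi[name] += value - previous  # marginal contribution of `name`
            previous = value
    n_orders = factorial(len(names))
    return {name: total / n_orders for name, total in phi.items()}


def toy_ecd_model(f: dict) -> float:
    """Hypothetical stand-in for a trained regressor (coefficients invented)."""
    return f["mw_in"] + 0.0005 * f["flow_gpm"] + 0.000002 * f["flow_gpm"] * f["rpm"]


baseline = {"mw_in": 10.0, "flow_gpm": 500.0, "rpm": 100.0}
x = {"mw_in": 10.5, "flow_gpm": 600.0, "rpm": 120.0}
phi = shapley_values(toy_ecd_model, x, baseline)
print(phi)
```

The contributions sum to the difference between the prediction for `x` and the prediction for the baseline, which is the property that makes per-feature attributions directly comparable.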