To prevent churn, customers who are likely to leave must be identified as early as possible so that their expectations can be met. Thus, a newly developed RNN-based customer CP model with an improved EHO module is suggested for the telecommunications sector. This paper's major goal is to predict whether or not a consumer will leave. Extending packages or providing additional incentives or services to consumers who are inclined to migrate from the present service provider to another is an essential measure to stop them from doing so, because the businesses that provide such services are the ones that can generate the most revenue. In this paper, we use a client churn classification model with an optimization algorithm to find the at-risk customers, as depicted in Fig. 1 below.
The telecom CP dataset is collected as input. Initial pre-processing is then applied to the gathered data. The consumers are then screened based on their respective states and regions, and the clustering algorithm groups comparable clients together by state and area. The clustered data are then put through a second round of pre-processing, in which they are numeralized and normalised; this also keeps the complexity down. The most important and necessary features are then extracted from the pre-processed data, and the improved EHO technique selects the most appropriate of these characteristics and tunes the model's parameters. The classifier receives the chosen features as input, and the R-RNN effectively predicts whether or not a consumer will leave. If the client is predicted to be a CC, the customer's network usage history is reviewed and a threshold is fixed based on how much the client uses the network.
If the client uses the network extensively, the retention process is carried out to keep them on the same network; a consumer who uses the network sparingly receives no further attention.
4.1 Data collection Process
The data are primarily collected from the telecom CP dataset. This dataset contains information about the customers' demographics, network usage history, accounts, and related attributes.
4.2 Initial preprocessing Process
The preliminary pre-processing stage is carried out after data gathering. The obtained data are preprocessed by removing the duplicate customer entries from the dataset, and then they are transformed into a usable format.
4.3 Filtering and grouping Process
The data from the prior result are filtered once. The distinctive characteristics of the clients can be determined from this, so further analysis can be completed in less time and at lower cost.
The telecom customer records contain information about consumers from different states and countries, which can be used to identify these distinctive properties. Analyzing global consumer records as a whole is a genuinely complex endeavour; therefore, the customer records of the individual states and areas are aggregated and arranged into clusters (Cl) to ease this burden. Each record is connected to its nearest medoid via Euclidean distance computations.
Cl∗ = {Cl1, Cl2, Cl3, …, Cln} (1)
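The nearest-medoid assignment above can be sketched as follows. This is an illustrative Python fragment: the two-dimensional usage vectors and the medoids themselves are hypothetical stand-ins for the state- and area-level customer records.

```python
import numpy as np

def assign_to_medoids(points, medoids):
    """Assign each customer record (a numeric vector) to its nearest
    medoid using Euclidean distance, yielding the clusters of Eq. (1)."""
    points = np.asarray(points, dtype=float)
    medoids = np.asarray(medoids, dtype=float)
    # Distance matrix: rows are customers, columns are medoids.
    dists = np.linalg.norm(points[:, None, :] - medoids[None, :, :], axis=2)
    return dists.argmin(axis=1)

# Toy example: 2-D usage vectors for customers from two regions.
customers = [[1.0, 2.0], [1.2, 1.9], [8.0, 8.5], [7.9, 8.2]]
medoids = [[1.1, 2.0], [8.0, 8.3]]
labels = assign_to_medoids(customers, medoids)  # cluster index per customer
```

Each row of the distance matrix holds one customer's distances to all medoids, so `argmin` along the medoid axis returns the cluster index.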
4.4 Data Feature extraction Process
Two operations, numeralization and normalisation, are carried out before the feature extraction task. The pre-processing function is mathematically represented by,
ρre = Qρ[Cln] (2)
Where ρre represents the result of the preprocessing function, Cln implies the clustered data input, and Qρ signifies the preprocessing functions, symbolized by,
Qρ = [QNu, QNo] (3)
Where QNu infers the numeralization mode and QNo infers the normalization mode.
Numeralization transforms the string values or characters of the pre-processed data into a numerical representation; in other words, numerical data are created from the clustered data. The numeralization function is developed as,
ŃNu = QNu[Cln] (4)
Where ŃNu describes the results of numeralization function.
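As a sketch, numeralization of a categorical column (Eq. (4)) can be implemented by assigning integer codes in order of first appearance; the column values below are hypothetical.

```python
def numeralize(values):
    """Map each distinct string value to an integer code (Eq. (4)).
    Codes are assigned in order of first appearance."""
    codes, mapping = [], {}
    for v in values:
        if v not in mapping:
            mapping[v] = len(mapping)
        codes.append(mapping[v])
    return codes, mapping

# Hypothetical categorical column, e.g. the international-plan field.
plans = ["yes", "no", "no", "yes"]
codes, mapping = numeralize(plans)  # codes → [0, 1, 1, 0]
```

Returning the mapping alongside the codes lets the same encoding be reapplied to unseen records at prediction time.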
The normalisation technique improves the model's performance and training stability. The normalisation of the data uses log scaling. It calculates the log of the numbers and condenses a wide range into a small range. The log scale normalization is stated by,
ŇNo = log(Cln) (5)
Where Cln signifies the original value and ŇNo implies the normalised value.
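Eq. (5) amounts to taking the natural log of each positive value, which compresses a wide numeric range into a narrow one; a minimal sketch:

```python
import math

def log_scale(values):
    """Log-scale normalisation (Eq. (5)) for positive values."""
    return [math.log(v) for v in values]

# A wide range of usage values collapses into a narrow one.
usage_minutes = [1.0, 10.0, 100.0, 1000.0]
scaled = log_scale(usage_minutes)  # spans 0 .. log(1000) ≈ 6.9
```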
Feature Extraction (FE) aims to lessen the total number of features so that the resources required to process such huge data can be reduced. The mathematical expression for FE(ρre) is rendered by,
FE(ρre) = {FIp, FTd, FNv, FVm, FTdc, FTe, FTeh, FTdh, FTec} (6)
Where FIp is the international plan, FTd the total day minutes, FNv the number of voicemail messages, FVm the voicemail plan, FTdc the total day calls, FTe the total eve minutes, FTdh the total day charge, FTec the total eve calls, and FTeh the total eve charge; these are the essential and appropriate characteristics.
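Selecting the nine fields of Eq. (6) from a raw customer record can be sketched as below; the column names are hypothetical stand-ins for the dataset's actual headers.

```python
# Hypothetical column names for the nine features of Eq. (6).
FEATURE_COLUMNS = [
    "international_plan", "total_day_minutes", "number_vmail_messages",
    "voice_mail_plan", "total_day_calls", "total_eve_minutes",
    "total_day_charge", "total_eve_calls", "total_eve_charge",
]

def extract_features(record):
    """Keep only the churn-relevant fields of a customer record."""
    return {col: record[col] for col in FEATURE_COLUMNS}

record = {col: 1.0 for col in FEATURE_COLUMNS}
record["state"] = "KS"  # an extra field that should be dropped
features = extract_features(record)
```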
4.5 Feature selection Process
Feature selection can increase the model's accuracy, computing speed, and memory efficiency, so the FS method considerably improves the CP performance. The improved EHO is created for better FS; additionally, it is utilised to fine-tune the neural network model's predetermined parameters.
4.6 Enhanced EHO Algorithm
1: Initialize the population.
2: Repeat:
3: Calculate each elephant's fitness value with the fitness function and sort the elephants accordingly.
4: Apply the clan-updating operator. The matriarch, the elephant in clan i with the greatest fitness score, directs the movement of every elephant in the clan. The updated position is found to be
$${E}_{new,{c}_{i,j}}={E}_{{c}_{i,j}}+\vartheta *\left({E}_{best,{c}_{i}}- {E}_{{c}_{i,j}}\right)*u$$ (7)
Where \({E}_{new,{c}_{i,j}}\) denotes the new position of elephant j in clan i, \({E}_{{c}_{i,j}}\) symbolizes the former position of elephant j in clan i, \({E}_{best,{c}_{i}}\) represents the best-fitting position (the matriarch), \(\vartheta \in [0,1]\) is a scaling parameter, and \(u \in [0,1]\) is a random number used in the later phases of the algorithm to increase population variety. The position update for the clan's best candidate \({E}_{best,{c}_{i}}\) is estimated as:
$${E}_{new,{c}_{i}}={\mu *E}_{{center, c}_{i}}$$ (8)
$${E}_{{center, c}_{i}}= \frac{1}{{n}_{{c}_{i}}} {\sum }_{j=1}^{{n}_{{c}_{i}}}{E}_{{c}_{i,j,d}}$$ (9)
Where \(\mu \in [0,1]\) is the algorithm's second parameter, \({E}_{{center, c}_{i}}\) represents the clan's centre, and \({n}_{{c}_{i}}\) is the number of elephants in clan \({c}_{i}\).
5: Apply the separating operator. According to the following equation, a certain number of the elephants with the worst fitness value in each clan i are relocated to a new position,
$${E}_{worst,{c}_{i}}={E}_{min}+\left({E}_{max}- {E}_{min}+1\right)*rand$$ (10)
Where \(rand \in [0,1]\), \({E}_{min}\) denotes the search space lower bound, and \({E}_{max}\) denotes the search space upper bound.
6: Assess the population using the most recent position.
7: Repeat until the stopping criterion is met.
8: Return the population's best solution.
The update for the fittest elephant in the clan can also be described as:
Enew,ci,t = γ ∗ Ecenter,ci (11)
Enew,ci,t is obtained from the information of all the elephants present in clan ci. γ ∈ [0, 1] determines how much Enew,ci,t is affected by Ecenter,ci. Ecenter,ci is the centre of clan ci and can be obtained from Eq. (9) for the dth dimension, where 1 ≤ d ≤ D and D is the total number of dimensions.
FF(E) = (σ ∗ (FTd + FTdc + FTdh) + β ∗ (FIp + FNv + FVm) + ∂ ∗ (FTe + FTeh + FTec)) / (σ + β + ∂) (12)
Where (σ, β, ∂) ∈ [0, 1].
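Steps 1–8 can be condensed into the following sketch. The clan sizes, the parameter values, the single worst-elephant replacement per clan, and the omission of the "+1" term of Eq. (10) (so that new elephants stay inside the search bounds) are all illustrative choices, not the paper's exact settings.

```python
import random

def eho(fitness, dim, n_clans=2, clan_size=5, iters=50,
        lo=0.0, hi=1.0, theta=0.5, mu=0.1, seed=0):
    """Minimal sketch of the EHO loop (Eqs. (7)-(10)), maximising
    `fitness` over `dim`-dimensional positions in [lo, hi]."""
    rng = random.Random(seed)

    def rand_pos():
        return [rng.uniform(lo, hi) for _ in range(dim)]

    clans = [[rand_pos() for _ in range(clan_size)] for _ in range(n_clans)]
    gbest = max((e for c in clans for e in c), key=fitness)  # elitism
    for _ in range(iters):
        for clan in clans:
            clan.sort(key=fitness, reverse=True)  # matriarch first
            best, n = clan[0], len(clan)
            # Eq. (9): clan centre, dimension by dimension.
            center = [sum(e[d] for e in clan) / n for d in range(dim)]
            for j in range(n):
                if j == 0:
                    # Eq. (8): the matriarch moves toward the clan centre.
                    clan[j] = [mu * c for c in center]
                else:
                    # Eq. (7): followers move toward the matriarch.
                    u = rng.random()
                    clan[j] = [clan[j][d] + theta * (best[d] - clan[j][d]) * u
                               for d in range(dim)]
            # Separating operator (Eq. (10)): replace the worst elephant.
            clan.sort(key=fitness, reverse=True)
            clan[-1] = rand_pos()
        cand = max((e for c in clans for e in c), key=fitness)
        if fitness(cand) > fitness(gbest):
            gbest = cand
    return gbest

# Toy objective: positions close to 0.7 in every dimension score highest.
best = eho(lambda e: -sum((x - 0.7) ** 2 for x in e), dim=2)
```

Here the toy objective simply rewards positions near 0.7; in the paper, the fitness would be FF(E) of Eq. (12) evaluated on candidate feature subsets.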
4.7 Reformed RNN (R-RNN)
In essence, an RNN is a type of neural network in which the output of the previous step is fed as input to the current step. The primary and most important characteristic of RNNs is their hidden state, which retains some information about a sequence. In conventional feed-forward neural networks, each Hidden Layer (HL) includes its own set of weights and biases, whereas an RNN shares them across time steps.
Higher exploitation ability is therefore achieved thanks to the updating method of the enhanced elephant herding optimization. The search halts once the termination criterion, the number of completed iterations, is met, and the algorithm produces the optimal result based on the fitness values. The features chosen by the EHO module, S(FF), are therefore mathematically demonstrated as,
S(FF) = {α1, α2, α3,…, αt} (13)
This section defines the constituent parts of the problem statement, first with relation to the churn indicator and then with regard to the loss function. Churn, as used in business, is the regular loss of clients who cease all activity for a sufficient amount of time; depending on the industry, this time frame can be chosen arbitrarily. The procedure is the same as for an RNN [29], except that the model takes the aforementioned set of well-chosen features as its input in place of the original feature set.
The R-RNN is employed on the input data Ip = {α1, α2, α3, …, αt}, which encompasses a hidden vector sequence ĥhid = {ĥ1, ĥ2, ĥ3, …, ĥt} and an output vector sequence oop = {o1, o2, o3, …, ot}, obtained by iterating over the sequence from t = 1 to T in the following manner. The HL can be gauged as,
ĥt = ∂act[Wαĥαt + Wĥĥĥt−1 + Ba] (14)
Where the Wi terms signify the weight matrices (for instance, Wαĥ is the input-to-hidden weight matrix), the Ba terms imply bias vectors, and ∂act indicates the hidden layer AF, which is calculated using the stated swish function,
f(X) = X ∗ Sigmoid(λX) (15)
Here, λ is the model's trainable parameter.
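A minimal sketch of the swish activation of Eq. (15), with λ treated as an ordinary argument rather than a learned parameter:

```python
import math

def swish(x, lam=1.0):
    """Swish activation, Eq. (15): f(x) = x * sigmoid(lam * x)."""
    return x / (1.0 + math.exp(-lam * x))

# swish(0) = 0, and swish(x) ≈ x for large positive x.
```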
The Output Layer (OL) is then used to render the outcome. The OL is activated by the sigmoid AF, and its output can be determined by,
ot = σS[Wαoĥt + Ba] (16)
α = Wαoĥt + Ba (17)
Next, the sigmoid AF is gauged utilising the following relation,
σS(α) = 1 / (1 + e−α) (18)
To calculate the loss value, the squared difference between the actual value (α) and the predicted value (ά) is computed: Err = (α − ά)². If Err = 0, the model provides the exact answer; if Err ≠ 0, back-propagation is carried out by adjusting the weight values. Finally, the classification technique effectively predicts the CC with minimal misclassification error.
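One forward step of the recurrence (Eqs. (14)–(18)) together with the squared-error check can be sketched as follows. The dimensions and the random weights are illustrative, and λ is fixed at 1.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(a):
    """Sigmoid output activation, Eq. (18)."""
    return 1.0 / (1.0 + np.exp(-a))

def rnn_step(x_t, h_prev, W_xh, W_hh, W_ho, b_h, b_o):
    """One recurrent step: hidden update (Eq. (14)) with a swish
    activation, then a sigmoid churn score (Eqs. (16)-(18))."""
    a = W_xh @ x_t + W_hh @ h_prev + b_h
    h_t = a * sigmoid(a)             # swish with lambda = 1
    o_t = sigmoid(W_ho @ h_t + b_o)  # churn probability in (0, 1)
    return h_t, o_t

# Toy dimensions: 3 selected features, 4 hidden units, 1 output.
n_in, n_hid = 3, 4
W_xh = rng.normal(size=(n_hid, n_in))
W_hh = rng.normal(size=(n_hid, n_hid))
W_ho = rng.normal(size=(1, n_hid))
b_h, b_o = np.zeros(n_hid), np.zeros(1)

h = np.zeros(n_hid)
h, o = rnn_step(rng.normal(size=n_in), h, W_xh, W_hh, W_ho, b_h, b_o)
err = float((1.0 - o[0]) ** 2)  # squared error against a churn label of 1
```

In training, a nonzero `err` would be back-propagated through the weight matrices; only the forward pass is shown here.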
4.8 Churn prediction
According to the R-RNN classifier, the final outcome takes one of two forms: CC and non-CC.
- Non-churn customer: a customer who is willing to stay with the same telecommunications network.
- Churn customer: a customer who is willing to switch to another telecommunications network.
4.9 Retention process
If the result is a CC, the specific customer's network usage history is checked, and the corresponding threshold values are set with their network utilisation as the focal point. If the customer's network usage remains high, i.e. above the threshold value, the customer retention process is carried out. Customer retention refers to the method of keeping existing customers on the same network by making a few alluring offers and discouraging them from switching to any other telecommunication network. In contrast, if a consumer only uses a small portion of the network, i.e. their network utilisation is below the threshold, they are ignored.
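The retention decision above reduces to a simple threshold rule; the usage threshold value below is hypothetical.

```python
def retention_action(is_churn, usage, threshold=500.0):
    """Decide the follow-up for a customer: predicted churners with
    heavy network usage enter the retention process, light users are
    ignored, and non-churners need no action."""
    if not is_churn:
        return "none"
    return "retain" if usage >= threshold else "ignore"

action = retention_action(is_churn=True, usage=800.0)  # → "retain"
```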