Artificial intelligence (AI) is another example of a rapidly evolving technology being used in insurance innovation for a variety of back-end functions, including fraud detection, algorithmic trading, blockchain analytics, and financial search engines [6]. Machine learning is also driving progress in Natural Language Processing (NLP), robotics, and computer vision. Thanks to these applications, machine learning has sparked considerable interest in the insurance industry, which possesses large amounts of data. Most insurance applications employ machine learning methods such as Logistic Regression with Penalty, Neural Network, Extra Trees Classifier, Random Forest, SVM, and GBM (see Fig. 4).
The three types of learning in AI are supervised learning, unsupervised learning, and reinforcement learning. For the past few decades, most insurance researchers have used supervised learning to estimate risk, combining known variables in different permutations to produce a desired output. Today's insurers are also encouraged to adopt unsupervised learning, which discovers structure in data without predefined labels: if the variables change, the method detects the change and adapts accordingly.
A major benefit of AI in the insurance industry is that it makes datasets easier to manage. Machine learning can be applied successfully to structured, semi-structured, and unstructured datasets. Various insurance organisations provide datasets for data analysts and researchers to use (see Table 1). With its superior predictive accuracy, machine learning can be applied across the insurance sector to identify risk, claims, and customer behaviours.
Artificial intelligence has also been used to power conversational interfaces, in which businesses use existing data, machine learning, and natural language processing to intelligently offer clients various types of information. Chatbots are fed natural language data from previous customer encounters, which is processed by an intelligent system that learns to respond to users in textual form instantly.
Artificial intelligence could be used in a variety of ways in the insurance industry, from responsive underwriting and premium leakage to expenditure control, arbitration, litigation, and fraud detection. Much work is being done on these problems by applying powerful artificial intelligence techniques to insurance data.
Driven by industry demand for management solutions and by academic efforts to create highly relevant machine learning techniques, a substantial number of researchers are examining advanced machine learning approaches for tasks including premium leakage, expenditure management, debt recovery, litigation, and fraud detection. The amount of data available to answer specific insurance questions is steadily increasing.
1.2.1 Overview of AI techniques
The purpose of this section is to provide a formal introductory definition of the AI techniques studied in this research, such as Multilayer Perceptron (MLP), AdaBoost (Adaptive Boosting), Support-vector machines (SVM), Exponential smoothing, Linear Regression, Naive Bayes, Multiple Linear Regression (MLR), XGBoost, Random Forests, Logistic Regression, J48, Classification and Regression Trees (CART), REPTree and Decision Trees.
Multilayer perceptron (MLP)
A multilayer perceptron (MLP) is a form of feedforward artificial neural network (ANN). The name is a bit of a misnomer, because "MLP" is sometimes used loosely for any feedforward ANN and sometimes strictly for networks made up of many layers of perceptrons. Multilayer perceptrons with a single hidden layer are often referred to as "vanilla" neural networks. A multilayer perceptron consists of input and output layers as well as one or more hidden layers, each containing many neurons. While the neurons in a classical perceptron use a hard threshold activation function, the neurons in a multilayer perceptron can use arbitrary activation functions such as the sigmoid or ReLU [25].
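The forward pass of a one-hidden-layer MLP can be sketched in a few lines of plain Python. The weights below are hand-picked (hypothetical, chosen purely for illustration) so that this tiny network computes XOR; a real MLP would learn its weights by backpropagation.

```python
import math

def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a one-hidden-layer MLP with sigmoid activations."""
    sigmoid = lambda z: 1.0 / (1.0 + math.exp(-z))
    # Hidden layer: each neuron computes sigmoid(w . x + b)
    hidden = [sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
              for w, b in zip(w_hidden, b_hidden)]
    # Single output neuron over the hidden activations
    return sigmoid(sum(wi * hi for wi, hi in zip(w_out, hidden)) + b_out)

# Hand-picked weights that make the network compute XOR (illustrative only)
W_H = [[20.0, 20.0], [20.0, 20.0]]   # two hidden neurons
B_H = [-10.0, -30.0]                 # an OR-like and an AND-like unit
W_O = [20.0, -20.0]                  # output: OR and not-AND
B_O = -10.0
```

A single perceptron cannot represent XOR; the hidden layer is what makes this possible.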
AdaBoost
AdaBoost (Adaptive Boosting) is an ensemble learning method created to help binary classifiers perform better. AdaBoost uses an iterative approach, learning from the mistakes of weak classifiers to improve them [26]. In 2003, the Gödel Prize was awarded to Yoav Freund and Robert Schapire for their work on this meta-algorithm. It can be combined with many other learning techniques to improve outcomes.
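The iterative re-weighting idea can be illustrated with a minimal sketch, assuming one-dimensional data and threshold "stumps" as the weak classifiers; this is not a production implementation.

```python
import math

def ada_boost(xs, ys, rounds=3):
    """Minimal AdaBoost with 1-D threshold stumps; labels ys must be in {-1, +1}."""
    n = len(xs)
    w = [1.0 / n] * n                          # start with uniform sample weights
    ensemble = []                              # list of (alpha, threshold, polarity)
    for _ in range(rounds):
        best = None
        for t in xs:                           # pick the stump with lowest weighted error
            for pol in (1, -1):
                preds = [pol if x > t else -pol for x in xs]
                err = sum(wi for wi, p, y in zip(w, preds, ys) if p != y)
                if best is None or err < best[0]:
                    best = (err, t, pol, preds)
        err, t, pol, preds = best
        err = max(err, 1e-10)                  # avoid division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, t, pol))
        # Re-weight: misclassified samples gain weight, correct ones lose it
        w = [wi * math.exp(-alpha * y * p) for wi, y, p in zip(w, ys, preds)]
        s = sum(w)
        w = [wi / s for wi in w]
    return ensemble

def ada_predict(ensemble, x):
    """Weighted vote of the stumps."""
    score = sum(a * (pol if x > t else -pol) for a, t, pol in ensemble)
    return 1 if score > 0 else -1
```

Each round focuses the next weak learner on the examples the current ensemble gets wrong, which is the essence of adaptive boosting.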
Exponential smoothing
Exponential smoothing is a single-variable time-series forecasting technique that can be applied to data with a trend or seasonal pattern. It is a robust forecasting technique that can serve as an alternative to the well-known Box-Jenkins ARIMA family of methods. Exponential smoothing applies an exponential window function to smooth time-series data: whereas a simple moving average weights past observations equally, exponential smoothing assigns exponentially decreasing weights to older observations. It is a straightforward technique that can be extended to account for structure such as trend and seasonality, and it is widely employed in time-series analysis [27].
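Simple (single) exponential smoothing reduces to one recurrence, s_t = alpha * x_t + (1 - alpha) * s_{t-1}, sketched below; trend and seasonal variants (Holt, Holt-Winters) add further smoothing equations on top of this.

```python
def exponential_smoothing(series, alpha):
    """Simple exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_{t-1}.

    alpha in (0, 1] controls how quickly the weight on older
    observations decays; larger alpha tracks recent data more closely.
    """
    s = series[0]            # initialise with the first observation
    smoothed = [s]
    for x in series[1:]:
        s = alpha * x + (1 - alpha) * s
        smoothed.append(s)
    return smoothed
```

The last smoothed value serves as the one-step-ahead forecast.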
J48 algorithm
The J48 algorithm, an open-source implementation of the C4.5 decision-tree learner, is among the most widely used machine learning techniques for classifying and analysing data in real time. A drawback is that it can consume considerable memory and lose precision and speed when applied to large datasets such as health data [24].
Naive Bayes
The Naive Bayes classifier is a straightforward "probabilistic classifier" based on Bayes' theorem together with strong (naive) assumptions of independence between features. Naive Bayes classifiers are among the most fundamental Bayesian inference methods, but when combined with kernel density estimation they can achieve high levels of accuracy [25, 28, 29]. They require a number of parameters that is linear in the number of variables (features) in the learning problem. Unlike many other classification methods, maximum-likelihood training can be done by evaluating a closed-form expression in linear time, rather than by expensive iterative approximation.
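A minimal sketch of a categorical Naive Bayes classifier follows, assuming discrete feature values and add-one (Laplace) smoothing; the data below are hypothetical.

```python
from collections import Counter

def train_nb(rows, labels):
    """Fit a categorical Naive Bayes model: count class priors and
    per-class feature-value frequencies (closed form, one pass)."""
    n = len(labels)
    class_counts = Counter(labels)
    # feat_counts[c][j][v] = occurrences of value v for feature j in class c
    feat_counts = {c: [Counter() for _ in rows[0]] for c in class_counts}
    for row, y in zip(rows, labels):
        for j, v in enumerate(row):
            feat_counts[y][j][v] += 1
    values = [set(r[j] for r in rows) for j in range(len(rows[0]))]
    return class_counts, feat_counts, values, n

def predict_nb(model, row):
    """Pick the class maximising P(c) * prod_j P(x_j | c), with Laplace smoothing."""
    class_counts, feat_counts, values, n = model
    best, best_p = None, -1.0
    for c, cc in class_counts.items():
        p = cc / n                                   # prior P(c)
        for j, v in enumerate(row):                  # likelihoods P(x_j | c)
            p *= (feat_counts[c][j][v] + 1) / (cc + len(values[j]))
        if p > best_p:
            best, best_p = c, p
    return best
```

Note that training is a single counting pass, which is what makes Naive Bayes linear-time to fit.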
XGBoost
XGBoost is a scalable and highly accurate implementation of gradient-boosted tree algorithms that makes efficient use of computational power. It was created primarily to improve the accuracy and processing speed of machine learning models. In contrast to standard gradient boosting, which uses only first-order gradient information, XGBoost uses a second-order Taylor expansion of the loss function, which connects it to the Newton-Raphson method [31].
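To show the underlying boosting idea (this is plain gradient boosting for squared loss, not XGBoost itself, and omits XGBoost's regularisation and second-order terms), each round fits a small tree to the current residuals:

```python
def fit_stump(xs, residuals):
    """Least-squares threshold stump: mean residual on each side of a split."""
    best = None
    for t in xs:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return t, lm, rm

def boost(xs, ys, rounds=25, lr=0.5):
    """Gradient boosting for squared loss: each stump fits the current residuals."""
    pred = [sum(ys) / len(ys)] * len(xs)   # start from the mean prediction
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, pred)]
        t, lm, rm = fit_stump(xs, residuals)
        # shrink each stump's contribution by the learning rate
        pred = [p + lr * (lm if x <= t else rm) for x, p in zip(xs, pred)]
    return pred
```

XGBoost replaces the residual fit with a Newton-style step using both gradients and Hessians of the loss, plus explicit tree regularisation.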
Linear regression
Linear regression is sometimes known as a linear model. In a linear model, the input variables (x) and the single output variable (y) are assumed to have a linear relationship; more specifically, y can be computed as a linear function of the input variables x. Linear regression is a linear technique for modelling the relationship between a scalar outcome and one or more explanatory factors (also known as the dependent and independent variables). When there is a single explanatory variable, the method is called simple linear regression; when there are several, it is called multiple linear regression. Multivariate linear regression, in contrast, predicts several correlated dependent variables rather than a single scalar variable [32, 33].
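For the simple (one-variable) case, the ordinary-least-squares fit has a closed form, sketched below.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b (single explanatory variable).

    a = cov(x, y) / var(x); b places the line through the mean point.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    b = my - a * mx
    return a, b
```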
Random forests
Random forests, also known as random decision forests, are an ensemble learning method for classification, regression, and other problems that works by training a large number of decision trees. For classification tasks, the random forest's output is the class chosen by the majority of trees; for regression tasks, the mean prediction of the individual trees is returned. A random forest is a supervised learning approach that uses a large number of decision trees to classify data. It employs bagging and feature randomisation to produce an uncorrelated forest of trees whose aggregate prediction is more reliable than that of any single tree [34, 35].
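The bagging-plus-voting idea can be sketched as follows, with the simplifying assumption that each "tree" is a one-dimensional threshold stump fit on a bootstrap sample (a real random forest grows full trees and also randomises the features considered at each split).

```python
import random

def train_forest(xs, ys, n_trees=15, seed=0):
    """Tiny 'forest' of 1-D threshold stumps, each fit on a bootstrap sample."""
    rng = random.Random(seed)
    n = len(xs)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]      # bootstrap: sample with replacement
        bx, by = [xs[i] for i in idx], [ys[i] for i in idx]
        best = None
        for t in bx:                                    # best stump on this bootstrap sample
            for pol in (1, -1):
                err = sum(1 for x, y in zip(bx, by)
                          if (pol if x > t else -pol) != y)
                if best is None or err < best[0]:
                    best = (err, t, pol)
        stumps.append(best[1:])
    return stumps

def forest_predict(stumps, x):
    """Majority vote across the stumps; labels are in {-1, +1}."""
    votes = sum(pol if x > t else -pol for t, pol in stumps)
    return 1 if votes > 0 else -1
```

Averaging many trees trained on different resamples is what reduces the variance of the ensemble relative to any single tree.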
Logistic regression
Logistic regression, often known as a logit model, is a statistical technique for predicting the likelihood of a given class or event, such as pass/fail, win/lose, alive/dead, or healthy/sick. It can be used to model a wide range of events, such as determining whether an image contains a cat, dog, lion, or other animal. Logistic regression is the method of estimating the parameters of a logistic model. A binary logistic regression has a dependent variable with two possible values, such as pass or fail, denoted mathematically by an indicator variable with the two values "0" and "1".
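A minimal sketch of fitting a one-feature binary logistic model by stochastic gradient descent on the log-loss follows; real implementations typically use more robust optimisers and regularisation.

```python
import math

def fit_logistic(xs, ys, lr=0.5, epochs=500):
    """Logistic regression on one feature via gradient descent; ys in {0, 1}."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))   # predicted probability
            w -= lr * (p - y) * x                      # gradient of the log-loss
            b -= lr * (p - y)
    return w, b

def predict_prob(w, b, x):
    """Sigmoid of the linear score gives P(y = 1 | x)."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))
```

Thresholding the predicted probability at 0.5 yields the "0"/"1" class label.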
Decision tree
A decision tree is a tool that uses a tree-like model of options and their possible outcomes, such as chance event outcomes, resource costs, and utility, to make decisions. It is one way of representing a conditional control procedure [37]. Decision trees are a non-parametric supervised learning technique used for classification and regression. The goal is to learn simple decision rules from the data features and build a model that predicts the value of a target variable. A tree can be seen as a piecewise-constant approximation.
Support Vector Machines (SVM)
Support vector machines [38] are supervised learning techniques for data classification and regression analysis. A support vector machine (SVM) is a supervised machine learning approach that, after being trained on sets of labelled training data for each category, categorises new examples (for instance, fresh text) using classification techniques.
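The core of a linear SVM is minimising a regularised hinge loss; a minimal sketch using subgradient descent on one feature follows (real SVM solvers use quadratic programming or specialised methods such as SMO, and kernels for non-linear boundaries).

```python
def fit_linear_svm(xs, ys, lam=0.01, lr=0.01, epochs=200):
    """Linear SVM on one feature via hinge-loss subgradient descent; ys in {-1, +1}.

    Objective per sample: lam/2 * w^2 + max(0, 1 - y * (w*x + b)).
    """
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            margin = y * (w * x + b)
            if margin < 1:                    # inside the margin: hinge is active
                w -= lr * (lam * w - y * x)
                b -= lr * (-y)
            else:                             # outside the margin: only regulariser
                w -= lr * lam * w
    return w, b
```

The sign of the learned score w*x + b is the predicted class, and the regulariser lam trades margin width against training errors.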
Classification and Regression Trees (CART)
The Classification and Regression Trees (CART) approach is a classification technique that builds a decision tree using Gini's impurity index as the splitting criterion. A CART is a tree structure in which each node is repeatedly split into two subtrees [39].
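The Gini-based splitting rule at a single CART node can be sketched as follows, assuming one numeric feature; a full CART implementation applies this recursively to grow the tree and then prunes it.

```python
def gini(labels):
    """Gini impurity: 1 - sum_k p_k^2 over the class proportions p_k."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(xs, ys):
    """Pick the threshold minimising the weighted Gini impurity of the two children."""
    n = len(xs)
    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not left or not right:
            continue                      # skip splits that leave a child empty
        score = len(left) / n * gini(left) + len(right) / n * gini(right)
        if best is None or score < best[0]:
            best = (score, t)
    return best                           # (weighted impurity, threshold)
```

A weighted impurity of zero means the split separates the classes perfectly at that node.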
Multiple linear regression (MLR)
Multiple regression, or multiple linear regression, is a statistical method that predicts the outcome of a dependent variable by combining multiple explanatory variables. It is an extension of simple linear regression, which uses only one explanatory variable [38].
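With several explanatory variables, the least-squares coefficients solve the normal equations (X^T X) beta = X^T y. A self-contained sketch using Gaussian elimination follows; in practice one would use a numerical library with a more stable decomposition (e.g. QR).

```python
def fit_mlr(rows, ys):
    """Multiple linear regression via the normal equations (X^T X) beta = X^T y.

    rows: list of feature tuples; an intercept column is prepended.
    Returns [intercept, coef_1, coef_2, ...].
    """
    X = [[1.0] + list(r) for r in rows]          # design matrix with intercept
    k = len(X[0])
    # Build A = X^T X and rhs = X^T y
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    rhs = [sum(row[i] * y for row, y in zip(X, ys)) for i in range(k)]
    # Gaussian elimination with partial pivoting
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        rhs[col], rhs[piv] = rhs[piv], rhs[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for c in range(col, k):
                A[r][c] -= f * A[col][c]
            rhs[r] -= f * rhs[col]
    # Back substitution on the upper-triangular system
    beta = [0.0] * k
    for r in range(k - 1, -1, -1):
        beta[r] = (rhs[r] - sum(A[r][c] * beta[c] for c in range(r + 1, k))) / A[r][r]
    return beta
```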
REPTree
The REPTree (Reduced Error Pruning Tree) method is a fast decision tree learner, related to the C4.5 methodology, that builds classification and regression trees. It constructs a decision or regression tree using an impurity measure (information gain, or variance for regression) and prunes it using reduced-error pruning [39].