Probability Bankruptcy Using Support Vector Regression

This paper investigates the probability of bankruptcy using Support Vector Regression. The data cover 7 variables for 17 companies over the period 2016 to 2018. The results show that Support Vector Regression yields a good model, selected for having the highest coefficient of determination.


Introduction
Bankruptcy remains an active topic in economics research and practice. Bankruptcy is a situation in which a company has stopped its operations. A company can be driven into bankruptcy by internal and external factors. Internal factors are reflected in financial performance that deteriorates relative to previous periods. Management wants to know how performance leads to bankruptcy. For this reason, the probability of bankruptcy is important to company management, which needs this information to make decisions about future strategy.
Bankruptcy research began with Beaver (1966) and Altman (1968), who classified companies as bankrupt or non-bankrupt. Beaver used univariate analysis, and Altman went further by using discriminant analysis. Their models only classify a company as bankrupt or non-bankrupt, based on the variables selected for the model. Management needs a model that can anticipate firm performance, which is called the probability of bankruptcy. Merton (1974) then introduced a model that could predict company bankruptcy, known as the Merton Model, which is an adaptation of the Black-Scholes Model (1973). Furthermore, Ohlson (1980) introduced the probability of bankruptcy using the Logit Model, as did Scott. From 1990 onward, interest in bankruptcy grew sharply as researchers and academics began to study it with methods from computer science. Odom and Sharda (1990), Fletcher and Goss (1993), and Wilson and Sharda (1994) were pioneers of this approach to the probability of bankruptcy.
Research on the probability of bankruptcy is still very limited because it combines computer science, statistics, and finance. Scott (1981) discussed the theory of bankruptcy and its methodology, including prediction. Tudela and Young (2003) used the Merton Model to predict company bankruptcy in the UK. Jones and Hensher (2004) introduced the Mixed and Multinomial Logit. Ewerthz (2019) used a Logit model to predict bankruptcy in Sweden. Pranowo et al. (2010) used a Logistic model to predict corporate financial distress in Indonesia. Kim and Gu (2010) used a Logistic model to predict bankruptcy in the hospitality industry. Ahmadi et al. (2012) used a Logistic model for bankruptcy prediction in Iran. Odom and Sharda (1990), Fletcher and Goss (1993), Wilson and Sharda (1994), Zang et al. (1999), Charalambous et al. (2000), Virag and Kristof (2005), Shin and Lee (2004), and Bredart (2014) used artificial neural networks in bankruptcy prediction. Gaganis et al. (2007) used neural networks for audit opinions.
Predictions of the probability of bankruptcy have mostly used company financial ratios, from Beaver (1968b), Altman (1968), and Ohlson (1980), through Odom and Sharda (1990), Fletcher and Goss (1993), and Wilson and Sharda (1994) in the 1990s, to Tian (2017) and Mayliza et al. (2020). Aziz et al. (1988) and Gentry et al. (1985) used cash flow to predict firm bankruptcy. Nouri and Soultani (2016) and Manurung et al. (2020) predicted bankruptcy using macroeconomic variables in addition to financial ratios. The use of macroeconomic variables for predicting company bankruptcy remains very limited.
As explained above, the Merton Model, the Logit Model, and neural networks have been used to predict the probability of bankruptcy. Other methods could also be used. This research explores the Support Vector Regression (SVR) machine for predicting the probability of bankruptcy. SVR belongs to the family of machine learning methods. This research uses financial ratios and macroeconomic variables as independent variables.

Proposed Methods
The study of the probability of bankruptcy began with Merton (1974), who adapted the Black-Scholes Model (1973). Merton used the Black-Scholes model to calculate the distance to default, which represents the probability of bankruptcy. In 1980, Ohlson published a paper on the probability of bankruptcy using a logistic model, known as the Logit Model, to predict whether a company is bankrupt or non-bankrupt. Crosbie and Bohn (2003) also introduced a measure of the probability of bankruptcy based on an adapted Black-Scholes Model. Manurung et al. (2020) used a panel data model to predict the probability of bankruptcy. This research uses Support Vector Regression (SVR), which was originally developed from Support Vector Machines (SVMs).

Support Vector Machines (SVMs)
Support Vector Machines (SVMs) are widely used and perform well in many classification tasks. They are often considered among the best approaches for supervised learning problems. The idea originated as a learning method for two-group classification problems (Cortes & Vapnik, 1995). In short, SVMs look for the best hyperplane separating two classes in a given input space. To do so, SVMs can embed the data into a higher-dimensional space, which is known as the kernel trick. Figure 1 illustrates how the kernel trick works.
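As a minimal sketch of the kernel trick described above, the following Python snippet compares a degree-2 polynomial kernel with the inner product of an explicit higher-dimensional feature map. The kernel choice, the function names `poly_kernel` and `feature_map`, and the sample points are illustrative assumptions and are not taken from this paper.

```python
import math


def poly_kernel(x, z):
    """Degree-2 homogeneous polynomial kernel: K(x, z) = (x . z)^2."""
    dot = sum(a * b for a, b in zip(x, z))
    return dot ** 2


def feature_map(x):
    """Explicit 3-D feature map for 2-D inputs whose inner product equals poly_kernel."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


# Two arbitrary 2-D points (hypothetical values).
x = (1.0, 2.0)
z = (3.0, -1.0)

# Kernel evaluated in the original 2-D space.
k_implicit = poly_kernel(x, z)

# The same quantity computed after mapping both points into 3-D.
k_explicit = sum(a * b for a, b in zip(feature_map(x), feature_map(z)))

print(k_implicit, k_explicit)
```

The two values agree up to floating-point rounding, which is why a kernel method can work in the higher-dimensional space without ever constructing the mapped vectors.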

Figure 1. Kernel tricks in Support Vector Machines.
The main insight of SVMs is that some samples are more important than others. Paying attention to those samples can lead to better generalization. Instead of minimizing the expected empirical loss on the training data, SVMs attempt to minimize the expected generalization loss. SVMs maximize the margin around the separating hyperplane. The decision function is fully specified by a subset of the training samples, the support vectors.
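The role of the support vectors can be illustrated with a small sketch using scikit-learn's `SVC` on toy data. The data points, the linear kernel, and `C=1.0` are assumptions chosen for illustration only; they are not the settings used in this research.

```python
import numpy as np
from sklearn.svm import SVC

# Toy, linearly separable data (illustrative values, not from the paper).
X = np.array(
    [[0, 0], [0, 1], [1, 0], [-1, -1],
     [3, 3], [3, 4], [4, 3], [5, 5]],
    dtype=float,
)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Fit a maximum-margin linear classifier.
clf = SVC(kernel="linear", C=1.0).fit(X, y)

# Only the samples closest to the separating hyperplane are kept as support vectors.
print("support vector indices:", clf.support_)
print("number of support vectors:", len(clf.support_), "of", len(X))
```

Removing any sample that is not a support vector and refitting would leave the decision function unchanged, which is the sense in which the subset fully specifies the model.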
SVMs have achieved good performance in classification tasks. Previous work has shown their effectiveness in speech emotion recognition, where an SVM-based model achieved 99.64% classification accuracy for the happiness emotion from speech (Jain et al., 2020). SVMs have also been implemented in an intrusion detection system by Al-Yaseen et al. (2017). Their hybrid SVM-ELM (Extreme Learning Machine) model achieved an accuracy of 95.75% in attack detection. In classifying the personality of Facebook users, Tandera et al. (2017) showed that the SVM achieved the highest performance among the machine learning algorithms they compared.

Support Vector Regression (SVR)
When used for regression, Support Vector Machines (SVMs) follow the same principles as in classification, with only a few differences. Support Vector Regression (SVR) is a model derived from the SVM. Essentially, the regression task is quite similar to the classification task. The objective of the SVM model is to find a plane such that the support vectors from both classes are farthest from the classification plane, whereas the objective of the SVR model is to find a regression plane such that all data are closest to the plane (Wang et al., 2020). Once fitted in this way, the SVR model is ready to predict values for unlabeled data.
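One common way to express "closest to the plane" is the epsilon-insensitive loss of the standard SVR formulation, in which errors smaller than a tolerance epsilon cost nothing. The sketch below assumes that formulation; the function name `epsilon_insensitive_loss`, the value of epsilon, and the sample values are hypothetical and are not reported in this paper.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Mean epsilon-insensitive loss: errors inside the tube of width epsilon are ignored."""
    losses = [max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred)]
    return sum(losses) / len(losses)


# Hypothetical targets and predictions.
y_true = [0.20, 0.50, 0.90]
y_pred = [0.25, 0.50, 0.60]

# The first two errors (0.05 and 0.0) fall inside the tube; only the third is penalized.
loss = epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1)
print(loss)
```

Samples that lie on or outside the tube are the ones that become support vectors in the fitted regression model.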
SVR has also been used successfully in other research. Wu, Ho, and Lee (2004) applied SVR to travel-time prediction, and their experiments showed that SVR performs very well for traffic data analysis. In forecasting, SVR has been adopted for real-time flood stage forecasting (Yu, Chen, and Chang, 2006); the results indicated that SVR can effectively forecast flood stage 1 to 6 hours ahead. Many other studies across disciplines use SVR as the main technique and achieve good results, including complex engineering analysis (Clarke, Griebsch and Simpson, 2005), tourism demand forecasting (Chen and Wang, 2007), electric load forecasting (Elattar, Goulermas, and Hu, 2010), electricity consumption (Kavaklioglu, 2011), software project effort (Oliveira, 2006), and objective image quality assessment (Narwaria and Lin, 2010).

Experimental Results
We used the SVM implementation from the scikit-learn library (Pedregosa et al., 2011). Source code, binaries, and documentation of scikit-learn can be downloaded from the following link: http://scikit-learn.sourceforge.net.
This research uses data on Indonesian coal mining firms to predict the probability of bankruptcy. The dataset consists of 51 observations from 17 companies for the years 2016, 2017, and 2018. The dataset is first normalized using its minimum and maximum values before being split into a training set and a testing set. The training set consists of 80% of the data, and the testing set comprises the remaining 20%. Seven input variables are used for training. In the training process, the model uses k-fold cross-validation, which further splits the training data into k folds. In each of the k iterations, one fold is chosen as the validation set, while the remaining folds are used as the actual training set for that iteration. In this model training, we used a k value of 5.
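The data preparation described above can be sketched as follows. This is an illustrative outline only: the random arrays stand in for the real firm data, scaling only the inputs (not the target) is an assumption, and the `random_state` values and the shuffled `KFold` are arbitrary choices not stated in this paper.

```python
import numpy as np
from sklearn.model_selection import KFold, train_test_split
from sklearn.preprocessing import MinMaxScaler

# Placeholder data: 51 firm-year observations with 7 input variables (random values).
rng = np.random.default_rng(0)
X = rng.normal(size=(51, 7))
y = rng.uniform(size=51)

# Min-max normalization applied to the whole dataset before splitting, as in the text.
X_scaled = MinMaxScaler().fit_transform(X)

# 80% training, 20% testing.
X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y, test_size=0.2, random_state=0
)

# 5-fold cross-validation on the training portion only.
kfold = KFold(n_splits=5, shuffle=True, random_state=0)
fold_sizes = [(len(train_idx), len(val_idx)) for train_idx, val_idx in kfold.split(X_train)]

print(X_train.shape, X_test.shape, fold_sizes)
```

A common alternative is to fit the scaler on the training set only and then apply it to the test set, so that no information from the test set enters the preprocessing.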
The evaluation metrics are MAE (Mean Absolute Error), MSE (Mean Squared Error), and R². The higher the MAE or MSE, the worse the system, whereas the higher the R², the better the system. To maximize the SVR model's performance, the parameters must be tuned to suit the data characteristics. In our approach, we used the automated parameter tuning from scikit-learn to tune the values of C and gamma in the SVR model. The C parameter is the regularization parameter of the model, while the gamma parameter is the kernel coefficient. The parameter tuning is performed multiple times to find the ideal parameters for three refit target scenarios: (1) achieving the minimum MAE, (2) achieving the minimum MSE, and (3) achieving the maximum R².
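A tuning procedure of this kind might look like the sketch below. It assumes that the automated tuning refers to scikit-learn's `GridSearchCV` with multi-metric scoring and a `refit` target, which the text does not state explicitly. The parameter grid, the RBF kernel, and the synthetic training data are hypothetical.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.svm import SVR

# Synthetic stand-in for the normalized training set (40 rows, 7 inputs).
rng = np.random.default_rng(0)
X_train = rng.uniform(size=(40, 7))
y_train = X_train @ rng.uniform(size=7) + 0.05 * rng.normal(size=40)

# Hypothetical search grid for C and gamma.
param_grid = {"C": [1, 10, 100, 1000], "gamma": [1e-3, 1e-2, 1e-1, 1]}

# scikit-learn maximizes scores, so the error metrics are negated.
scoring = {
    "mae": "neg_mean_absolute_error",
    "mse": "neg_mean_squared_error",
    "r2": "r2",
}
cv = KFold(n_splits=5, shuffle=True, random_state=0)

# One search per refit target: minimum MAE, minimum MSE, maximum R^2.
best_params = {}
for refit_metric in scoring:
    search = GridSearchCV(
        SVR(kernel="rbf"), param_grid, scoring=scoring, refit=refit_metric, cv=cv
    )
    search.fit(X_train, y_train)
    best_params[refit_metric] = search.best_params_

print(best_params)
```

Each refit target can select a different pair of C and gamma, because a configuration that minimizes the average cross-validated MAE need not be the one that maximizes the average R².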
Additionally, we included a default scenario, which uses the default parameters of the scikit-learn library. The parameters resulting from the tuning, C and gamma, are given in Table 1. The scenarios are then trained and evaluated according to the process described above. The results of the validation and evaluation phases are given in Table 2 and Figure 3. Compared to the other scenarios, the minimum MAE is achieved by Scenario 3. Although Scenario 3 is tuned with R² refit, it outperforms the MAE refit tuning of Scenario 1 by a small margin of 0.006. Scenario 2, however, does not perform well with MSE refit tuning. Although it achieves the lowest MSE in the k-fold cross-validation, it is outperformed by all other scenarios in the evaluation phase. The minimum MSE is achieved by Scenario 1 instead.

Figure 3. MAE and MSE Results of Evaluation
Even though Scenario 1 has the better MSE, by a small margin of 0.0022, the R² results show Scenario 3 to be better. The model in Scenario 3 outperformed all the other models and achieved an R² of 0.5014. The R² results of the models are shown in Figure 4. The R² metric clearly shows the performance gap between Scenario 1 and Scenario 3, which is not apparent from the other metrics, MSE and MAE. Therefore, it can be concluded that the best SVR model for this field is the model from Scenario 3, with parameters C=100000 and gamma=0.0000001.
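For reference, the three metrics compared above can be computed from predictions as in the following sketch. The target and prediction values are hypothetical and do not reproduce the results reported here; the variable names are illustrative.

```python
# Hypothetical true values and predictions on a small test set.
y_true = [0.1, 0.4, 0.6, 0.9]
y_pred = [0.2, 0.4, 0.5, 0.7]

n = len(y_true)
errors = [t - p for t, p in zip(y_true, y_pred)]

# Mean Absolute Error and Mean Squared Error: lower is better.
mae = sum(abs(e) for e in errors) / n
mse = sum(e * e for e in errors) / n

# R^2 = 1 - SS_res / SS_tot: higher is better, 1 is a perfect fit.
y_mean = sum(y_true) / n
ss_res = sum(e * e for e in errors)
ss_tot = sum((t - y_mean) ** 2 for t in y_true)
r2 = 1.0 - ss_res / ss_tot

print(mae, mse, r2)
```

Because R² divides the squared error by the variance of the target, it can separate two models whose MAE and MSE differ only in the third decimal place.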
This research finds that the Support Vector Regression (SVR) machine is the best method to predict the probability of bankruptcy. R² is an evaluation metric that indicates the strength of a relationship (Moksony, 1990). Korn and Simon (1991) stated that R² expresses explained residual variation, explained risk, and goodness of fit. This research uses R² for that purpose, and the method yields an R² of 50.14%. The corresponding coefficient of correlation, the square root of R², is more than 0.7, which means that there is a strong relationship between the independent and dependent variables (Humble, 2020). The R² value may suggest a weak model fit because it is far from the perfect value of 1. However, Roll (1988) stated that most R² values in finance are below 0.5.
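The correlation figure quoted above can be checked with a short calculation, assuming that the coefficient of correlation is taken as the square root of R². That identity holds exactly for a least-squares linear fit with an intercept and is used here only as an approximation for the SVR model.

```python
import math

# R^2 reported for Scenario 3.
r_squared = 0.5014

# Implied correlation coefficient under the square-root assumption.
r = math.sqrt(r_squared)

print(round(r, 4))
```

The result is slightly above 0.7, consistent with the statement in the text.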

Conclusion
To contribute to the solution of this issue, we propose an SVR machine learning model to predict the bankruptcy probability of a company. After parameter tuning, the SVR model is found to perform best with custom parameters C=100000 and gamma=0.0000001. With these settings, the model predicts the bankruptcy probability with an R² score of 0.5014. The performance might be improved with more data and a more sophisticated machine learning model or its ensemble counterparts.