Spectrum sensing to detect n-number of primary users using n-number of secondary users by applying support vector machine

In this paper, a cooperative spectrum sensing (CSS) model is proposed to sense n-number of primary users (PUs) using n-number of secondary users (SUs) in sequence by applying the support vector machine (SVM) algorithm with three different kernels, namely linear, polynomial and radial basis function (RBF). In this method, the fusion centre (FC) instructs all the SUs through the control channel which PU is to be sensed by sending a pre-defined primary user identification code (PUid). Each SU senses the k-th PU spectrum, and the sensing information is stored in a database at the FC. Each SU transmits a bit '0' or a bit '1' along with the PU sensing information to the FC to indicate whether or not it needs a spectrum band to transmit data. Each SU also adds two identification codes to the sensing information, which indicate which SU the sensing information came from and which PU was sensed. For simulation, 500 data samples are used, and the simulation results show an accuracy of 96% and a false alarm value of 1.3% in classifying the SU sensing information at the FC using the RBF kernel. A second method is proposed with multiclass classification by applying the SVM algorithm with the RBF kernel. The confusion region class is classified with zero false alarm percentage, and an accuracy of 99.3% is achieved in classifying the SU sensing information at the FC.


Introduction
In the late twentieth century and at the beginning of the twenty-first century, major innovations in telecommunication brought revolutionary change to the field. Placing satellites in orbit, wireless cellular phones and major developments in the optical domain took telecommunication to the next generation. Major research is focused on the effective utilization of bandwidth and on techniques that let a large number of users access the available spectrum. Digital modulation techniques such as Quadrature Phase Shift Keying (QPSK) and Gaussian Minimum Shift Keying (GMSK) improved data throughput, and multiplexing techniques such as Orthogonal Frequency Division Multiplexing (OFDM) made it possible to accommodate a large number of users in a single channel. The electromagnetic spectrum is a natural resource that can be neither created nor destroyed, so countries with huge populations, such as India, have to use the available spectrum resource effectively. In 2002, the Federal Communications Commission (FCC) in the United States prepared a report [1] on the utilization of the available spectrum, published by the Spectrum Policy Task Force, which shows that the majority of the spectrum band remains unutilized, as shown in Figure 1.
The arrival of smartphones and the Android Operating System (OS) turned cellular phones into devices that share information through various applications (apps) rather than being used mainly for voice calls. The shared information can be data, images, online video streaming, on-demand entertainment channels and so on, which creates a large bandwidth requirement. A report by the Telecom Regulatory Authority of India (TRAI) [2] stated that there are more than 500 million mobile phone users in India, and this number may increase in future. The technologies available at present may not be able to handle this situation, so countries like India decided to stop unutilized services such as the telegraph from July 15, 2013 to meet the spectrum demand. Further, researchers have been focusing on developing techniques to utilize the unused spectrum effectively. In 1999, Mitola discussed the concept of the "pooled radio spectrum" [3], and in 2005 Haykin [4] coined the phrase "spectrum hole", or white space, which can be used by unlicensed users without affecting the licensed user, and outlined the challenges that may arise in realising this model. Several spectrum sensing methods proposed over a decade of research are reviewed in [5,6], including different types of narrowband sensing, wideband sensing and compressive sensing, their performance measurement, and their applications with advantages and disadvantages. In addition, the hardware challenges and the different standards employed in sensing are discussed there. In this paper, a cooperative spectrum sensing (CSS) model is proposed, with a fusion centre (FC) to detect the spectrum hole when n-number of secondary users (SUs) try to sense n-number of primary users (PUs). A supervised machine learning (ML) algorithm, namely the support vector machine (SVM), is used with 500 data samples, taking "The Mobile Phone Activity Dataset" [7] as reference, to detect the spectrum hole.
Finally, the SU sensing information is classified at the FC, based on the kernel applied in the SVM algorithm, to identify how many SUs report that the spectrum band is vacant and how many report that it is occupied by the PU. The remainder of this paper is organised as follows. Section 2 provides an overview of the existing CSS methods and ML techniques used in spectrum sensing. It also describes the proposed CSS model, which is based on the SVM algorithm. Section 3 proposes a multiclass classification using SVM to overcome the multi-threshold problem in spectrum sensing. Section 4 compares the performance of the two proposed models discussed in the previous sections, and Section 5 presents the simulation results along with the performance metric curves. Finally, Section 6 presents the conclusion of the work.

Framework of CSS model
A conventional CSS model [8] is shown in Figure 2. The model has one PU transmitter (PUTX), one PU receiver (PURX), one or more SUs or Cognitive Radio (CR) users that sense the PUTX, and an FC that receives the sensing information from all SUs. The CR users sense the transmission from the PUTX at every instant of time and pass this information to the FC. The FC receives the sensing information from all the CR users through the control channel and applies a mathematical model to determine the presence or absence of the PUTX. This information is shared with all SUs, and the FC decides which SU is allowed to utilise the available spectrum band. This type of decision is called soft combination. In the other method, called hard combination, each SU takes a decision by itself and conveys the result to the FC. In the system model, the signal transmitted by the PUTX is x(t) and the signal received at each SU is y(t), with h(t) as the channel gain of the sensing channel and n(t) as zero-mean additive white Gaussian noise. Identifying the presence of a spectrum hole is then formulated as a binary hypothesis problem, given by equation 1 as

$$ y(t) = \begin{cases} n(t), & H_0 \\ h(t)\,x(t) + n(t), & H_1 \end{cases} \qquad (1) $$

Here, H0 and H1 indicate that the PUTX spectrum band is vacant and occupied, respectively.

Figure 3.

This method is further improved using a double threshold, an eigenvalue-based double threshold and the Sevcik fractal dimension. The dynamic threshold method, energy-efficient spectrum sensing and a few other advanced methods for CSS are discussed in [9,10,11,12,13]. In previous work on CSS using machine learning (ML) [14], the input signal is classified into predefined classes, and the H0 hypothesis on the PU received signal is divided into k discrete regions by using k reliability thresholds. The residual energy of the SU node battery is classified into discrete regions based on the threshold. The result shows an increase in throughput when compared with the two-class hypothesis model.
In [15], blind spectrum sensing is performed using a deep learning method. In this method, three types of neural networks are applied, namely a convolutional neural network, a long short-term memory network and a fully connected neural network. Sensing was performed in the region of -9 dB to -5 dB, and the false alarm rate was 0.1. In [16], CSS based on soft combination is considered, together with an investigation of the effective channel gain between the SU and the FC. The optimal weight combining is obtained from the linear detection probability, and to increase the sensing performance, the optimal power allocation is obtained by maximising the optimal weight. In [17], the k-nearest neighbour algorithm is used for spectrum sensing. The SUs sense the PUs under varying conditions and send a signal to the FC. The FC decides whether or not the PU has a vacant spectrum band based on the sensing information received from all SUs, applying a hard combination decision rule. The above-described methods show how ML is used to sense a spectrum hole in a single PU, and how maximising the optimal weight increases the sensing performance.
2.2 Proposed cooperative spectrum sensing model using SVM
The proposed system model is an extension of our previous work [18]. Figure 4 contains n-number of primary user transmitters (PUs), an FC and n-number of SUs. All n-number of SUs sense the k-th PU for the availability of a vacant spectrum band. In our system model, the FC is assumed to be aware of the number of PUs present in the cell structure, and each PU is allocated a three-bit identification code to distinguish it from the others. The FC shares this PU identification code (PUid) with all SUs over the control channel and so instructs them which PU is to be sensed for a vacant spectrum band. In this scenario, only a few SUs may be looking for a vacant spectrum band to transmit data. A few other SUs may not require any channel because they have no data to transmit, and a few more may already be utilizing the vacant spectrum band of another PU in the same cell. A soft decision fusion model is used, so all the n-number of SUs forward their sensing information to the FC. An SU that does not need a spectrum band informs the FC by transmitting a bit '0' through the control channel, and an SU that needs one transmits a bit '1'. The remaining SUs sense the k-th PU spectrum band to identify whether the k-th PU is vacant or busy, and transmit the sensing information to the FC along with two identification codes, SUid and PUid. In this way, SUs that provide wrong information about the PU status can be eliminated to improve the system performance. The FC then applies SVM to the sensing information collected from all n SUs, which is stored as a database containing SUid, PUid, SNR, distance and threshold as attributes.
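The reporting scheme described above can be illustrated with a minimal sketch. The record layout follows the database attributes named in the text (SUid, PUid, SNR, distance, threshold) plus the band-request bit. The class name `SensingReport`, the helper functions, the units (dB, metres) and all sample values are hypothetical and are not taken from the original implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SensingReport:
    su_id: int         # SUid: which SU sent the report
    pu_id: int         # PUid: three-bit code of the PU that was sensed
    needs_band: int    # 1 if the SU needs a spectrum band, 0 otherwise
    snr_db: float      # assumed unit: dB
    distance_m: float  # assumed unit: metres
    threshold: float   # pre-threshold value stored with the record


def accept_reports(reports, instructed_pu_id):
    """Keep only reports whose PUid matches the PU the FC asked for."""
    if not 0 <= instructed_pu_id <= 0b111:
        raise ValueError("PUid must fit in three bits")
    return [r for r in reports if r.pu_id == instructed_pu_id]


def requesting_sus(reports):
    """Return the SUids that signalled a need for a band (bit '1')."""
    return sorted(r.su_id for r in reports if r.needs_band == 1)


# Hypothetical reports; SU 3 sensed a different PU than instructed.
reports = [
    SensingReport(1, 0b011, 1, -6.0, 120.0, -7.0),
    SensingReport(2, 0b011, 0, -8.5, 300.0, -7.0),
    SensingReport(3, 0b101, 1, -5.0, 80.0, -7.0),
]
database = accept_reports(reports, instructed_pu_id=0b011)
print([r.su_id for r in database], requesting_sus(database))
```

Keeping the PUid in every record lets the FC reject a mismatched report before it reaches the classifier, which is one way to realise the elimination of wrong reports mentioned above.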
Figure 5 shows the block diagram of the proposed model, where the sensing information of the n-number of SUs is received by the FC at different time intervals using Time Division Multiple Access (TDMA). It is then demodulated and converted from the frequency domain to the time domain to calculate the signal power of the k-th PU from the information transmitted by the various SUs. The pre-threshold value added to the database is updated based on the present SU data, because signal strength values may change depending on the environmental conditions. SVM is then applied to the database to classify how many SUs convey that the PU is vacant and how many convey that it is busy. Each time slot is divided into two sub-slots, sensing time (st) and transmission time (tt), as shown in Figure 6. The transmission time includes the three-bit PUid, so if the PUid differs from the one the FC instructed through the control channel, the FC can discard that SU's sensing information. Since soft decision is used in our system model, all n-number of SUs transmit the sensed information without making any local decision, and the FC takes the final decision on whether the k-th PU spectrum band is vacant or busy. In the proposed system model, the PU signal may be present or absent in the signal received by the i-th SU (∀ i = 1, 2, 3, …, n):

$$ y_i[t] = \begin{cases} n_i[t], & H_0 \\ h_i[t]\,x[t] + n_i[t], & H_1 \end{cases} $$

The method is partly based on that described in [8]. The energy of the signal sensed by each SU is estimated over L samples, where L is the total number of samples received during the sensing time st1, and the corresponding signal-to-noise ratio (SNR) value is obtained. The channel is assumed to have flat fading, and its gain h_i[t] depends on the distance between the PU and each SU. The estimated energy Se and the pre-threshold value λ are considered as input features and are taken as the coordinates of the vector x. The estimated energy Se is compared with the pre-threshold value λ to detect the presence of a spectrum hole in the k-th PU. The direction of the vector x = (Se, λ) that maps the values of its coordinates is taken as the weight vector w, and b is considered as the bias.
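The comparison of the estimated energy Se with the pre-threshold value λ can be sketched as follows. The sketch assumes the common average-energy statistic Se = (1/L) Σ |y[t]|², which is an assumption rather than a formula recovered from the text. The simulated PU signal, the gain, the noise level, the threshold value and the rule "busy when Se exceeds λ" are hypothetical choices made only for illustration.

```python
import random


def estimate_energy(samples):
    """Average energy S_e = (1/L) * sum(|y[t]|^2) over the L sensed samples."""
    if not samples:
        raise ValueError("at least one sample is required")
    return sum(abs(y) ** 2 for y in samples) / len(samples)


def sense(samples, threshold):
    """Return (S_e, label); 'busy' when the energy exceeds the threshold."""
    s_e = estimate_energy(samples)
    return s_e, ("busy" if s_e > threshold else "vacant")


def simulate_received(num_samples, gain, amplitude, noise_std, pu_present, rng):
    """Draw y[t] = h*x[t] + n[t] (PU present) or y[t] = n[t] (PU absent)."""
    received = []
    for _ in range(num_samples):
        noise = rng.gauss(0.0, noise_std)
        # Hypothetical PU signal: random +/- amplitude symbols.
        x = amplitude * rng.choice((-1.0, 1.0)) if pu_present else 0.0
        received.append(gain * x + noise)
    return received


rng = random.Random(0)  # fixed seed so the sketch is repeatable
lam = 2.5               # hypothetical pre-threshold value
for present in (False, True):
    y = simulate_received(2000, gain=1.0, amplitude=2.0, noise_std=1.0,
                          pu_present=present, rng=rng)
    s_e, label = sense(y, lam)
    print(f"PU present={present}: S_e={s_e:.2f} -> {label}")
```

In the proposed model the pair (Se, λ) is passed on to the SVM as a feature vector rather than decided on locally; the local label here only shows what the threshold comparison means.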
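In general, the line that separates two-dimensional data is $y = ax + b$. Taking $\mathbf{x} = (x_1, x_2)$ and $\mathbf{w} = (a, -1)$, the equation becomes

$$ \mathbf{w} \cdot \mathbf{x} + b = 0 $$

The plane that separates a vacant PU spectrum band from a busy one is formed as a hypothesis problem given by

$$ \mathbf{w} \cdot \mathbf{x} + b < 0 \quad \text{as class } -1, \text{ where the PU is busy} $$
$$ \mathbf{w} \cdot \mathbf{x} + b \ge 0 \quad \text{as class } +1, \text{ where the PU is vacant} $$

In order to select the best plane to separate the two classes, an optimization is done by adding a slack variable $\xi_i$ to the constraints,

$$ y_i(\mathbf{w} \cdot \mathbf{x}_i + b) \ge 1 - \xi_i, \quad i = 1, 2, \ldots, n $$

where $y_i \in \{-1, +1\}$ is the class label of the i-th sample, and L1 regularisation is applied. A regularization variable R is added for optimization, and the problem becomes

$$ \min_{\mathbf{w},\, b,\, \xi} \; \frac{1}{2}\lVert \mathbf{w} \rVert^2 + R \sum_{i=1}^{n} \xi_i $$

A Lagrange multiplier $\alpha_i$ is applied to convert the optimization problem into a dual problem,

$$ \max_{\alpha} \; \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j \, (\mathbf{x}_i \cdot \mathbf{x}_j) $$

where $\alpha_i$ lies between 0 and R. To solve this optimization problem, consider only the product of the two variables $\mathbf{x}_i \cdot \mathbf{x}_j$ in the above equation. This dot product is considered as the kernel, which is used to classify the sensed signal data transmitted by each SU. In the proposed model, linear, polynomial and Radial Basis Function (RBF) kernels are applied to the information transmitted by the SUs to classify whether the PU spectrum is vacant or busy. The linear kernel is the simplest kernel function, and the inner product is given by

$$ K(\mathbf{x}_i, \mathbf{x}_j) = \mathbf{x}_i \cdot \mathbf{x}_j + c $$

The polynomial kernel function is given by

$$ K(\mathbf{x}_i, \mathbf{x}_j) = (\mathbf{x}_i \cdot \mathbf{x}_j + c)^d $$

The RBF kernel function is given by

$$ K(\mathbf{x}_i, \mathbf{x}_j) = \exp\left(-\gamma \lVert \mathbf{x}_i - \mathbf{x}_j \rVert^2\right) $$

3. Multi-class classification using support vector machine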
The proposed CSS system model is shown in Figure 7, where n-number of PUs, n-number of SUs and an FC are present. All SUs sense the k-th PU for a vacant spectrum band and transmit the sensed information to the FC through a dedicated channel. The FC receives the sensed information from the n-number of SUs in different time slots using Time Division Multiple Access (TDMA). It is then demodulated and converted to the time domain using the Inverse Fourier Transform (IFT) to estimate the signal power information transmitted by each SU. The sensed information of all SUs is stored in a database, as shown in Figure 8. The multiclass pre-threshold value is updated based on the present dataset values, which increases the detection accuracy. Because the SUs are located at different positions in the sensing zone, the distance between each SU and the k-th PU varies. SUs located farthest from the k-th PU may receive a weak signal, and their sensed information may be false due to fading. The FC has to decide, from the information sensed by the n-number of SUs, whether the k-th PU has a vacant spectrum band. The FC applies SVM to the dataset sensed by all n-number of SUs to classify the sensed information into three classes: SUs conveying that the PU has a vacant spectrum band, SUs conveying that the PU is busy, and a third class for PUs whose SNR value falls very close to the threshold region. In the third class it is difficult to decide because of channel fading, and an SU may also misguide the FC when it has no data to transmit. The multiclass SVM uses the one-against-all approach.
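One-against-all classification evaluates one binary decision function per class and picks the class with the largest score. The sketch below illustrates this with an RBF kernel. The one-dimensional feature (SNR minus threshold, in dB), the shared support vectors, the coefficients, the biases and the value of gamma are all hand-picked, hypothetical values rather than a trained model from this work.

```python
import math


def rbf_kernel(u, v, gamma):
    """RBF kernel K(u, v) = exp(-gamma * ||u - v||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


def decision_score(x, support_vectors, coeffs, bias, gamma):
    """f(x) = sum_i coeff_i * K(sv_i, x) + bias, with coeff_i = alpha_i * y_i."""
    return sum(c * rbf_kernel(sv, x, gamma)
               for sv, c in zip(support_vectors, coeffs)) + bias


def one_against_all(x, support_vectors, models, gamma):
    """Score x with every per-class model and return (best_label, scores)."""
    scores = {label: decision_score(x, support_vectors, coeffs, bias, gamma)
              for label, (coeffs, bias) in models.items()}
    return max(scores, key=scores.get), scores


# Hypothetical feature: SNR minus threshold in dB (1-D for readability).
SUPPORT_VECTORS = [(3.0,), (0.0,), (-3.0,)]
MODELS = {  # label -> (coefficients over SUPPORT_VECTORS, bias)
    "PU busy": ([1.0, -0.5, -0.5], 0.0),
    "confusion region": ([-0.5, 1.0, -0.5], 0.0),
    "PU vacant": ([-0.5, -0.5, 1.0], 0.0),
}

for margin_db in (2.5, 0.2, -4.0):
    label, _ = one_against_all((margin_db,), SUPPORT_VECTORS, MODELS, gamma=0.5)
    print(f"SNR - threshold = {margin_db:+.1f} dB -> {label}")
```

Each per-class model gives a positive weight to its own region and negative weights to the other two, so a sample near the threshold scores highest for the confusion region instead of being forced into the busy or vacant class.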

Performance analysis of proposed methods
4.1 Spectrum sensing using SVM
ML is applied to predict future output from a collected dataset, which supports statistical decision making and is used to solve data mining problems. In this paper, a supervised ML algorithm called SVM is applied, where the dataset is divided into a training set and a test set to predict the future values of the system, as shown in Figure 9. A mobile data set [19] is used as a reference to analyse the performance of our system model. The signal transmitted from the k-th PU is sensed by the n-number of SUs as discussed in Section 2.2, and the signal power is calculated at the FC and stored in a database. The attributes considered to classify the SUs are the signal-to-noise ratio (SNR), PUid and SUid; the pre-threshold value is updated from the current database and applied to the SVM algorithm. The predictions fall into four categories, true positive (TP), true negative (TN), false positive (FP) and false negative (FN), which form a confusion matrix as given in Table 1. In the proposed method, the FC has to decide whether the PU spectrum band is occupied or vacant, so in the confusion matrix the TP region is considered as PU busy and the TN region as PU vacant. The FN region indicates a PU that is actually busy but wrongly predicted as vacant, and the FP region indicates a PU that is actually vacant but predicted as busy. In this method the FC has to identify how many SU sensed signals convey that the PU is busy, so the TP and FN regions are given importance in the confusion matrix. If a vacant PU is wrongly predicted as busy, the system performance is not affected; but if a PU that is actually occupied is wrongly predicted as vacant by the FC and the SU starts transmitting, the PU system performance is affected. Table 2 gives the confusion matrix formed using the linear kernel, and Table 3 gives the false alarm rate and the percentage of detection using the linear kernel.
Similarly, Table 4 and Table 6 give the confusion matrices formed using the polynomial kernel and the RBF kernel, and Table 5 and Table 7 give the false alarm rate and the percentage of detection using the polynomial kernel and the RBF kernel, respectively.
4.2 Performance analysis of multiclass classification using SVM
The system model described in Section 3 forms a multiclass classification problem, applying the SVM algorithm with the RBF kernel to sort the information sensed by the SUs into three classes: PU busy, confusion region and PU vacant. Here a new class called the confusion region is introduced: when the SNR value of the PU is very close to, or equal to, the threshold value, it is classified into the confusion region. In this way, cases where a PU that is actually busy is wrongly predicted as vacant, which increase the false alarm percentage, can be eliminated. Table 8 gives the (3,3) confusion matrix for the three-class classification problem. If TP is considered as PU busy, then matrix position (1,1) is taken as the TP value, and the TN values are in matrix positions (2,2), (2,3), (3,2) and (3,3). The FN region, indicating a PU that is actually busy but wrongly predicted as vacant, is given by matrix positions (2,1) and (3,1), and the FP region, indicating a PU that is actually vacant but wrongly predicted as busy, is given by matrix positions (1,2) and (1,3). In this classification method, as mentioned in Section 4.1, what matters is the FC's ability to classify a busy PU correctly (the TP region) and to avoid the FN region, where the PU is actually busy but wrongly predicted as vacant, which affects the system performance by increasing the false alarm percentage. Therefore, the TP and TN regions are given higher priority than the FP and FN regions.
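The mapping from the (3,3) confusion matrix to TP, FP, FN and TN for the PU busy class can be written out directly from the matrix positions listed above. In this sketch the function name is hypothetical and the example matrix contains made-up counts, not values from Table 8.

```python
def busy_class_counts(matrix):
    """Collapse a 3x3 confusion matrix into TP/FP/FN/TN for the PU busy class.

    The paper's 1-based position (r, c) corresponds to matrix[r - 1][c - 1].
    """
    if len(matrix) != 3 or any(len(row) != 3 for row in matrix):
        raise ValueError("expected a 3x3 matrix")
    tp = matrix[0][0]                                              # (1,1)
    fp = matrix[0][1] + matrix[0][2]                               # (1,2), (1,3)
    fn = matrix[1][0] + matrix[2][0]                               # (2,1), (3,1)
    tn = matrix[1][1] + matrix[1][2] + matrix[2][1] + matrix[2][2]  # remaining 2x2
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}


# Made-up counts for 30 test samples, for illustration only.
example = [
    [10, 1, 0],
    [1, 5, 0],
    [0, 0, 13],
]
counts = busy_class_counts(example)
print(counts)
```

The four counts always add up to the number of test samples, which gives a quick consistency check on any table built this way.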
Table fragment (of the five result columns only the heading of the last one, TN, survives; "-" marks an empty cell; the final row is incomplete):

No. | Data samples | Training samples | Test samples |   |    |   |   | TN
1   | 10  | 7   | 3  | 0 | 1  | - | - | 2
2   | 20  | 14  | 6  | 0 | 3  | - | - | 3
3   | 30  | 21  | 9  | 0 | 3  | - | - | 6
4   | 40  | 28  | 12 | 0 | 3  | - | - | 9
5   | 50  | 35  | 15 | 0 | 5  | - | - | 10
6   | 60  | 42  | 18 | 0 | 7  | - | - | 11
7   | 70  | 49  | 21 | 1 | 10 | 1 | - | 10
8   | 80  | 56  | 24 | 1 | 7  | 1 | - | 16
9   | 90  | 63  | 27 | 0 | 7  | - | - | 20
10  | 100 | 70  | 30 | 1 | 11 | - | 1 | 18
11  | 150 | 105 | 45 | 1 |

Table 9, Table 11 and Table 13 show the confusion matrices formed using the multiclass SVM algorithm with the RBF kernel for the three multiclass regions, namely the PU busy class, the confusion region and the PU vacant class. Table 10, Table 12 and Table 14 give the percentage of false alarm and the percentage of detection for the multiclass regions. From Table 13, the objective of classifying the confusion region class without any false alarm is achieved, since the value obtained is nil. The other two class regions may show a small percentage of false alarm, but the purpose of multiclass classification is to set aside the PUs whose SNR values fall near the threshold value, because considering those PUs when detecting the vacant spectrum band may increase the false alarm rate. The fact that the confusion region is classified without any false alarm therefore indicates that the system performance is improved by applying multiclass classification.

Simulation results
In this section, the proposed CSS method is simulated using Python. In this method, n-number of SUs sense the k-th PU for a vacant channel and transmit the information to the FC, where the SVM algorithm classifies the data sensed by the SUs to identify whether the k-th PU is vacant or busy. The graphs of the performance metrics mentioned in [20] are discussed here.

Results using SVM algorithm
The performance of the SVM algorithm in classifying the information sensed by the n SUs at the FC is evaluated with three different kernels, namely linear, polynomial and RBF, using training and test data samples ranging from 10 to 500. Only the test data simulation results are discussed in this section. The classification of the SU information in Python for 200, 400 and 500 data samples, with 50, 100 and 125 corresponding test data samples, is shown for all three kernels in Figures 10 to 18. Table 15 shows the accuracy of predicting the PU busy class using the three kernels mentioned in Section 2. Among the three kernels, the linear kernel shows better performance, but most of the time the system will not behave linearly. The RBF kernel performs better than the polynomial kernel, with a maximum accuracy of 96% for 300 data samples with 75 corresponding test data samples. Figure 19 shows the percentage of accuracy in classifying the PU busy class with the three kernels. The graph shows that a few test sets are classified with 100% accuracy by both the linear and the RBF kernel, and that when the test set is increased to 125 samples, the accuracy of the RBF kernel falls from 96% to 93.6%. Figure 20 shows the percentage of false alarm using the three kernels. A few test sets are classified without any misclassification using the RBF kernel; for 75 test samples a false alarm value of 1.3% is obtained, which increases to 5.6% for 125 test samples.
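As an arithmetic check on the RBF figures quoted above, the sketch below assumes that each percentage is a count divided by the number of test samples. The counts used (72 correct and 1 false alarm out of 75; 117 correct and 7 false alarms out of 125) are inferred from the reported percentages and are not taken from the source tables.

```python
def percentage(count, total):
    """Return count as a percentage of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


# (test samples, inferred correct predictions, inferred false alarms)
inferred = [
    (75, 72, 1),
    (125, 117, 7),
]
for total, correct, false_alarms in inferred:
    acc = percentage(correct, total)
    fa = percentage(false_alarms, total)
    print(f"{total} test samples: accuracy {acc:.1f}%, false alarm {fa:.1f}%")
```

Under this assumption the reported values correspond to whole numbers of samples: 96% and 1.3% for 75 test samples, and 93.6% and 5.6% for 125 test samples.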

Results using multiclass SVM algorithm
The method proposed in Section 3 using multiclass classification is implemented in Python, and the accuracy of predicting the PU busy class, the confusion region and the PU vacant class is shown in Table 16. The results show that, using the RBF kernel, a maximum accuracy of 98.6% is achieved in classifying the PU busy class, whereas the confusion region and the PU vacant class are each classified with a maximum accuracy of 99.3%, in all cases for 500 data samples with 150 corresponding test data samples. The confusion region has a minimum accuracy of 95.2% for 70 data samples with 21 corresponding test data samples, whereas the PU vacant class has a minimum accuracy of 96.6% for 100 data samples with 30 corresponding test data samples.

Conclusion
The proposed CSS method senses n-number of PUs using n-number of SUs in sequence by applying the SVM algorithm with linear, polynomial and RBF kernels. Simulation results show that the RBF kernel achieves 93.6% accuracy for the maximum of 500 data samples with 125 corresponding test data samples. Although a few data sets are classified with 100% accuracy, a maximum accuracy of 96% is achieved for 300 data samples with 75 corresponding test data samples. When the number of data samples is increased beyond 300, the accuracy in classifying the SU sensing information falls to 93.6%. The second proposed method, multiclass classification by applying the SVM algorithm with the RBF kernel, achieves zero false alarm percentage for the confusion region class. This indicates that, by using multiclass classification, the PUs whose SNR values are very close to the threshold value are classified without false alarm, which increases the system performance. The accuracy is 98.6% for the PU busy class, 99.3% for the confusion region class and 99.3% for the PU vacant class. This work can be extended to scheduling the identified spectrum holes from different PUs to different SUs in a CSS system model.