Automatic classification and accurate identification of stellar spectra is an active research topic in astronomy. In this paper, we propose a high-performance algorithm, BEPC-RF, for the automatic classification of stellar spectra. The algorithm performs dual feature extraction based on a Transformer and Principal Component Analysis (PCA), and consists of three steps. First, PCA is applied to the denoised and normalized stellar spectra to extract the feature vector Fpca. Second, the pseudo-continuum, fitted by polynomial fitting, is removed to obtain the spectral line information; the entire wavelength range is then divided into 10 bands, and the trapezoidal integral over each band is computed, giving the 10-element vector Finte. A Transformer model then performs feature learning on Finte to obtain a second feature vector, Ftrans. Finally, each spectrum is represented by the combined feature vector Fspectrum = (Ftrans, Fpca), which is input to the Random Forest (RF) algorithm to classify the stellar spectra automatically. The experimental results show that BEPC-RF achieves a final classification accuracy of 0.919, which is significantly higher than the accuracies of the KLDA (0.8), KSVM (0.84), Decision Tree (0.806), XGBoost (0.836), AdaBoost (0.83), Bayes (0.777), KNN (0.821), and RF (0.85) algorithms.
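
The feature construction described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: the polynomial degree (5), equal-width bands defined by sample index, the number of principal components (5), and the synthetic spectra. The helper names band_integrals and pca_features are hypothetical. The Transformer step is not reproduced; Finte is passed through unchanged as a stand-in for Ftrans.

```python
import numpy as np

def band_integrals(wavelength, flux, n_bands=10, poly_degree=5):
    """Trapezoidal integrals of the continuum-subtracted flux over n_bands bands."""
    # Rescale wavelength to [-1, 1] so the polynomial fit is well conditioned.
    span = wavelength.max() - wavelength.min()
    x = 2.0 * (wavelength - wavelength.min()) / span - 1.0
    # Fit a polynomial pseudo-continuum and subtract it (degree is an assumption).
    continuum = np.polyval(np.polyfit(x, flux, poly_degree), x)
    residual = flux - continuum
    # Contiguous bands of (nearly) equal width; neighbours share an edge sample.
    edges = np.linspace(0, len(wavelength) - 1, n_bands + 1).astype(int)
    values = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        w = wavelength[lo:hi + 1]
        r = residual[lo:hi + 1]
        # Trapezoidal rule written out explicitly.
        values.append(0.5 * np.sum((r[1:] + r[:-1]) * np.diff(w)))
    return np.array(values)

def pca_features(spectra, n_components):
    """Project mean-centred spectra onto the leading principal components."""
    centered = spectra - spectra.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

# Synthetic stand-in data: sloped continuum, one absorption line, small noise.
rng = np.random.default_rng(0)
wavelength = np.linspace(4000.0, 9000.0, 500)
n_spectra = 20
slopes = rng.uniform(-0.3, 0.3, size=(n_spectra, 1))
depths = rng.uniform(0.3, 0.6, size=(n_spectra, 1))
x = (wavelength - 6500.0) / 2500.0
line = np.exp(-0.5 * ((wavelength - 6563.0) / 20.0) ** 2)
noise = rng.normal(0.0, 0.01, size=(n_spectra, wavelength.size))
spectra = 1.0 + slopes * x - depths * line + noise

f_pca = pca_features(spectra, n_components=5)                      # Fpca
f_inte = np.stack([band_integrals(wavelength, s) for s in spectra])  # Finte
f_trans = f_inte  # placeholder: the paper learns Ftrans with a Transformer
f_spectrum = np.hstack([f_trans, f_pca])                           # Fspectrum
print(f_spectrum.shape)
```

Each row of f_spectrum is one combined feature vector. With labelled spectra, such a matrix could be passed to a random forest classifier, for example scikit-learn's RandomForestClassifier, for the final classification step.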