Because steam turbine faults occur frequently and cause heavy losses, it is important to identify the fault category. A clustering-based steam turbine fault diagnosis method that combines t-distributed stochastic neighbor embedding (t-SNE) and extreme gradient boosting (XGBoost) is proposed. First, the t-SNE algorithm maps the high-dimensional data to a low-dimensional space, where the data are clustered. The clusters are then compared with the fault records of the power plant to separate fault data from healthy data. Next, the class imbalance in the data is handled with the synthetic minority over-sampling technique (SMOTE), which yields a steam turbine feature data set with fault labels. Finally, XGBoost is used to solve the resulting multiclass classification problem. In the experiment, the method achieved the best performance, with an overall accuracy of 97% and early warning at least two hours in advance. The experimental results show that the method can effectively evaluate equipment state and issue fault warnings for power plant equipment.
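To make the over-sampling step concrete, the following is a minimal sketch of the interpolation idea behind SMOTE, written in plain Python. It is a simplified illustration and not the implementation used in this work: the function name `smote_oversample`, its parameters, and the sample values are all hypothetical, and a real pipeline would more likely rely on an established library.

```python
import math
import random


def smote_oversample(minority, n_synthetic, k=3, seed=0):
    """Create n_synthetic points by interpolating between minority-class
    samples and their k nearest minority-class neighbours (simplified SMOTE)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical 2-D feature vectors for one fault class (made-up values).
fault_points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_oversample(fault_points, n_synthetic=6, k=2)
```

Because every synthetic point lies on a line segment between two real fault samples, the minority class is enlarged without leaving the region it already occupies, which is why this kind of over-sampling is often preferred to simple duplication before training a classifier.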