The application of near-infrared spectroscopy (NIRS) to rapid quantitative analysis of soil total nitrogen (STN) is of great significance for nitrogen cycling in the ecosystem and for crop growth. However, collecting and chemically annotating thousands of soil samples is impracticable and, more importantly, undermines the advantages of NIRS as a rapid, low-cost and nondestructive technique. To improve estimation performance and reduce model uncertainty under small sample sizes, new solutions based on soil spectral data augmentation and model fusion were investigated. Specifically, 123 Latosol samples were collected and separated by particle size to generate additional data at multiple scales. Subsequently, multi-modal fusion methods were implemented on the augmented data. The results showed that the proposed method increased the scale of the spectral data, mined more STN-related spectral information, improved estimation accuracy, and reduced uncertainty. The fusion model based on augmented data yielded the best estimates on the validation set (root mean square error (RMSE) = 0.082 g kg⁻¹, R² = 0.779, and ratio of performance to interquartile distance (RPIQ) = 3.428). In addition, we explored the impact of different model fusion strategies and of the proposed soil data augmentation method on STN estimation. In 10-fold cross-validation, the weighted fusion model using augmented data increased the cross-validated R² (R²cv) by 0.285 and the RPIQ by 1.015 relative to a model built with a conventional machine learning (ML) technique on the original limited data (R²cv = 0.432, RPIQ = 2.294). Therefore, for building rapid and accurate NIRS-based STN predictive models, the proposed method shows great potential for improving model reliability under small sample sizes.
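For readers unfamiliar with the reported metrics, the following is a minimal sketch of how RMSE, R² and RPIQ are commonly computed, together with a simple weighted average of model predictions. It is illustrative only: the function names, the toy values and the equal weights are assumptions, RPIQ is taken in its usual form (interquartile range of the observed values divided by RMSE), and the abstract does not specify how the fusion weights in this work were actually chosen.

```python
import math
import statistics


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = statistics.fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rpiq(observed, predicted):
    """Interquartile range of the observed values divided by RMSE."""
    q1, _, q3 = statistics.quantiles(observed, n=4, method="inclusive")
    return (q3 - q1) / rmse(observed, predicted)


def weighted_fusion(predictions, weights):
    """Weighted average of per-model predictions.

    `predictions` is a list of equally long prediction lists, one per model.
    `weights` holds one non-negative weight per model (hypothetical values);
    they are normalized here so that they sum to 1.
    """
    total = sum(weights)
    n_samples = len(predictions[0])
    return [
        sum(w * p[i] for w, p in zip(weights, predictions)) / total
        for i in range(n_samples)
    ]


# Toy values for illustration only; these are not data from the study.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
model_a = [1.1, 1.9, 3.2, 3.8, 5.1]
model_b = [0.9, 2.1, 2.8, 4.2, 4.9]

fused = weighted_fusion([model_a, model_b], weights=[0.5, 0.5])
print("model A:", rmse(observed, model_a), r_squared(observed, model_a), rpiq(observed, model_a))
print("fused  :", fused)
```

In this toy case the two models err in opposite directions, so their average lies closer to the observations than either model alone, which is the intuition behind fusing complementary estimators.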