Preterm birth (PTB) is a leading cause of neonatal mortality, yet it remains difficult to predict with current methods, which focus on traditional risk factors or biomarkers; this is particularly true for spontaneous preterm birth (sPTB). Here, we developed a machine-learning framework based on untargeted cervicovaginal fluid proteomics from 707 individuals (136 with sPTB and 571 with full-term birth, FTB) across five hospitals. Our analysis identified 293 proteins with potential predictive value, which we distilled into a robust predictive model composed of five protein markers (LUM, AMBP, B2M, FN1, and TIMP1). The model predicted the risk of sPTB with area under the receiver operating characteristic curve values of 0.89, 0.90, 0.92, and 0.94 in independent validation datasets from four hospitals, with an overall sensitivity of 0.73 and specificity of 0.92. Moreover, we reproduced the predictive performance of these five biomarkers using enzyme-linked immunosorbent assay (ELISA). At the systems level, by integrating proteomics data from all hospitals, we delineated two distinct molecular subtypes of sPTB with functional differences. We also conducted a drug target analysis that identified potential preventive drugs for sPTB. In summary, this research expands the repertoire of sPTB prediction biomarkers, offers a novel non-invasive predictive model for sPTB, and provides valuable data resources for further investigation into the pathogenesis of preterm birth.