The prognostic value of MRD during treatment has been demonstrated in numerous studies of patients with newly diagnosed childhood and adult ALL (3, 18) or relapsed ALL (19–21), and of patients undergoing hematopoietic stem cell transplantation (22, 23). Despite these encouraging findings, MRD monitoring showed limited utility after induction in most studies, probably because of sequence variation arising from the evolution of IGH clones and because of immunophenotypic drift. Consequently, there is an urgent need to identify valuable predictive biomarkers for MRD monitoring from large-scale next-generation sequencing data. In the current study, the IGH rod-like tracer consensus sequence was extracted on the basis of its rod-like alpha-helical structure, as predicted by AlphaFold2. Our findings underscore the predictive value of this novel biomarker, the IGH rod-like tracer, in children with pre-B-ALL, and show its potential for improving MRD monitoring.
Cross-lineage TCR gene rearrangements occur frequently in immature B-cell malignancies, especially in pre-B-ALL (> 90% of cases) (14, 24). IGH, IGK, TRB, and TRG rearrangements were all detected in this study, and a few disease clonotypes were identified in IGK, TRB, and TRG. Focusing on the IGH clonotypes, we found that the proportion of hyperexpanded IGH clonotypes better reflected the abnormal immune cell status in children with pre-B-ALL. Because the immune repertoire of pre-B-ALL patients with multiple relapses has not been evaluated before, we focused on analyzing clonotypes in samples from multiple relapses.
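The proportion of hyperexpanded clonotypes mentioned above can be illustrated with a minimal sketch. The 1% frequency cutoff, the function name `hyperexpanded_fraction`, and the input layout (a mapping from CDR3 sequence to read count) are assumptions made for illustration; the cutoff actually used in this study is not stated in this section.

```python
def hyperexpanded_fraction(clone_counts, threshold=0.01):
    """Return the share of all reads carried by hyperexpanded clonotypes.

    clone_counts: dict mapping a clonotype identifier (e.g. a CDR3
        amino acid sequence) to its read count.
    threshold: ASSUMED frequency cutoff; a clonotype is treated as
        hyperexpanded when its frequency is strictly above this value.
    """
    total = sum(clone_counts.values())
    if total == 0:
        return 0.0  # empty repertoire: nothing is expanded
    expanded_reads = sum(
        count for count in clone_counts.values() if count / total > threshold
    )
    return expanded_reads / total
```

For a hypothetical repertoire of 1000 reads in which two clonotypes carry 940 and 50 reads and two others carry 5 reads each, the two large clonotypes exceed the 1% cutoff, so the function returns 0.99.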
In further analyzing the diagnostic and prognostic value of NGS-IGH in children with pre-B-ALL, we found that NGS-IGH (-) patients had a better prognosis, as has been seen in patients with chronic lymphocytic leukemia (25). Hyperexpanded IGH clonotypes and the diversity of IGH clonotypes were good predictors of the number of relapses and of disease risk level in children with pre-B-ALL. The ratio of IGH to TRB clonotypes was also a good predictor of disease risk level. In addition, to further elucidate the pathogenic mechanism of IGH clonotypes in the NGS-IGH (+) group, we used AlphaFold2 to predict the tertiary structures of the disease IGH clonotypes and used the root-mean-square deviation (RMSD) to compare structural similarity in the NGS-IGH (+) and NGS-IGH (-) groups. To our knowledge, only one article has evaluated the CDR3 structure in B-ALL: Zha et al. (26) found that the TRB CDR3 regions contained a conserved amino acid motif with different spatial conformations in B-ALL. We predicted high-accuracy structures of the IGH CDR3 protein with AlphaFold2 for the first time and found that deep learning approaches are valuable for advancing MRD detection. IGH CDR3 structures were more similar within the NGS-IGH (+) group than within the NGS-IGH (-) group, suggesting that these clonotypes may play the same role in the tumorigenesis of pre-B-ALL; in addition, the protein structures reflected the role of abnormal B cells in disease better than the amino acid sequences did. Therefore, the IGH rod-like tracer was defined as the class of disease IGH CDR3 coding domains with a rod-like alpha-helical structure in NGS-IGH (+) samples.
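As an illustration of the RMSD comparison described above, the following sketch computes the RMSD between two predicted structures after optimal rigid-body superposition with the Kabsch algorithm. It assumes that each structure has already been reduced to an N x 3 array of matched atom coordinates (for example, C-alpha atoms of CDR3 peptides of equal length); the function name `kabsch_rmsd` and this input layout are hypothetical, and the exact superposition procedure used in this study is not described in this section.

```python
import numpy as np


def kabsch_rmsd(P, Q):
    """RMSD between two matched coordinate sets after optimal superposition.

    P, Q: array-likes of shape (N, 3) with atoms in the same order
        (ASSUMED to be pre-matched, e.g. C-alpha atoms).
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("P and Q must both have shape (N, 3)")

    # Remove translation by centering both structures on their centroids.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)

    # Covariance matrix and its singular value decomposition.
    H = Pc.T @ Qc
    U, _, Vt = np.linalg.svd(H)

    # Correct for a possible reflection so that R is a proper rotation.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    # Rotate P onto Q and measure the remaining deviation.
    P_rot = Pc @ R.T
    return float(np.sqrt(np.mean(np.sum((P_rot - Qc) ** 2, axis=1))))
```

Under these assumptions, computing `kabsch_rmsd` for every pair of structures within a group and comparing the resulting distributions is one way to ask whether one group is structurally more homogeneous than another.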
We further evaluated the predictive value of the IGH rod-like tracer in the NGS-IGH (+) group. In patients for whom clonal evolution cannot be dynamically monitored by flow cytometry, qPCR, or NGS, the IGH rod-like tracer in samples collected at different time points helped to distinguish abnormal B cells derived from the same source, and this was validated in published IGH sequencing data from pre-treatment pre-B-ALL patients. Thus, this study provides a new method for tracking pre-B-ALL patients at the molecular level. IGH CDR3 structures can be predicted and identified as IGH rod-like tracers on the IGH rod-like tracer website (https://ai-lab.bjrz.org.cn/IR).
Furthermore, we found that the 3rd relapse sample had the highest relative abundance and the lowest diversity. V-J gene usage and related motifs in the 3rd relapse samples differed from those in the 1st and 2nd relapse samples. High-frequency use of the IGHJ6 and IGHV4-34 genes has been considered a sign of self-reactivity (27–32), and the 9G4 antibody encoded by IGHV4-34 can bind to self-antigens and cause damage (31). GMDVW had similar motifs to ZNF24, TBX20, DBD1, and TBX20. YGMDV had similar motifs to ZNF410, ZNF410 DBD, and GMEB2 DBD3, indicating that IGH motifs may promote tumorigenesis (33, 34). In our pre-B-ALL samples, the biased V-J gene usage and the presence of a shared antigen-binding motif in the CDR3 region of tumor B cells indicated a significant antigen-driven process. Precursor B-cell receptor (pre-BCR) signaling and spleen tyrosine kinase (SYK) have recently been introduced as therapeutic targets for pre-B-ALL (35). These results suggest that stimulation of the BCR by self or foreign antigens may play a key role in disease progression and in the initiation of malignant B-cell transformation.
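The repertoire-level quantities compared across relapses above (diversity and V-J gene usage) can be sketched as follows. The Shannon index is used here only as one common diversity measure and is an assumption, since this section does not name the index used; the function names and the `(v_gene, j_gene, count)` input layout are likewise hypothetical.

```python
import math
from collections import Counter


def shannon_diversity(clone_counts):
    """Shannon entropy (natural log) of a list of clonotype read counts.

    Lower values indicate a repertoire dominated by few clonotypes.
    ASSUMPTION: the Shannon index stands in for the unspecified measure.
    """
    total = sum(clone_counts)
    if total <= 0:
        return 0.0
    return -sum(
        (c / total) * math.log(c / total) for c in clone_counts if c > 0
    )


def vj_usage(rearrangements):
    """Relative usage of each V-J gene pair.

    rearrangements: iterable of (v_gene, j_gene, count) tuples.
    Returns a dict mapping (v_gene, j_gene) to its fraction of all reads.
    """
    usage = Counter()
    for v_gene, j_gene, count in rearrangements:
        usage[(v_gene, j_gene)] += count
    total = sum(usage.values())
    if total == 0:
        return {}
    return {pair: n / total for pair, n in usage.items()}
```

Comparing `shannon_diversity` and the `vj_usage` tables between samples from successive relapses would be one simple way to quantify the loss of diversity and the shift in gene usage described in the text.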
In summary, we performed, for the first time, immune repertoire sequencing on bone marrow samples from children with pre-B-ALL with multiple recurrences, followed by AlphaFold2-based structural similarity analyses of the IGH CDR3 coding region. The IGH rod-like tracer was then extracted from large-scale quantitative immune repertoire sequencing data and showed strong predictive value as a novel biomarker for the dynamic monitoring of MRD in children with pre-B-ALL. High-accuracy protein structure prediction by AlphaFold2 should greatly facilitate clinical interpretation of large-scale NGS data, and AlphaFold2 itself may prove a powerful tool for identifying robust biomarkers for clinical diagnosis and monitoring in future studies.