Impact of Genetic Ancestry on T-cell Acute Lymphoblastic Leukemia Outcomes

Abstract
The influence of genetic ancestry on biology, survival outcomes, and risk stratification in T-cell Acute Lymphoblastic Leukemia (T-ALL) has not been explored. Genetic ancestry was genomically derived from DNA-based single nucleotide polymorphisms in children and young adults with T-ALL treated on Children’s Oncology Group trial AALL0434. We determined associations of genetic ancestry, leukemia genomics, and survival outcomes; co-primary outcomes were genomic subtype, pathway alteration, overall survival (OS), and event-free survival (EFS). Among 1309 patients, T-ALL molecular subtypes varied significantly by genetic ancestry, including increased frequency of genomically defined ETP-like, MLLT10, and BCL11B-activated subtypes in patients of African ancestry. In multivariable Cox models adjusting for high-risk subtype and pathways, patients of Admixed American ancestry had superior 5-year EFS/OS compared with European; EFS/OS for patients of African and European ancestry were similar. The prognostic value of five commonly altered T-ALL genes varied by ancestry, including NOTCH1, which was associated with superior OS for patients of European and Admixed American ancestry but non-prognostic among patients of African ancestry. Furthermore, a published five-gene risk classifier accurately risk stratified patients of European ancestry, but misclassified patients of African ancestry. We developed a penalized Cox model which successfully risk stratified patients across ancestries. Overall, 80% of patients had a genomic alteration in at least one gene with differential prognostic impact by genetic ancestry. T-ALL genomics and prognostic associations of genomic alterations vary by genetic ancestry. These data demonstrate the importance of incorporating genetic ancestry into analyses of tumor biology for risk classification algorithms.


Introduction
Despite improvements in survival for many pediatric cancers, disparities persist such that children of Black race and Hispanic ethnicity continue to have inferior outcomes as compared to non-Hispanic White children. 1,2 Race and ethnicity are social constructs associated with exposure to social determinants of health (SDOH) such as poverty, structural racism, 3 and unmet basic resource needs, in part explaining disparate outcomes. 4 Although there is significant heterogeneity within socially constructed racial and ethnic groups, self-identified race and ethnicity may correlate with an individual's ancestral origins. 5,6 Genetic ancestry can be defined by stretches of the genome inherited from familial predecessors and determined by comparing hundreds of thousands of single nucleotide polymorphisms (SNPs) with the frequency of SNPs from global reference populations. 5,7,8 Abundant literature demonstrates associations between genetic ancestry and cancer biology. 5,9,10 In pediatric B-cell acute lymphoblastic leukemia (B-ALL), for example, individuals with Admixed American ancestry, who frequently identify as having Hispanic ethnicity, are more likely to harbor the high-risk CRLF2-rearranged Ph-like subtype due to higher prevalence of the germline GATA3 risk allele. 11 T-cell ALL (T-ALL) is more biologically heterogeneous in terms of activating drivers and mutational landscape as compared with B-ALL. 12 Whether tumor biology and its prognostic relevance are impacted by genetic ancestry in T-ALL has not been previously explored.
Increasingly, cancer treatment utilizes differences in disease biology to risk stratify patients. Tumor biology may be used to allocate patients at high risk of relapse to novel or intensified therapy and to de-escalate treatment for lower-risk patients. 13 To date, genomic risk stratification in T-ALL has been rudimentary, and whether recently identified risk biomarkers have uniform prognostic value for all patients remains unclear. Identifying the impact of genetic ancestry on biology, survival outcomes, and prognostic utility of previously identified biomarkers is imperative for optimal and equitable risk allocation and treatment.
Analysis of the largest cohort of T-ALL comprehensively sequenced to date, comprising over 1300 children, adolescents, and young adults (CAYA), recently identified 15 unique T-ALL subtypes, with nearly 60% of clonal leukemic drivers occurring due to alterations in non-coding regions. 14 Using this multiomic AALL0434 dataset, we examined the influence of genetic ancestry on T-ALL genomics and survival outcomes and evaluated whether the prognostic utility of specific biomarkers varied by genetic ancestry.

Results
The analytic cohort included 1309 patients (median age 9 years, interquartile range 5-13, range 1-29) with T-ALL treated on the Children's Oncology Group (COG) clinical trial AALL0434. The distribution of categorical genomically defined ancestry was 58% European, 15% Admixed American, 11% African, 3% East Asian, 2% South Asian, and 12% Other. Clinical features including central nervous system (CNS) status, diagnostic white blood cell count (WBC), and end-of-induction measurable residual disease (MRD) were similar across ancestries (Table 1). Patients of African and Admixed American ancestry were more likely to have Medicaid-only insurance as compared with patients of European ancestry (49.7%, 46.9%, and 20.3%, respectively). Patients of African ancestry were more likely to harbor ETP and near-ETP immunophenotype T-ALL as compared with patients of European ancestry (11.9% vs. 7.4%).

Subtype
Overall, T-ALL genomic subtypes varied across genetic ancestries (P<0.001, Figure 1A). The newly identified high-risk ETP-like subtype, as well as the MLLT10 and BCL11B-activated subtypes, accounted for a greater proportion of T-ALL cases among CAYA of African ancestry as compared with European ancestry (ETP-like OR 2.49, 95% CI 1.45-4.28, Table 2; 29.3% vs 14.5%, Table S1). Within the ETP-like subtype, there was also variation in subtype-classifying drivers by ancestry (Extended Data Fig 1). Patients of South Asian and East Asian ancestry had notable differences in predominant subtypes compared with European ancestry, but interpretation was limited given small numbers. Continuous ancestry analyses mirrored associations in the primary findings (Extended Data Table 1). There were also differences in driver gene alterations, with TLX3 and TLX1 being more common drivers in patients of European ancestry, and MLLT10 and MED12 more common in patients of African ancestry (Figure 1B).

Genomic biomarkers by genetic ancestry
NOTCH pathway
NOTCH is the most commonly dysregulated pathway in T-ALL, with alterations identified in this cohort in NOTCH1 (n=903, 78%), FBXW7 (n=285, 22%), and ZMIZ1 (n=7, 1%). Overall, patients with NOTCH pathway alterations experienced significantly superior OS/EFS as compared to those without; however, when stratified by ancestry, NOTCH alteration conferred a favorable prognosis for patients of European and Admixed American ancestry but not for patients of African ancestry (Figure 3, left panel). Furthermore, we observed differential prognostic value for NOTCH1 and FBXW7 by ancestry (Extended Data Fig 5): NOTCH1 conferred a favorable prognosis for patients of European and Admixed American ancestry but not for patients of African ancestry; FBXW7 conferred a favorable prognosis for patients of Admixed American ancestry only. In terms of frequency, patients of African ancestry were less likely to harbor alterations in the NOTCH pathway overall, with a lower frequency of NOTCH1 mutations and a similar frequency of FBXW7 mutations as compared with patients of European and Admixed American ancestry (Table S4).
We recently observed that different types of NOTCH1 alterations have differential prognostic impact: intragenic deletion and intronic SNV/indel were associated with negative outcomes, whereas indel, SNV, and stop/frameshift/splice mutations were associated with favorable outcomes. 14 Herein we observed a greater proportion of deleterious NOTCH1 alterations among patients of African ancestry as compared with European ancestry (13% vs 6%, P=0.04). Furthermore, NOTCH1 alterations that were favorable in the overall cohort and among patients of European ancestry did not confer similarly favorable EFS among patients of African ancestry (Extended Data Fig 2), in part explaining the non-prognostic value of NOTCH1 alterations in this group. Finally, in a comparison of NOTCH1 and FBXW7 coding mutation types (frameshift, missense, nonsense), we observed a greater proportion of frameshift and a smaller proportion of missense mutations in NOTCH1 among patients of African ancestry as compared with European and Admixed American ancestry (frameshift 54%, 31%, and 33%, respectively), with similar proportions of FBXW7 coding mutations (Extended Data Fig 3, Extended Data Fig 4).

Group for Research on Adult Lymphoblastic Leukemia (GRAALL) risk classifier
Studies by the GRAALL cooperative group identified a prognostic risk classifier, with mutations in NOTCH1/FBXW7 in the absence of NRAS/KRAS or PTEN mutations portending favorable outcomes and, conversely, absence of NOTCH1/FBXW7 mutations and presence of NRAS/KRAS/PTEN alterations distinguishing patients with poor outcomes. 15,16 We applied this gene classifier, NOTCH1/FBXW7 (N/F) and NRAS/KRAS/PTEN (R/P), to our cohort and examined its association with survival stratified by genetic ancestry. Among patients of European and Admixed American ancestry, the GRAALL classifier successfully differentiated survival outcomes; however, patients of African ancestry were misclassified (Figure 3, center panel). Examining all genes in this classifier separately, a difference in prognostic value by ancestry was observed for NOTCH1, PTEN, and NRAS/KRAS; for example, NRAS/KRAS alterations were significantly deleterious only for individuals of African ancestry (Extended Data Fig 5).
Among genes/regions altered in at least 5% of patients per ancestral group, we further explored prognostic value by genetic ancestry. A difference in prognostic association was observed for 5 of the 14 most commonly altered genes/regions in T-ALL: NOTCH1, PHF6, PTEN, NRAS/KRAS, and loss of chromosome 6q. In contrast, there were no differences for CDKN2A, FBXW7, DNM2, LEF1, MYB, MYC, WT1, USP7, and IL7R (Figure 4). No single genomic alteration was prognostic across all ancestral groups.

Penalized Cox regression model risk classifier
Our group recently published a novel penalized Cox regression model incorporating clinical variables (MRD, CNS status, WBC), genetic subtype, and specific genomic alterations to risk stratify patients, with resulting 5-year EFS ranging from 65% (highest risk) to 97% (lowest risk). 14 Unlike the GRAALL classifier, this model-based classifier successfully risk stratified all patients, with similar EFS ranges for each risk group across ancestries and as compared with the cohort overall (Figure 3, right panel; all patients P<0.001, European P<0.001, Admixed American P=0.01, African P=0.02).

Discussion
We observed significant differences in leukemia biology by genetic ancestry in the largest cohort of patients with T-ALL sequenced to date. The greatest differences in T-ALL subtype and pathway deregulation were observed between patients of African as compared with European ancestry. We also found that the prognostic value of individual genomic alterations, including those in the NOTCH pathway, and of a previously published five-gene risk classifier 17 varied by genetic ancestry. Specifically, in this cohort the five-gene classifier successfully stratified patients of European ancestry into high- and low-risk groups but failed to accurately risk-stratify patients of African ancestry. In contrast to our prior findings in B-ALL, 9,18 we found significantly superior survival among patients of Admixed American ancestry, and similar survival among patients of African compared to European ancestry. Taken together, these findings suggest the immediate need to incorporate analysis of genetic ancestry into risk stratification algorithms on phase three clinical trials.
This is the first study to explore the impact of genetic ancestry in T-ALL incorporating tumor genomics. In pediatric B-ALL, Admixed American ancestry is associated with greater prevalence of CRLF2 rearrangement, and African ancestry is associated with greater prevalence of TCF3::PBX1 and less hyperdiploidy. 9 In adult cancers, women of African ancestry are more likely to have triple-negative hormone receptor breast cancer as compared with European ancestry, 19 and individuals of Asian ancestry with non-small cell lung cancer are more likely to harbor pathogenic alterations in EGFR. 20 There have been two publications in acute myeloid leukemia (AML) suggesting differences in prognostic association of genetic alterations by social race, but without analysis of genetically defined ancestry. 21,22
Herein we demonstrate not only differences in the frequency of genetic alterations by genetic ancestry in a large pediatric population, but also that the prognostic value of common genetic alterations, including NOTCH1, differs by genetic ancestry. The implication of this finding is that if NOTCH1 were utilized to risk stratify patients, it might correctly risk stratify patients of European ancestry but misclassify patients of African ancestry, a finding highly relevant to clinical trial design and patient care. A similar finding has been reported in adults with solid tumors, among whom MGA alterations were associated with superior OS among patients of European ancestry and inferior OS among patients of Asian ancestry. 23 To our knowledge, this is the first report of differential biomarker prognostication between ancestral groups in a hematologic malignancy.
Survival outcomes in pediatric oncology are influenced by biologic phenomena and SDOH. Most prior literature has focused on racial and ethnic outcome disparities associated with adverse SDOH including structural racism, poverty, and access to quality health care. Race and ethnicity are social constructs without biologic basis, yet with some association to genetic ancestral origins. 6,24,25 In contrast to B-ALL, we observed superior outcomes for CAYA of Admixed American ancestry and similar outcomes for CAYA of African ancestry as compared to those of European ancestry. 9 This was not explained by a predominance of low-risk leukemia genomics. There may be complex germline variants, more prevalent among patients of Admixed American or African ancestry with T-ALL, such that chemotherapy metabolism or drug sensitivity overcomes the impacts of adverse SDOH for these patients, warranting further investigation. Concurrent evaluation of both SDOH and biologic drivers of outcome disparities is essential to inform health care delivery interventions and advance equity.
We acknowledge limitations in our study. Prior literature among CAYA with cancer suggests that patients of socially minoritized race and ethnicity are more likely to be treated off study. 26,27 Thus, our cohort may not represent the full distribution of all genomic alterations, particularly among patients with greater proportions of non-European ancestry. Although we observed differences in biology and survival patterns among patients of East Asian and South Asian ancestries, we were unable to draw conclusions due to limited sample size, warranting further investigation. Additionally, the penalized Cox regression model requires validation in global populations.
Most children in the United States with newly diagnosed cancer are treated on cooperative group clinical trials or with treatment regimens that became standard of care based on preceding trial results. Increasingly, frontline trials rely on prognostic biomarkers for risk stratification. 28 Given that patients of minoritized social race and ethnicity, who are more likely to have non-European genetic ancestry, already experience disparities in cancer outcomes, attention to the clinical implementation of genomic biomarkers in treatment decision-making is essential to promote health equity. Our results suggest that ensuring equivalent utility of genomic risk classifiers across ancestries is essential for appropriate risk stratification. Without this critical step, we risk misclassifying patients into high- or low-risk groups, potentially leading to undertreatment and increased risk of relapse, or overtreatment and unnecessary toxicity. Additionally, the validity of statistical analysis in phase three clinical trials relies on appropriate classification of children into high- and low-risk groups. Misclassification due to differential utility of genomic classifiers by ancestry has the potential to directly impact the interpretation of clinical trial results. These data suggest that risk classifiers should be examined by genetic ancestry to ensure equivalent efficacy before implementation in clinical trials.

Study Population
The participants included in this study were enrolled on the Children's Oncology Group (COG) clinical trial AALL0434 (NCT04408005) conducted from 2007 to 2014. 29 CAYA with newly diagnosed T-ALL ages 1 to 31 years old were eligible. All subjects with T-ALL were required to enroll on a companion classification study for biobanking and risk stratification, AALL03B1 (NCT00482352) or AALL08B1 (NCT01142427). These trials were approved by the National Cancer Institute Cancer Therapy Evaluation Program, the Pediatric Central Institutional Review Board (IRB), and participating center IRBs. Written informed consents for trial enrollment, specimen banking, and future research were obtained from caregivers and/or patients at the time of original COG study enrollment. [31]

Exposure: genetic ancestry
DNA-based genetic ancestry was the primary exposure of interest. Individual genetic ancestral composition was based on comparison of each patient's genotypes with allele frequencies in reference populations (1000 Genomes Project). 8 Genome-wide single nucleotide polymorphisms (SNPs) with a minor allele frequency > 1% were randomly selected, and the fraction of the genome derived from each reference population was estimated using the maximum likelihood method, with the coefficients from 5 populations assumed to sum to 100%. 32 For every patient, germline SNP genotyping data from the Infinium Omni2.5Exome BeadChip were used in ancestry estimation. For categorization of patients into ancestral groups, definitions were consistent with previously published methods: individuals were classified by composition of genetic ancestry as African (African > 70%), East Asian (East Asian > 90%), Admixed American (Amerindian > 10% and Amerindian > African), South Asian (South Asian > 70%), or European (European > 90%); patients who did not meet these thresholds were defined as Other. 5,9,33 Individuals with ancestry from indigenous populations of North America and/or South America often have a more heterogeneous composition of ancestry-specific SNPs. 34 Thus, these individuals are referred to as having "Admixed American" ancestry.
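The categorical thresholds described above can be sketched in code. This is a minimal illustration, not the study's implementation: the function name `classify_ancestry`, the dictionary keys, and the choice to evaluate the rules in the order they are listed in the text (which matters when more than one threshold is met) are all assumptions.

```python
from typing import Dict


def classify_ancestry(pct: Dict[str, float]) -> str:
    """Assign a categorical ancestry group from ancestry percentages (0-100).

    Keys are hypothetical names for the five reference-population fractions.
    Rules are checked in the order given in the text (an assumption).
    """
    african = pct.get("african", 0.0)
    east_asian = pct.get("east_asian", 0.0)
    amerindian = pct.get("amerindian", 0.0)
    south_asian = pct.get("south_asian", 0.0)
    european = pct.get("european", 0.0)

    if african > 70:
        return "African"
    if east_asian > 90:
        return "East Asian"
    if amerindian > 10 and amerindian > african:
        return "Admixed American"
    if south_asian > 70:
        return "South Asian"
    if european > 90:
        return "European"
    # No threshold met
    return "Other"
```

Under the assumed ordering, a hypothetical patient with 75% South Asian and 15% Amerindian ancestry, who meets both the Admixed American and South Asian definitions, would be assigned to Admixed American; a different tie-breaking rule would change that result.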

Outcome: subtype and pathway alteration
Recent integrated (WGS/WES/RNA-seq) genomic analysis identified 15 unique T-ALL subtypes with distinct genomic drivers and oncogene expression (Table S4). 14 The ETP-like subtype is driven by alterations in a set of genes encoding regulators of hematopoietic stem cell development and is immunophenotypically variable. 14 Coding and non-coding alterations in T-ALL can also be grouped into 17 distinct aberrant signaling pathways. 14 Subtypes, dysregulated pathways, and driver gene alterations were examined for association with genetic ancestry, and as prognostic biomarkers for survival outcomes.

Outcome: survival
Overall survival (OS) was defined as time from date of enrollment to date of death from any cause or censored at last contact. Event-free survival (EFS) was defined as time from enrollment to first event (induction failure, induction death, relapse, second malignant neoplasm, or remission death) or date of last contact. 29

Covariates
Patient characteristics examined for potential confounding included age, sex, insurance status, central nervous system (CNS) status, diagnostic white blood cell count (WBC), day 29 measurable residual disease (MRD), and trial arm. 29 Early T-cell Precursor (ETP) status by immunophenotype (distinct from ETP-like genomic subtype) was also examined as a potential confounder. ETP was defined by central evaluation of diagnostic samples from 1140 patients utilizing the definition of ETP T-ALL as CD8-negative and CD1a-negative (<5% positive), weak CD5 expression, and expression of one or more myeloid and/or stem cell markers (>25%). Near-ETP was defined by this same immunophenotype but with stronger CD5 expression. 35

Statistical analysis
Baseline characteristics were summarized by descriptive statistics. Chi-square or Fisher's exact tests were conducted for the association of categorical ancestry with subtype, driver genes, and pathway alterations. All regression models considered European ancestry as the reference group. Associations between ancestry and subtype were modeled using a multinomial regression with TAL1 DP-like as the reference group, as it was the most common. Associations between ancestry and pathway alterations were modeled using separate logistic regressions for individual pathways. The Holm test was used to correct for multiple comparisons.
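The multiple-comparison correction can be illustrated with a short sketch, assuming it refers to the Holm (Holm-Bonferroni) step-down procedure. The function name `holm_adjust` is hypothetical, and this is not the code used in the study, whose analyses were run in Stata and R.

```python
from typing import List


def holm_adjust(p_values: List[float]) -> List[float]:
    """Return Holm step-down adjusted p-values in the original input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is multiplied by (m - k + 1)
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: adjusted values never decrease along the ordering
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted
```

A comparison is then called significant at level alpha when its adjusted p-value is below alpha; this is equivalent to the sequential rejection form of the procedure.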
Associations between biologic subtype and genetic ancestry as a continuous variable were assessed with a two-step procedure. First, we assessed whether there was an overall ancestry-related difference in T-ALL subtype using an overall likelihood ratio test (a chi-square test) comparing a multinomial regression model without any ancestry variable to a model including all 4 non-European ancestries as continuous variables (European ancestry left out as the reference group). If there was an overall association, step two then examined the association of each ancestry with subtype. For continuous ancestry analysis, we present odds ratios for every 25% increase in a non-European ancestry with European as the reference group, and with TAL1 DP-like as the reference subtype given it was the most common among all ancestral groups. 9 Thus, an odds ratio associated with a 25% increase in African ancestry refers to the increase or decrease in odds of a given T-ALL subtype when African ancestry increases with a concurrent decrease in European ancestry and all other ancestries held constant. The same process was followed for the association of continuous genetic ancestry with pathway alterations, except that a separate logistic regression model was constructed for each individual pathway. The Holm test was used to correct for multiple comparisons.
OS/EFS were censored at 5 years; few documented events occurred subsequently. Kaplan-Meier curves were plotted by ancestry and compared using log-rank tests. Univariable and multivariable Cox proportional hazards regression models were used to estimate hazard ratios (HR). Covariates associated with exposure (P<0.2 or absolute difference of ≥10%) and outcome (P<0.2 or HR ≥1.5 or ≤0.67) were included in the multivariable model; age and sex were included regardless of statistical association.
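The covariate inclusion rule for the multivariable Cox model can be written as a simple screen. The sketch below is illustrative only: the `Screen` record, its field names, and the function `select_covariates` are hypothetical, and the screening statistics are assumed to have been computed beforehand.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Covariates included regardless of statistical association (per the text)
FORCED_COVARIATES = {"age", "sex"}


@dataclass
class Screen:
    name: str
    p_exposure: float     # p-value for association with ancestry (exposure)
    abs_diff_pct: float   # absolute difference across ancestry groups, in percentage points
    p_outcome: float      # p-value for association with survival (outcome)
    hazard_ratio: float   # univariable hazard ratio for the outcome


def select_covariates(screens: Iterable[Screen]) -> List[str]:
    """Keep covariates that meet both the exposure and the outcome criterion."""
    selected = []
    for s in screens:
        exposure_ok = s.p_exposure < 0.2 or s.abs_diff_pct >= 10
        outcome_ok = (
            s.p_outcome < 0.2 or s.hazard_ratio >= 1.5 or s.hazard_ratio <= 0.67
        )
        if s.name in FORCED_COVARIATES or (exposure_ok and outcome_ok):
            selected.append(s.name)
    return selected
```

A covariate must satisfy both criteria to enter the model, which is the usual working definition of a potential confounder; age and sex bypass the screen.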
In the post hoc analysis, we examined the prognostic utility of pathway alterations, genetic variants, and two previously reported risk classifiers to evaluate the combined effect of T-ALL biology and genetic ancestry on survival outcomes. Although many studies have proposed genomic classifiers for risk stratification in T-ALL, few classifiers have been applied across several cohorts. An exception is a five-gene risk classifier, originally identified by Trinquand et al from The Group of Research on Adult Acute Lymphoblastic Leukemia (GRAALL-2003 and GRAALL-2005) 15 and subsequently applied to two European pediatric cohorts (FRALLE2000T, UKALL2003). 17,36 This classifier categorized individuals based on NOTCH1, FBXW7, NRAS/KRAS, and PTEN alterations. Therefore, we selected this classifier to examine utility by genetic ancestry. We also examined a recently published penalized Cox regression model with clinical and genomic variables. 14 We then applied these risk classifiers and stratified by genetic ancestry to evaluate efficacy across different ancestral groups.
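The five-gene rule can be expressed as a short function. This is a sketch under stated assumptions: the text describes the favorable group (N/F altered without R/P alteration) and the unfavorable extreme; assigning every other combination to the high-risk group is an assumption made here, and the function name and labels are hypothetical.

```python
def graall_risk_group(
    notch1: bool, fbxw7: bool, nras: bool, kras: bool, pten: bool
) -> str:
    """Classify a patient from the alteration status of the five genes.

    Returns "low" when NOTCH1 and/or FBXW7 is altered and none of
    NRAS, KRAS, or PTEN is altered; returns "high" otherwise (assumption).
    """
    nf_altered = notch1 or fbxw7          # N/F component
    rp_altered = nras or kras or pten     # R/P component
    return "low" if (nf_altered and not rp_altered) else "high"
```

To evaluate the classifier by ancestry, the resulting label would be compared against survival within each ancestral group separately, as described above.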
Analyses used Stata BE 17 and R, version 4.0.4 (R Foundation for Statistical Computing).

Declarations
Competing interest statement
D.T.T. received research funding from BEAM Therapeutics and NeoImmune Tech and serves on advisory boards for BEAM Therapeutics, Janssen, Servier, Sobi, and Jazz. D.T.T. has multiple patents pending on CAR-T. C.G.M. serves on the scientific advisory board of and received honoraria from Illumina, and received research funding from Pfizer, equity from Amgen, and royalties from Cyrus. E.A.R. received research funding from Pfizer and serves on a DSMB for BMS.

Figure 3. Event-free survival by NOTCH pathway alteration and risk classifiers by genetic ancestry.

Figures

Figure 1

Figure 2

Table 1. Demographics and Clinical Characteristics by Genetic Ancestry
Arm A=Capizzi methotrexate, Arm B=Capizzi methotrexate + nelarabine, Arm C=High-dose methotrexate, Arm D=High-dose methotrexate + nelarabine, Standard induction (not randomized); CNS denotes central nervous system, WBC denotes white blood cell, MRD denotes bone marrow minimal residual disease, ETP denotes early T-cell precursor.

Table 2. Association of Genetic Ancestry with Biologic Subtype (TAL1 DP-like as reference subtype) and Pathway Alterations
a One multinomial model for subtype; separate logistic regression model for each pathway. Five subtypes and two pathways are not presented because of unstable estimates due to small numbers. NA indicates that the model did not converge due to small numbers.
b All South Asian patients have this pathway alteration.
c Remained significant after adjustment for multiple comparisons for pathway analysis.