Five Novel Deleterious Variants in FANCA, FANCF and FANCG Identified in Pakistani Fanconi Anemia Families Using Exome Sequencing

Background: Fanconi anemia (FA), a cancer-prone inherited bone marrow failure syndrome associated with characteristic dysmorphology, is primarily caused by autosomal recessive inheritance of pathogenic germline variants in any of 22 different DNA repair genes. Pathogenic variants in FANCA are the most frequent cause, followed by FANCC and FANCG. There are limited data on the specific molecular causes of FA in different ethnic groups.
Methods: Using exome sequencing and copy number variant analysis, we evaluated 19 patients with FA from 17 families in Pakistan who were undergoing evaluation for hematopoietic cell transplantation. To accompany these efforts, we reviewed the literature and curated a list of variants reported in patients with FA from South Asia and the Middle East.
Results: The genetic causes of disease were identified in 14 families: 7 FANCA, 2 FANCC, 1 FANCF, 2 FANCG, and 2 FANCL. Homozygous and compound heterozygous variants were present in 12 and 2 families, respectively. Nine families carried variants previously reported as pathogenic, including two families with the South Asian FANCL founder variant. We also identified five novel likely deleterious variants in FANCA, FANCF, and FANCG in affected patients.
Conclusions: Our study supports the importance of determining the genomic landscape of FA in diverse populations, in order to improve understanding of FA etiology and assist in the counseling of families.


Background
Fanconi anemia (FA [MIM:227650]) is a cancer-prone inherited bone marrow failure syndrome associated with radial ray abnormalities, characteristic facies, and other medical problems (1,2). Approximately 5% of patients with FA have the VACTERL-H phenotype (Vertebral anomalies, Anal atresia, Cardiac anomalies, Tracheoesophageal fistula, Esophageal atresia, Renal structural anomalies, Limb anomalies [primarily radii and/or thumbs], and Hydrocephalus) (3). Additional FA phenotypic findings include Pigmentation of the skin, small Head, small Eyes, central Nervous system anomalies (excluding hydrocephalus), Otologic anomalies, and Short stature (PHENOS) (4). Patients with FA have exceedingly high risks of head and neck squamous cell carcinoma (HNSCC) and leukemia compared with the general population (5). FA-associated bone marrow failure (BMF) frequently requires hematopoietic cell transplantation (HCT).
There have been a limited number of reports on the genetic etiology of FA in populations from South Asia and the Middle East. Such studies have identified several novel disease-causing germline genetic variants, including the first reports of FA caused by pathogenic variants in FANCO/RAD51C or FANCE, and highlight the importance of germline genetic studies of FA in underrepresented regions. In this report, we evaluated the genetic causes of FA in 19 patients from 17 unrelated families being considered for HCT in Pakistan.

Study subjects
This project was approved by the ethical review committee of the Institute of Biomedical and Genetic Engineering (IBGE, Islamabad, Pakistan). Individuals with FA and their first-degree relatives were evaluated by their referring physicians.
Copy number variations (CNVs) were detected using VarSeq™ v2.1 (VS-CNV), which analyzes changes in WES coverage between the sample and controls (54), and the detected CNVs were visually evaluated using GenomeBrowse® (Golden Helix, Inc., Bozeman, MT) (55,56). Homozygous or heterozygous deletions were filtered based on p-values (< 0.001) and annotated using ClinVar (48). Z-scores and ratio values were assessed when validating genotypes. Suspected CNV events underwent validation using targeted whole gene sequencing as described (57). DNA from families 3-FA, 8-FA, 9-FA, 12-FA, 14-FA, and 16-FA was also sequenced using a targeted custom capture design (Roche, Inc) for next generation sequencing (NGS) that included all known FA genes and FA candidate genes, including all intronic regions and 5 kb upstream and downstream of each gene, as previously described (57). Sequence reads were aligned to human genome build 19 (GRCh37) using the Burrows-Wheeler Alignment tool, and variants were called using HaplotypeCaller and the GATK best practices pipeline for germline variants (58-60). SNVs and indels were annotated using ANNOVAR (43). CNVs were detected from the targeted NGS reads and annotated using Nexus Copy Number version 10.0 (BioDiscovery, Inc.).
Additionally, DNA from family 17-FA was sequenced using PacBio® long-range sequencing technology with custom IDT xGen® Lockdown® Probes designed to capture all intronic and exonic regions of FANCA. The manufacturer's PacBio® protocols were followed for shearing genomic DNA, end repair, ligation of linear barcoded adapters, amplification, sample pooling, and capturing using IDT xGen® Lockdown® Probes. Libraries were prepared using the SMRTbell® protocol for primer annealing, polymerase binding, and sequencing on the Sequel system. Circular consensus reads were generated using default parameters (3 passes, 0.99 accuracy) and demultiplexed according to parameters for symmetrical barcodes. Sequence reads were aligned to human genome build 19 (GRCh37) and structural variants were called using pbsv default parameters.
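The CNV filtering criterion described above (homozygous or heterozygous deletions retained when p < 0.001) can be illustrated with a minimal sketch. The record fields, state labels, and example calls below are hypothetical; they do not reflect the VarSeq output format or any patient data, and only the p-value threshold is taken from the text.

```python
from dataclasses import dataclass
from typing import List

P_VALUE_CUTOFF = 0.001  # threshold stated in the methods

# Hypothetical state labels; a real CNV caller uses its own vocabulary.
DELETION_STATES = {"het_deletion", "hom_deletion"}


@dataclass
class CnvCall:
    sample: str
    gene: str
    state: str
    p_value: float


def filter_deletions(calls: List[CnvCall],
                     cutoff: float = P_VALUE_CUTOFF) -> List[CnvCall]:
    """Keep deletion calls whose p-value is strictly below the cutoff."""
    return [c for c in calls
            if c.state in DELETION_STATES and c.p_value < cutoff]


# Made-up calls for illustration only.
example_calls = [
    CnvCall("S1", "GENE_A", "hom_deletion", 1e-6),
    CnvCall("S2", "GENE_A", "het_deletion", 0.02),
    CnvCall("S3", "GENE_B", "duplication", 1e-5),
    CnvCall("S4", "GENE_B", "het_deletion", 0.001),
]
kept = filter_deletions(example_calls)
print([c.sample for c in kept])
```

In this sketch a call with p equal to the cutoff is discarded because the comparison is strict; calls that pass would still require visual review and orthogonal validation, as described in the text.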
Variant Curation: FA in South Asia and the Middle East
A comprehensive literature review was performed in the National Library of Medicine's PubMed database using search terms combined with the names of countries in South Asia and the Middle East (e.g. "Fanconi anemia and Pakistan" or "Fanconi and India") to curate previously published FA gene variants reported in patients from the following regions: Afghanistan, Bangladesh, Egypt, India, Iran, Iraq, Israel, Jordan, Lebanon, Nepal, Oman, Pakistan, Saudi Arabia, Syria, Turkey, and Yemen. Large cohort studies and case reports which consisted only of phenotypic data and did not report patients' specific genotypes were excluded. Studies which only reported FA subtypes by complementation testing were also excluded unless further sequencing efforts revealed the specific variant(s) in the patient(s).
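As an illustration of how the search strings described above could be enumerated, the following minimal sketch pairs each disease term with each country. The two disease terms are taken from the examples in the text; the function name and the assumption that every term was combined with every country are ours, and the exact queries used in the review may have differed.

```python
from typing import List

# Countries listed in the curation methods.
REGIONS = [
    "Afghanistan", "Bangladesh", "Egypt", "India", "Iran", "Iraq",
    "Israel", "Jordan", "Lebanon", "Nepal", "Oman", "Pakistan",
    "Saudi Arabia", "Syria", "Turkey", "Yemen",
]

# Disease terms appearing in the quoted example queries.
DISEASE_TERMS = ["Fanconi anemia", "Fanconi"]


def build_queries(terms: List[str], regions: List[str]) -> List[str]:
    """Return one '<term> and <region>' string per term/region pair."""
    return [f"{term} and {region}" for region in regions for term in terms]


queries = build_queries(DISEASE_TERMS, REGIONS)
print(len(queries))
print(queries[:2])
```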

Patient characteristics
There were 19 patients with FA (16 males and 3 females) from 17 unrelated families evaluated in this study (Fig. 1). The majority of families (12/17, 70%) were from Northern or Central Punjab. Other families were from Southern Punjab, Islamabad, Khyber Pakhtunkhwa, and Azad Kashmir. The median age at FA diagnosis was 7 years (range 4-12) and, while all were evaluated for HCT, only 5 patients underwent matched sibling HCT. Eight of the 19 patients were deceased at the time of this study. The median age at death was 8.5 years (range 4-13). Pathogenic variants relevant to FA were identified in 14 families, with FANCA being the most common (7/14, 50%). Homozygous variants in FA-associated genes were identified in 12 of the 14 solved families (86%) and 2 probands had compound heterozygous variants. Physical and genetic findings are listed in Tables 1 and 2, respectively.

FANCA
Patient 1-FA presented at seven years of age with aplastic anemia which progressed to severe BMF. He had ectopic kidneys, but no other phenotypic features were reported. WES revealed a homozygous in-frame deletion in exon 38 of FANCA (c.3788_3790delTCT, p.Phe1263del, NC_000016.9:g.89807250_89807252delAGA, rs397507553, ClinVar:41003). Both parents were unaffected carriers and his sibling was wild-type. The c.3788_3790delTCT variant is the most frequently reported FANCA variant and has been observed in multiple populations throughout the world, including FA patients from Pakistan (12,19,61), with a particularly high prevalence in Spain and Brazil (12,13).
Patient 3-FA had an abnormal thumb (Fig. 3A) and café au lait spots noted at birth. He also had short stature and low gonadotrophin hormone levels. FA was diagnosed by chromosome breakage on primary lymphocytes after he presented with neutropenia that progressed to severe BMF at 5 years of age. He underwent successful HLA-matched sibling donor HCT at the age of 6 years.
We identified two deletions in FANCA (NC_000016.9:g.89871674_89880557del and NC_000016.9:g.89861527_89863726del) affecting exons 4-7 and 11, respectively. These two deletions have been previously reported in an Indian patient with FA (14). The exon 11 deletion was paternally inherited, while the deletion of exons 4-7 was maternally inherited. Validation by targeted sequencing methods determined that one unaffected sibling did not carry either deletion. Another unaffected sibling was predicted to be a carrier of the exon 4-7 deletion by VS-CNV, but there was insufficient DNA for sequencing validation.
Patient 4-FA presented with severe bone marrow failure at the age of 7 years. Hemophagocytosis was reported on his bone marrow biopsy but no other phenotypic information was available. He underwent successful HLA-matched sibling HCT. A homozygous FANCA frameshift variant in exon 1 (c.37dupC, p.Gln13Profs*24, NC_000016.9:g.89882986dupG) was identified by WES. One unaffected sibling was wild-type and the other was a carrier, but parental DNA was not available for analysis.
Patient 9-FA presented with neutropenia which progressed to severe BMF and was diagnosed with FA at 10 years of age. WES revealed, and targeted whole gene sequencing validated, a large homozygous deletion of exons 7-14 (NC_000016.9:g.89856782_89874222del). Her unaffected sister was a heterozygous carrier. This specific deletion has not been previously reported, but similar large deletions in FANCA have been reported (62).
Affected brothers 17_01-FA and 17_02-FA both presented with abnormal thumbs at birth (Fig. 3B). Small ear canals were also noted in 17_01-FA (Fig. 3C). At the ages of 6 and 10 years, respectively, they presented with severe BMF and immunodeficiency leading to an FA diagnosis. Biallelic variants in FANCA were identified by various NGS methods in both siblings (c.2749C>T, p.Arg917*, NC_000016.9:g.89831327G>A and NC_000016.9:g.89847600_89853759del). WES revealed a maternally inherited nonsense variant in exon 28 which has been previously identified in an Indian patient with FA and in other populations (rs1060501880, ClinVar:408188) (14,61). A large deletion of exons 15-17 (NC_000016.9:g.89847600_89853759del) was detected by targeted PacBio® long-range sequencing in both affected siblings and has been previously reported in other patients with FA (17). This deletion was not detected in DNA from the father's peripheral blood, but relatedness analyses confirmed paternity, with large regions of homozygosity being consistent with offspring from a consanguineous relationship between third-degree relatives. Additionally, analyses of single nucleotide polymorphisms (SNPs) in the FANCA locus provided evidence for a possible genotypic reversion in the paternal hematopoietic stem cells or paternal inheritance as a result of gonadal mosaicism. Both such occurrences have been previously reported in patients with FA (6, 63-65).
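In the FANCA variants above, the bases given in the coding (c.) and genomic (g.) descriptions are complementary, for example c.2749C>T alongside g.89831327G>A. This is expected because FANCA is transcribed from the reverse strand of the chromosome 16 reference, so coding-strand alleles appear reverse-complemented in genomic coordinates. The following minimal sketch checks this for alleles quoted in the text; the helper name is ours and the check covers only the allele strings, not the coordinates.

```python
# Translation table mapping each DNA base to its complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase or lowercase DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


# (coding-strand allele, genomic allele) pairs taken from the FANCA
# variant descriptions in the text.
allele_pairs = [
    ("C", "G"),      # c.2749C>T reference base vs g.89831327G>A
    ("T", "A"),      # c.2749C>T alternate base vs g.89831327G>A
    ("TCT", "AGA"),  # c.3788_3790delTCT vs g.89807250_89807252delAGA
    ("C", "G"),      # c.37dupC vs g.89882986dupG
]

consistent = all(reverse_complement(c) == g for c, g in allele_pairs)
print(consistent)
```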
Patient 19-FA presented at 8 years of age with neutropenia that progressed to severe BMF by age 9 years. A homozygous FANCA missense variant in exon 41 (c.4070C>A, p.Ala1357Asp, NC_000016.9:g.89805638G>T) was identified by WES. Her unaffected brother is a heterozygous carrier, but parental DNA was not available. FANCA p.Ala1357Asp is not present in gnomAD and is predicted deleterious by in silico tools (MetaSVM score = 0.915, REVEL = 0.702, CADD phred = 24.1).
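The in silico evidence for p.Ala1357Asp can be summarized by comparing each score with a cutoff. The cutoffs in the minimal sketch below (MetaSVM > 0, REVEL > 0.5, CADD phred > 20) are commonly used conventions that we assume for illustration; the text does not state which thresholds the authors applied, and the function name is hypothetical.

```python
from typing import Dict

# Assumed cutoffs, not taken from the text: scores strictly above these
# values are treated as supporting a deleterious effect.
ASSUMED_CUTOFFS: Dict[str, float] = {
    "MetaSVM": 0.0,
    "REVEL": 0.5,
    "CADD_phred": 20.0,
}


def deleterious_by_tool(scores: Dict[str, float]) -> Dict[str, bool]:
    """Map each scored tool to True if its score exceeds the assumed cutoff."""
    return {tool: scores[tool] > cutoff
            for tool, cutoff in ASSUMED_CUTOFFS.items()
            if tool in scores}


# Scores reported in the text for FANCA p.Ala1357Asp.
ala1357asp = {"MetaSVM": 0.915, "REVEL": 0.702, "CADD_phred": 24.1}

calls = deleterious_by_tool(ala1357asp)
print(calls)
print(sum(calls.values()), "of", len(calls), "tools above the assumed cutoffs")
```

Concordance across tools is supporting evidence only; classification of a variant still depends on segregation, population frequency, and functional data.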
Patient 5-FA presented with moderate aplastic anemia and progressed to severe BMF at the age of 11 years. He also had short stature and abnormal left leg growth. He had two brothers and one sister who died due to similar complications but without a diagnosis. His two surviving unaffected siblings and parents are all heterozygous carriers.
Patient 8-FA presented with moderate aplastic anemia and progressed to severe BMF by 4 years of age. Skin hyperpigmentation, bone deformities including the absence of metacarpals, thumbs, and radii, and a high arched palate were also reported. He died at the age of 4 years due to a brain hemorrhage before HLA-matched sibling HCT could be performed. One unaffected sibling is a carrier, but parental DNA was not available.

FANCF
Patient 10-FA was homozygous for a nonsense variant in FANCF (c.785T>G, p.Leu262*, NC_000011.9:g.22646572A>C, rs368067979). He was diagnosed with FA at 6 years of age when aplastic anemia progressed to severe BMF. He also had polydactyly and died from a hemorrhagic stroke shortly after his FA diagnosis. Parental DNA was not available and the sibling available for testing was not a carrier.

FANCG
Patient 18-FA had an extra digit, a small right hand, short stature, and ectopic kidneys. He was diagnosed with FA at 5 years of age, underwent HCT from his HLA-matched sister for severe BMF, and is doing well. A homozygous nonsense variant in exon 6 of FANCG was identified by WES (c.710C>G, p.Ser237*, NC_000009.11:g.35077035G>C). His sibling is wild-type at this locus. The only parent who was available for testing was heterozygous for this loss-of-function variant.
Proband 21_01-FA and his brother 21_02-FA were diagnosed with FA at the ages of 8 and 5 years, respectively. 21_01-FA had pancytopenia that progressed rapidly following his diagnosis with FA and he died due to a brain hemorrhage. Currently, 21_02-FA does not have cytopenias. The bone marrow of both FA-affected brothers was reported to have numerous rosettes with central eosinophilic material surrounded by small cells seen in a background of fibrosis. The affected brothers have a homozygous frameshift variant in exon 11 of FANCG (c.1471_1473delAAAinsG, p.Lys491Glyfs*9, NC_000009.11:g.35075283_35075285delTTTinsC, rs1018027137). One of their unaffected siblings was heterozygous for this variant. This variant has been previously reported in a heterozygous patient with FA (17).
A pairwise comparison between cases 6-FA and 7-FA was performed to assess potential relationships. A genotype comparison on approximately 7300 common SNPs between the probands and their siblings showed no indication of relatedness between families 6-FA and 7-FA. Parental sequencing data were not available. Patient 6-FA had an extra thumb and areas of skin hyperpigmentation. He presented with aplastic anemia that progressed to severe BMF at the age of 9 years. Although he was treated with androgens while awaiting an HLA-matched sibling HCT, he died at 10 years of age due to an unspecified hemorrhage. Patient 7-FA was diagnosed with FA after presenting with severe BMF at the age of 7 years. He died shortly after his diagnosis at the age of 8 years due to an unreported cause.

Gene Unknown Families
Rare heterozygous variants in FA pathway genes identified in probands 12-FA, 14-FA, and 16-FA are reported in Supplementary Table 2. These individuals had chromosome breakage testing consistent with FA. Proband 12-FA had no dysmorphology but was diagnosed with FA at the age of 11 years after presenting with BMF and underwent a successful HLA-matched sibling HCT. No rare deleterious variants were identified in the 22 FA-associated genes.
FA was diagnosed in proband 14-FA at 12 years of age and he died 1 year after diagnosis due to an unreported cause. 14-FA was a heterozygous carrier for a variant of uncertain significance (VUS) in FANCN and a likely benign FANCO variant.
Bilateral thumb malformations were noted at birth in proband 16-FA. She was diagnosed with aplastic anemia at age 9. She underwent a successful HLA-matched sibling HCT. Heterozygous VUS were present in FANCA, FANCD2, FANCI, and FANCP. The FANCP variant (c.2209C>T, p.Arg737Cys, NC_000016.9:g.3642818G>A, rs140706384) may be deleterious as it has a REVEL score of 0.449 and CADD score of 26.2, but MetaSVM predicted it to be tolerated; additional functional studies are required to determine potential pathogenicity. There were no other deleterious variants or large CNV events detected in FANCP, so this patient remains gene unknown.
The majority of reported variants occurred in FANCA and were private to their respective populations. All large deletions, SNPs, and small insertion/deletion variants reported in patients with FA in these regions can be found in Supplementary Tables 3, 4, and 5, respectively. The only large deletions reported were in FANCA, similar to our findings and consistent with others. We also identified the recently described FANCL founder mutation (c.1092G>A, p.Trp341_Lys364del) in 2 families from Pakistan (18).

Discussion
Identification of the genetic causes of rare diseases such as FA is important to verify diagnoses, improve clinical management, allow for appropriate genetic counseling, and understand the underlying pathobiology of these disorders. The genetic cause of FA was identified in 16 patients from 14 families in this study; three families remain molecularly undiagnosed. FANCA was the most commonly affected gene, with pathogenic FANCA variants present in 50% of the families. The other pathogenic variants were present in FANCC, FANCG, or FANCL (2 families each), and FANCF (1 family).
Only 6 of the 14 variants identified in our study were present in any gnomAD population and 3 of these were solely in South Asian populations (Table 2).
Homozygosity for pathogenic variants was present in 12 of the 14 families. Only two families had compound heterozygous inheritance (both in FANCA). Two families were homozygous for the FANCL founder variant (18). We were unable to evaluate consanguinity in 14 of the families in this study due to lack of parental DNA samples. However, the presence of homozygous pathogenic variants in our data is consistent with prior studies reporting an approximately 70% rate of consanguineous marriages in Pakistan (66-68). Nine of the 14 pathogenic variants identified have been previously reported (Table 2) (12,14-18).
Our study was limited to patients with FA who also had BMF severe enough to warrant HCT evaluation. Nevertheless, this investigation sheds further light on the types and frequencies of germline pathogenic variants associated with FA in the Pakistani population. The patients included in this study are likely a very small minority of FA cases in Pakistan. It is possible that in the absence of overt dysmorphology, the diagnosis of FA may be delayed or missed because chronic malnutrition and its complications may confound rare disease diagnoses (66).

Conclusion
The majority of studies performed on the genetics of FA in South Asia and the Middle East relied on targeted sequencing of FANCA; FANCC, FANCG, and FANCE were studied less frequently but are more common than other subtypes. FANCA, FANCC, and FANCG account for upwards of 85% of FA cases in those of European descent, but it is imperative to understand the genetic landscape of FA across all 22 subtypes when assessing FA in diverse populations, as the prevalence of subtypes in South Asia and the Middle East may differ from that in populations of European ancestry. While large cohort studies identifying specific genotypes are not common in these regions, several published studies describe the phenotypes of pediatric patients presenting with aplastic anemia, MDS, and/or AML in combination with other FA clinical features (i.e. VACTERL-H and PHENOS) and/or positive chromosome breakage testing. Without genetic testing of patients with suspected FA, the genetic heterogeneity of FA in these populations could remain under-reported. Large genotype-phenotype studies of patients with FA around the world are required to better understand the genetic variation in diverse populations in order to uncover disease etiology, improve diagnostics and patient management, and provide genetic counseling.