The present study comprised a single case of an aborted Iranian male fetus from a 26-year-old mother. The parents were first-degree cousins. The family history was negative for genetic or metabolic disease. The case presented at 18 weeks and 2 days of gestation with absent fetal movements and multiple abnormal ultrasonographic signs.
The disease was first diagnosed clinically on the basis of a markedly small, bell-shaped thorax with a protuberant abdomen, significant shortening of all limb segments despite moderately normal hands and feet, and a flat midface with a small nose and anteverted nares (Figure 1a). The diagnosis of FBCG1 was then established by radiography. Radiographically, the ribs were typically short and wide, with metaphyseal cupping at both ends. The long bones were short and had broad metaphyseal ends, which gave them a dumb-bell shape (Figure 1b).
Genomic DNA was extracted from fetal tissue using the Omega EZNA® Tissue DNA Kit (Omega Bio-Tek, USA) according to the manufacturer’s protocols. DNA concentration was measured with a NanoDrop spectrophotometer (85 ng/ml). WES was performed by Macrogen Europe (Amsterdam, The Netherlands) using the SureSelect Human All Exon V7 target enrichment system (Agilent Technologies, Inc., Santa Clara, CA, USA), followed by paired-end high-throughput sequencing with 151-bp reads on an Illumina NovaSeq 6000 (Illumina Inc., San Diego, CA, USA). Of the 633,000 variants in the exome analysis, 75,309 single nucleotide polymorphisms (SNPs) and 10,068 indels remained after removal of low-quality variants and variants outside the boundaries of the capture kit (Table 1).
Data obtained from WES were first mapped to the Homo sapiens reference genome (UCSC hg19) using the Burrows-Wheeler Aligner (BWA 0.7.15) (7), with a mapping efficiency of 99.9% for each paired-end read.
Variant calling of indels and SNPs was carried out using the GenAP pipeline (Cimorgh Medical IT Solutions, Tehran, Iran), which employs the Genome Analysis Toolkit (version v18.104.22.168) HaplotypeCaller pipeline and the Picard tool (8). For functional annotation and genetic filtering, the GenAP automated variant annotation, classification, and prioritization were used (9). Furthermore, all identified recurrent mutations were confirmed using the Integrative Genomics Viewer (IGV) (version 2.8.13) (10). In the next step, synonymous and non-exonic variants were removed, and common SNPs were filtered out using a minor allele frequency cut-off of 0.02 in the single nucleotide polymorphism database (dbSNP), the GenAP in-house database, the 1000 Genomes Project, gnomAD, and ExAC (11). Finally, the filtered variants were sorted by Combined Annotation Dependent Depletion (CADD)-PHRED score (cut-off = 15) and zygosity. Two homozygous variants were found in COL11A1. Only the COL11A1 mutation that causes FBCG1 was compatible with the clinical findings of the proband. The genomic information for this mutation is as follows: transcript: NM_080629.2, nucleotide change: c.3440G>A, amino acid change: p.G1147E, and chromosome position: g.103404625.
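The filtering and sorting steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the GenAP implementation: the `Variant` fields, the consequence labels, and the choice to keep variants whose highest population frequency is below the 0.02 cut-off (or unreported) are assumptions made for the example, and the records in the usage note are invented.

```python
from dataclasses import dataclass
from typing import List, Optional

MAF_CUTOFF = 0.02    # minor allele frequency threshold stated in the text
CADD_CUTOFF = 15.0   # CADD-PHRED cut-off stated in the text
# Assumed consequence labels; a real annotation vocabulary may differ.
EXCLUDED_CONSEQUENCES = {"synonymous", "intronic", "intergenic"}


@dataclass
class Variant:
    gene: str
    consequence: str
    max_maf: Optional[float]  # highest MAF across databases, None if unreported
    cadd_phred: float
    zygosity: str             # "hom" or "het"


def prioritize(variants: List[Variant]) -> List[Variant]:
    """Drop synonymous/non-exonic and common variants, apply the CADD cut-off, then sort."""
    kept = [
        v for v in variants
        if v.consequence not in EXCLUDED_CONSEQUENCES
        and (v.max_maf is None or v.max_maf < MAF_CUTOFF)
        and v.cadd_phred >= CADD_CUTOFF
    ]
    # Homozygous calls first, then descending CADD-PHRED score.
    return sorted(kept, key=lambda v: (v.zygosity != "hom", -v.cadd_phred))
```

As a usage example with invented records, `prioritize([Variant("GENE_A", "missense", None, 27.0, "hom"), Variant("GENE_B", "synonymous", 0.001, 20.0, "hom")])` keeps only the `GENE_A` record, because the synonymous variant is excluded before scoring.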
For further genetic evaluation, the rare variants called with high confidence were then considered as disease-causing candidates. To evaluate the pathogenicity of the novel variants, several computational algorithms, including PolyPhen-2, MutationTaster, and SIFT, were used (12). Variants in well-known phenotype-causing genes and in the candidate genes were selected and evaluated with priority, based on known physiological, biological, and/or functional connections to the phenotype. Variants were interpreted according to the American College of Medical Genetics and Genomics (ACMG) guidelines (13). Among the prioritized variants, those predicted to be damaging and disease-causing were regarded as the most promising candidates. The results indicate a missense G-to-A variant in exon 45 of 68 of the COL11A1 gene (NM_080629.2: c.3440G>A [p.G1147E, g.103404625]), which is predicted to be disease-causing. The affected position is highly conserved at both the nucleotide and protein levels, suggesting that it has an important function. This variant is not reported in gnomAD, ExAC, or the 1000 Genomes Project and can therefore be considered a rare variant.
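The relationship between the reported coding position and the reported amino acid position can be checked with simple arithmetic. The sketch below is only an illustration: the helper names are hypothetical, the codon table is the relevant subset of the standard genetic code, and the actual third base of the reference codon is not taken from the transcript. It shows that c.3440 is the second base of codon 1147 and that a G-to-A change at the second base of a glycine codon (GGN) yields glutamate when the third base is A or G.

```python
from typing import Dict, Tuple

# Subset of the standard genetic code needed for this check.
CODON_TO_AA: Dict[str, str] = {
    "GGA": "G", "GGC": "G", "GGG": "G", "GGT": "G",  # glycine
    "GAA": "E", "GAG": "E",                          # glutamate
    "GAC": "D", "GAT": "D",                          # aspartate
}


def codon_of(cdna_pos: int) -> Tuple[int, int]:
    """Return (codon number, position within codon), both 1-based."""
    return (cdna_pos - 1) // 3 + 1, (cdna_pos - 1) % 3 + 1


def second_base_g_to_a_outcomes() -> Dict[str, str]:
    """Amino acid produced by GGN -> GAN for each possible third base N."""
    return {"GG" + n: CODON_TO_AA["GA" + n] for n in "ACGT"}


if __name__ == "__main__":
    print(codon_of(3440))
    print(second_base_g_to_a_outcomes())
```

The check is consistent with the reported p.G1147E change, but it does not replace annotation against the actual NM_080629.2 sequence.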
Following the WES analysis, Sanger sequencing was performed to confirm the candidate variants found by WES and to carry out segregation analysis of these variants within the family. The results showed a G-to-A alteration at position g.103404625 (c.3440). The data show that the proband carried the A/A genotype (homozygous), while the unaffected parents were heterozygous for this mutation.
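The segregation pattern described above, a homozygous proband with two heterozygous unaffected parents, is the pattern expected under autosomal recessive inheritance. A minimal sketch of that check is given below; the function names and the `"G/A"` genotype string format are assumptions made for illustration and do not correspond to any specific tool.

```python
def alt_allele_count(genotype: str, alt: str = "A") -> int:
    """Count copies of the alternate allele in a genotype such as 'G/A'."""
    return genotype.split("/").count(alt)


def consistent_with_recessive(proband: str, mother: str, father: str,
                              alt: str = "A") -> bool:
    """True if the proband is homozygous for alt and both parents carry exactly one copy."""
    return (
        alt_allele_count(proband, alt) == 2
        and alt_allele_count(mother, alt) == 1
        and alt_allele_count(father, alt) == 1
    )


if __name__ == "__main__":
    # Genotypes as reported in the text: proband A/A, both parents heterozygous.
    print(consistent_with_recessive("A/A", "G/A", "G/A"))
```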
Finally, in silico analyses were performed to predict the effect of this SNP on the protein structure of COL11A1. The amino acid sequence of human COL11A1 (accession NP_542196) was obtained from the NCBI server (https://www.ncbi.nlm.nih.gov). Because no structure was available in the Protein Data Bank, the 3D structure of the mature COL11A1 peptide was built using the iterative threading assembly refinement (I-TASSER) server (14). Each model produced by I-TASSER is given a confidence score (c-score); higher c-scores indicate higher quality of, and greater confidence in, the predicted 3D structure. I-TASSER predicted five models, of which model 1 had the highest c-score (1.28) and was selected as the best model. The predicted 3D model was validated using the VADAR web server (15). The Ramachandran analysis by VADAR gave the following results: residues in phi-psi core: 70%, residues in phi-psi allowed: 24%, residues in phi-psi generous: 2%, and residues in phi-psi outside: 2%. These results indicate that the selected model is suitable for mutagenesis studies. The G1147E mutation corresponds to G624E in the predicted structure. UCSF Chimera software was then used to generate the structure and perform energy minimization of the models. PyMOL software was used to visualize the structures (Figure 2a), superimpose the mutant model on the native one (Figure 2b), and evaluate the surface electrostatic potential of the native and mutated proteins. Comparison of native and mutant COL11A1 showed that the G624E mutation caused an electrostatic alteration from a partial positive charge in the native protein to a negative charge in the mutant (Figure 2c and 2d).
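The two residue numbers given above (1147 in the full-length protein and 624 in the predicted structure) differ by 523 positions. The sketch below converts between the two numbering schemes, assuming that this offset is constant along the modeled region, which the text does not state explicitly. The helper names and the small charge table (approximate net side-chain charge at neutral pH) are illustrative only.

```python
FULL_LENGTH_POS = 1147   # residue number in the full-length protein (from the text)
MODEL_POS = 624          # residue number in the predicted structure (from the text)
OFFSET = FULL_LENGTH_POS - MODEL_POS  # assumed constant along the model

# Approximate net side-chain charge at neutral pH for the two residues involved.
SIDE_CHAIN_CHARGE = {"G": 0, "E": -1}


def to_model_numbering(full_length_pos: int) -> int:
    """Convert a full-length residue number to model numbering."""
    if full_length_pos <= OFFSET:
        raise ValueError("position lies before the start of the modeled region")
    return full_length_pos - OFFSET


def to_full_length_numbering(model_pos: int) -> int:
    """Convert a model residue number back to full-length numbering."""
    return model_pos + OFFSET


if __name__ == "__main__":
    print(OFFSET, to_model_numbering(1147), to_full_length_numbering(624))
    print(SIDE_CHAIN_CHARGE["E"] - SIDE_CHAIN_CHARGE["G"])  # change in side-chain charge
```

Replacing glycine with glutamate adds a negatively charged side chain, which is in line with the shift toward negative surface potential described for the mutant model.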