Patients and clinical assessments
Four patients with sporadic spastic paraplegia were recruited. The clinical assessments were approved by the Expert Committee (equivalent to an Institutional Review Board) of Tangdu Hospital, Fourth Military Medical University (China), and written informed consent was obtained from all patients and their family members. The patients and their relatives were all Chinese.
With the exception of a positive family history and prominent cognitive impairment among the presenting complaints, the characteristics of our patients were consistent with the clinical and radiological criteria for the complicated form of HSP-TCC reported in Japan[8] and Italy[9]. The diagnosis was established by at least three experienced neurologists and radiologists. Because the family history was negative, all known possible causes of spastic paraplegia were carefully excluded. All patients were assessed with the Spastic Paraplegia Rating Scale (SPRS)[10], brain and spinal cord MRI, and electromyography (EMG), including myoelectricity and nerve conduction velocity.
Next Generation Sequencing
Blood samples were collected for genetic analysis from the 4 patients and all their relatives with informed consent. Genomic DNA was extracted from peripheral blood leukocytes using a standard protocol and fragmented to 150–200 bp by sonication. The DNA fragments were then processed by end repair, A-tailing and adaptor ligation, followed by a 4-cycle pre-capture PCR amplification, and enriched with a custom-designed panel capturing the coding exons of 39 genes associated with spastic paraplegia, including SPG11. Paired-end sequencing (150 bp) was performed on the Illumina HiSeq X Ten platform to provide a mean sequence coverage of more than 100×, with more than 95% of the target bases having at least 20× coverage.
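The two coverage targets stated above (mean depth above 100×, and more than 95% of target bases at 20× or more) can be checked directly from per-base depths. The following is a minimal sketch in Python, assuming the depths over the target regions have already been extracted into a list; the function names and the input format are illustrative and are not part of the pipeline described here.

```python
from typing import Sequence, Tuple


def coverage_summary(depths: Sequence[int], min_depth: int = 20) -> Tuple[float, float]:
    """Return (mean depth, fraction of bases with depth >= min_depth)."""
    if not depths:
        raise ValueError("depths must not be empty")
    mean_depth = sum(depths) / len(depths)
    fraction_covered = sum(1 for d in depths if d >= min_depth) / len(depths)
    return mean_depth, fraction_covered


def meets_coverage_targets(
    depths: Sequence[int],
    mean_threshold: float = 100.0,   # "mean coverage of more than 100x"
    min_depth: int = 20,             # "at least 20x coverage"
    fraction_threshold: float = 0.95,  # "more than 95% of the target bases"
) -> bool:
    """Check both coverage targets on a list of per-base depths."""
    mean_depth, fraction_covered = coverage_summary(depths, min_depth)
    return mean_depth > mean_threshold and fraction_covered > fraction_threshold


if __name__ == "__main__":
    # Hypothetical depths: 96 well-covered bases and 4 poorly covered ones.
    example = [150] * 96 + [10] * 4
    print(coverage_summary(example), meets_coverage_targets(example))
```

Both conditions are evaluated on the same set of target bases, so a sample with a high mean depth but uneven capture can still fail the second criterion.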
Raw data were processed by the Illumina pipeline (version 1.3.4) for image analysis, error estimation, base calling and generation of the primary sequence data. For quality control, Cutadapt (https://pypi.python.org/pypi/cutadapt) and FastQC (www.bioinformatics.babraham.ac.uk/projects/fastqc/) were used to remove 3′-/5′- adapters and low-quality reads, respectively. The clean reads were mapped to the human reference genome (UCSC hg19) with BWA (version 0.7.10, http://bio-bwa.sourceforge.net)[11]. Duplicate sequence reads were removed by Picard (version 1.85; http://picard.sourceforge.net), and GATK (version 3.1, https://software.broadinstitute.org/gatk/)[12] was used to detect variants. Variants were annotated with ANNOVAR (version 2015Dec14, http://www.openbioinformatics.org/annovar/)[13], including functional implication (gene region, functional effect, mRNA GenBank accession number, amino acid change, cytoband, etc.) and allele frequency in dbSNP138, 1000 Genomes (Phase3 - Variant Frequencies 5b) and ExAC (exac.broadinstitute.org/)[14], with reference to transcript NM_025137. Damaging missense mutations were predicted by SIFT (sift.bii.a-star.edu.sg/) and PolyPhen-2 (genetics.bwh.harvard.edu/pph2/). Variants were interpreted according to the standards recommended by the American College of Medical Genetics and Genomics (ACMG)[15] and categorized as pathogenic, likely pathogenic, uncertain significance (VUS), likely benign or benign.
Sanger sequencing was performed to validate the putative pathogenic variants, allowing segregation analyses where possible. Genetic information from healthy Chinese individuals in the local Chinese Millionome Database (CMDB) served as healthy controls.
Structural And Functional Analysis
The SPG11 protein sequence was obtained from the UniProt database (https://www.uniprot.org/uniprot/Q96JI7). The Conserved Domain Database (CDD) version 3.18[16] was used to detect conserved structural domains in SPG11 via RPS-BLAST with position-specific score matrices (PSSMs); the Expect Value threshold was set to 0.01.
PolyPhen-2 was used with default settings to predict the possible structural or functional impact of the detected amino acid substitutions, using physical and evolutionary comparative algorithms. Predictions were made against a precomputed database comprising ~150 million missense SNPs detected in all exons of the UCSC human genome (hg19)[17].
The orthologous genes of SPG11 were detected via BLAST, and the evolutionary tree was drawn with the gene orthology/paralogy prediction method implemented in the Ensembl database (http://asia.ensembl.org). Multiple sequence alignment was performed using MUSCLE 3.8[18]. Green bars show areas of conserved peptides in the sequence, and white areas are gaps in the alignment. The multiple sequence alignment was used to draw the sequence logo with the WebLogo software[19] (http://weblogo.berkeley.edu/). The sequence logo consists of stacks of symbols for the corresponding amino acids, and the height of each stack indicates the conservation of amino acids at that position. Chemical properties of amino acids were used to define the color system: polar amino acids (G,S,T,Y,C,Q,N) are green, basic (K,R,H) blue, acidic (D,E) red and hydrophobic (A,V,L,I,P,W,F,M) amino acids are black.
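To make the logo conventions concrete, the sketch below encodes the color scheme listed above and computes a stack height for one alignment column. It assumes the stack height is the information content in bits, log2(20) minus the Shannon entropy of the column, which is the usual definition for protein sequence logos; the small-sample correction is omitted here, and the function and variable names are illustrative only.

```python
import math
from collections import Counter

# Color scheme as described in the text: residue letter -> color.
COLOR_BY_RESIDUE = {}
for residues, color in (
    ("GSTYCQN", "green"),   # polar
    ("KRH", "blue"),        # basic
    ("DE", "red"),          # acidic
    ("AVLIPWFM", "black"),  # hydrophobic
):
    for residue in residues:
        COLOR_BY_RESIDUE[residue] = color


def column_information(column: str, alphabet_size: int = 20) -> float:
    """Information content (bits) of one alignment column, ignoring gaps.

    Assumption: no small-sample correction is applied.
    """
    residues = [c for c in column.upper() if c != "-"]
    if not residues:
        return 0.0
    n = len(residues)
    counts = Counter(residues)
    entropy = -sum((k / n) * math.log2(k / n) for k in counts.values())
    return math.log2(alphabet_size) - entropy


if __name__ == "__main__":
    # Hypothetical columns: fully conserved vs. split between two residues.
    for col in ("LLLL", "LLII", "L-KD"):
        print(col, round(column_information(col), 3))
```

A fully conserved column reaches the maximum of log2(20) bits, and each additional residue observed at a position lowers the stack.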
De novo 3D structure modeling was performed using the I-TASSER algorithm[20] for both wild-type and mutant SPG11. I-TASSER identified the first 10 possible structural templates using the meta-server threading approach LOMETS[21], based on the highest significance Z-score of the threading alignments, and used the SPICKER program to select the final simulation model based on pairwise structure similarity using RMSD (TM-score). The confidence of each model is quantitatively measured by the C-score. The first model, with the best C-score, was selected for further analysis.
The simulated 3D structures of wild-type and mutant SPG11 were aligned using SuperPose version 1.0[22]. Protein superpositions were calculated using a quaternion approach. RasMol version 2.7[23] was used to visualize the wild-type and mutant structures. The relative position of the mutation site was determined from the structure alignments.
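The two similarity measures named above can be illustrated on a pair of structures that are already superposed and matched residue by residue. The sketch below is not the I-TASSER or SPICKER implementation; it only assumes the standard definitions, RMSD as the root mean square of the pairwise distances and TM-score as the mean of 1/(1 + (d_i/d0)^2) with d0 = 1.24(L − 15)^(1/3) − 1.8 for a chain of length L. Coordinates are taken to be one point per residue (for example Cα atoms), which is an assumption of this example.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def _squared_distance(p: Point, q: Point) -> float:
    return sum((a - b) ** 2 for a, b in zip(p, q))


def rmsd(model: Sequence[Point], reference: Sequence[Point]) -> float:
    """Root-mean-square deviation between paired, already superposed points."""
    if len(model) != len(reference) or not model:
        raise ValueError("structures must be non-empty and of equal length")
    total = sum(_squared_distance(p, q) for p, q in zip(model, reference))
    return math.sqrt(total / len(model))


def tm_score(model: Sequence[Point], reference: Sequence[Point]) -> float:
    """TM-score of paired, already superposed points, normalized by chain length."""
    n = len(reference)
    if len(model) != n:
        raise ValueError("structures must be of equal length")
    if n <= 15:
        raise ValueError("d0 is undefined for chains of 15 residues or fewer")
    d0 = 1.24 * (n - 15) ** (1.0 / 3.0) - 1.8
    if d0 <= 0:
        raise ValueError("chain too short for the d0 formula used in this sketch")
    return sum(
        1.0 / (1.0 + _squared_distance(p, q) / d0 ** 2)
        for p, q in zip(model, reference)
    ) / n


if __name__ == "__main__":
    # Hypothetical 30-residue chain and a copy displaced by 3 Angstrom.
    ref = [(float(i), 0.0, 0.0) for i in range(30)]
    shifted = [(x, y + 3.0, z) for x, y, z in ref]
    print(rmsd(shifted, ref), tm_score(shifted, ref))
```

Unlike RMSD, the TM-score is bounded between 0 and 1 and down-weights large local deviations, so it is less sensitive to a few poorly modeled residues.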
Neuropsychological Evaluation
All patients were assessed with the Neuropsychiatric Inventory (NPI), the Mini-Mental State Examination (MMSE) and the Montreal Cognitive Assessment (MoCA). The NPI and MMSE were administered first, followed by the MoCA on the 7th hospital day to avoid habituation effects. A cutoff of ≥ 27 on the MMSE was chosen to indicate normal cognitive function, and the accepted cutoff of < 26 on the MoCA was taken to indicate cognitive impairment[24, 25]. A cutoff of ≥ 26 on the ADL was chosen to indicate functional disability.
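The three cutoffs can be applied as simple threshold rules. The sketch below encodes them exactly as stated in this section; the function name and the returned keys are illustrative, and no further clinical interpretation is implied.

```python
from typing import Dict


def interpret_scores(mmse: int, moca: int, adl: int) -> Dict[str, bool]:
    """Apply the cutoffs stated in the text to one patient's total scores."""
    return {
        "mmse_normal": mmse >= 27,    # MMSE >= 27: normal cognitive function
        "moca_impaired": moca < 26,   # MoCA < 26: cognitive impairment
        "adl_disability": adl >= 26,  # ADL >= 26: functional disability
    }


if __name__ == "__main__":
    # Hypothetical patient scores, for illustration only.
    print(interpret_scores(mmse=28, moca=24, adl=20))
```

Because the MMSE and MoCA thresholds are applied independently, a patient can score in the normal range on the MMSE and still be flagged as impaired on the MoCA.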
The MoCA is a 30-point test administered in 10 min, consisting of seven subtests. Visuoexecutive functions are assessed using a clock-drawing task (3 points), a three-dimensional cube copy (1 point) and the Trail Making B task (1 point). Naming is assessed using a three-item confrontation naming task with low-familiarity animals (lion, camel, rhinoceros; 3 points). Attention is evaluated using a sustained attention task (target detection using tapping; 1 point), a serial subtraction task (3 points) and digits forward and backward (1 point each). Language is assessed using repetition of two syntactically complex sentences (2 points) and a phonemic fluency task (1 point). Abstraction is assessed using a two-item verbal abstraction task (2 points). The short-term memory recall task (5 points) involves two learning trials of five nouns and delayed recall after approximately 5 min. Finally, orientation to time and place is evaluated (6 points)[25].
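As a check on the arithmetic, the point values listed above add up to the 30-point maximum: 5 (visuoexecutive) + 3 (naming) + 6 (attention) + 3 (language) + 2 (abstraction) + 5 (recall) + 6 (orientation). A minimal scoring sketch follows; the dictionary keys and the function name are illustrative labels for the seven subtests and are not taken from the MoCA form itself.

```python
from typing import Dict

# Maximum points per subtest, as listed in the text.
MOCA_MAX_POINTS: Dict[str, int] = {
    "visuoexecutive": 5,   # clock drawing 3 + cube copy 1 + Trail Making B 1
    "naming": 3,           # lion, camel, rhinoceros
    "attention": 6,        # tapping 1 + serial subtraction 3 + digits 1 + 1
    "language": 3,         # sentence repetition 2 + phonemic fluency 1
    "abstraction": 2,
    "delayed_recall": 5,
    "orientation": 6,
}


def moca_total(scores: Dict[str, int]) -> int:
    """Sum subtest scores after checking each is within its allowed range."""
    total = 0
    for subtest, max_points in MOCA_MAX_POINTS.items():
        points = scores[subtest]
        if not 0 <= points <= max_points:
            raise ValueError(f"{subtest} score {points} outside 0..{max_points}")
        total += points
    return total


if __name__ == "__main__":
    # Hypothetical patient, for illustration only.
    patient = {
        "visuoexecutive": 3, "naming": 3, "attention": 5, "language": 2,
        "abstraction": 1, "delayed_recall": 2, "orientation": 6,
    }
    print(moca_total(patient), "of", sum(MOCA_MAX_POINTS.values()))
```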