Rare Nonsynonymous Variants in Lipid Metabolism Related Genes in Coronary Artery Disease

Background: Coronary artery disease (CAD) is a complex disease that is influenced by environmental and genetic factors. In this study, we aimed to investigate the relationship between rare nonsynonymous variants in lipid metabolism-related genes and CAD in the Chinese Han population. Methods: A total of 252 subjects were recruited, including 120 CAD cases and 132 healthy controls. Rare variants were detected via NGS-based targeted sequencing. Pathogenicity prediction was performed with SIFT and PolyPhen-2. Results: The present study identified 33 rare nonsynonymous variants, including two novel variants located in ANGPTL4 (G47E) and SCARB1 (L233F), respectively. Association analysis showed that CAD patients carried more nonsynonymous variants in all mutation sets, but the differences did not reach statistical significance. Conclusions: Targeted sequencing is a powerful tool for uncovering rare variants in coronary artery disease. The clinical relevance of rare variants in CAD etiology needs to be investigated in larger samples.


Introduction
Coronary artery disease (CAD) is a common chronic inflammatory disease and remains the leading cause of death worldwide [1]. An estimated 700,000 people die from CAD in China every year [2]. In addition to conventional risk factors such as hypertension, dyslipidemia, diabetes, obesity and smoking, genetic factors also play an important role in CAD pathogenesis. To reduce the occurrence of CAD, it is important to find biomarkers associated with CAD etiology [3].
Genome-wide association studies (GWAS) have identified many variants that are associated with CAD [4][5][6][7]. GWAS focus on common variants, and these susceptibility variants are typically located within intronic or intergenic regions and have relatively small effects. Rare variants that might also be associated with CAD are generally missed.
Rare variants are genetic variations with a frequency of less than 1%, and sometimes much lower [8,9]. Nonsynonymous variants are predicted to change the amino acid sequence of a protein. They include missense variants (a single amino acid substitution), nonsense variants (which create a premature stop codon) and frameshift variants (which alter the reading frame of the coding sequence). Nonsynonymous variants can affect protein function and contribute significantly to the etiology of complex diseases [10]. However, such variants are likely to be under strong negative selection and may be missed by genome-wide association mapping aimed at identifying genes in complex disease [11].
Owing to increased throughput and decreased costs, next-generation sequencing (NGS) technology has been widely used in human disease research. Extensive research using exome sequencing has identified rare variants responsible for Mendelian diseases. Targeted sequencing is a rapid and cost-effective way to detect known and novel variants in selected sets of genes or genomic regions, and has proven to be an efficient technique for screening variants in complex disease [12,13]. It has been shown that targeted sequencing of a subset of genes generates results of identical quality to Sanger sequencing [14].
Lipid disorder is one of the most important risk factors for CAD. In this study, we conducted targeted sequencing of 12 genes involved in lipoprotein metabolism to investigate the relationship between rare variants and CAD. We aimed to find nonsynonymous variants that confer susceptibility to CAD in the Chinese Han population and thereby shed light on CAD pathogenesis.
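The two definitions above (frequency below 1%, and a consequence that changes the protein sequence) can be combined into a simple filter. The following is a minimal sketch and not the pipeline used in this study: the `Variant` record, its field names and the toy entries are all hypothetical and serve only to illustrate the selection rule.

```python
from dataclasses import dataclass

# Consequence classes treated as nonsynonymous, as listed in the text.
NONSYNONYMOUS = {"missense", "nonsense", "frameshift"}
RARE_THRESHOLD = 0.01  # "frequency less than 1%"


@dataclass
class Variant:
    # Hypothetical record layout, for illustration only.
    name: str
    consequence: str
    allele_frequency: float


def is_rare_nonsynonymous(v: Variant, threshold: float = RARE_THRESHOLD) -> bool:
    """Return True if the variant is both nonsynonymous and rare."""
    return v.consequence in NONSYNONYMOUS and v.allele_frequency < threshold


# Toy records (invented values, not data from the study).
toy_variants = [
    Variant("toy_1", "missense", 0.002),     # rare and nonsynonymous: kept
    Variant("toy_2", "synonymous", 0.0005),  # rare but synonymous: dropped
    Variant("toy_3", "nonsense", 0.03),      # nonsynonymous but common: dropped
    Variant("toy_4", "frameshift", 0.0001),  # rare and nonsynonymous: kept
]

selected = [v.name for v in toy_variants if is_rare_nonsynonymous(v)]
print(selected)
```

In practice the frequency would come from a reference population database and the consequence from an annotation tool; neither choice is specified here.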

Study population
A total of 120 CAD patients and 132 non-CAD controls were recruited from Renji Hospital between 2016 and 2020. Individuals with incomplete information were excluded. All participants were unrelated Chinese Han individuals. This study was approved by the Medical Ethics Committee of Renji Hospital and complied with the principles set forth in the Declaration of Helsinki. CAD cases were defined as follows: at least one of the major coronary artery segments (right coronary artery, left circumflex, or left anterior descending artery) with organic stenosis of 50% or more on coronary angiography. All unaffected controls were determined to be free of CAD. A 5 ml peripheral blood sample was collected from each subject.

Targeted sequencing
Genomic DNA was extracted using the TianGen DNA extraction kit (TianGen Ltd, Beijing, China) following the standard protocol. DNA concentration and quality were measured using a NanoDrop spectrophotometer (Thermo Scientific, USA). All purified DNA was stored at -80℃. A total of 50 ng of DNA was used for PCR amplification. PCR primers were designed using Oligo 6.0 and synthesized by Shanghai Free Biotechnology Co., Ltd (Shanghai, China). Coding regions of the target genes were captured by multiplex PCR, followed by adaptor addition. The final panel consisted of 203 amplicons with an average size of 250 bp. Paired-end sequencing (2 × 150 bp) was performed on Illumina NovaSeq sequencing instruments (Novogene, Beijing, China).
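As a rough sanity check on the panel design described above, the following sketch computes the total amplicon footprint and the average overlap between the two reads of a pair. It assumes that every amplicon has the stated average length and ignores primer and adaptor bases. The variable names are illustrative and not part of the original protocol.

```python
# Back-of-the-envelope figures for the amplicon panel (illustrative sketch).
N_AMPLICONS = 203        # amplicons in the final panel (from the text)
MEAN_AMPLICON_BP = 250   # average amplicon size in bp (from the text)
READ_LENGTH_BP = 150     # paired-end read length, 2 x 150 (from the text)

# Upper bound on bases covered: overlapping amplicons would reduce this.
panel_footprint_bp = N_AMPLICONS * MEAN_AMPLICON_BP

# Assumption: the two reads start at opposite ends of an average-sized amplicon.
mean_read_overlap_bp = 2 * READ_LENGTH_BP - MEAN_AMPLICON_BP

print(panel_footprint_bp, mean_read_overlap_bp)
```

Under these assumptions the panel spans at most about 51 kb, and an average amplicon is read end to end with the two reads overlapping in the middle.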
Rare nonsynonymous variants with a frequency of <1% were selected for further analysis. Mutation pathogenicity analysis was performed using SIFT and PolyPhen-2. Twelve variants were predicted to be deleterious by SIFT, and 18 were predicted to be possibly damaging or probably damaging by PolyPhen-2. Ten variants were predicted to be tolerated by SIFT and benign by PolyPhen-2, indicating that these variants result in a truncated protein but are not necessarily pathogenic. Eight variants were predicted to be deleterious/damaging by both programs, and 22 variants were predicted to be damaging or deleterious by at least one program.
a: annotated as "deleterious" or "damaging" by SIFT
b: annotated as "possibly damaging" or "probably damaging" by PolyPhen-2
c: annotated as "deleterious" or "damaging" by SIFT, or as "possibly damaging" or "probably damaging" by PolyPhen-2
d: annotated as "deleterious" or "damaging" by SIFT, and as "possibly damaging" or "probably damaging" by PolyPhen-2
We investigated the relationship between rare nonsynonymous variants and the risk of CAD (Table 3). Twenty-three subjects carried variants predicted to be damaging or deleterious by at least one program: 13 in the patient group and 10 in the control group. Eight subjects carried variants predicted to be damaging and deleterious by both programs: 6 in the patient group and 2 in the control group. Thirteen subjects carried variants predicted to be deleterious by SIFT: 9 in the patient group and 4 in the control group. Eighteen subjects carried variants predicted to be damaging by PolyPhen-2: 10 in the patient group and 8 in the control group. The patient group showed a higher carrier frequency in all mutation sets. However, none of the differences reached statistical significance.
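The text does not state which statistical test was used for the association analysis. As one plausible way to examine the carrier counts above, the sketch below applies a two-sided Fisher's exact test to each mutation set, using only the Python standard library. The group sizes (120 patients, 132 controls) are taken from the study population; the function name and the dictionary layout are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x carriers falling in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are at least as unlikely as the observed one
    # (small relative tolerance guards against floating-point ties).
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


N_CASES, N_CONTROLS = 120, 132  # group sizes from the study population

# Carriers per mutation set as (patients, controls), from the text.
carriers = {
    "SIFT deleterious": (9, 4),
    "PolyPhen-2 damaging": (10, 8),
    "either program": (13, 10),
    "both programs": (6, 2),
}

p_values = {
    name: fisher_exact_two_sided(case, N_CASES - case, ctrl, N_CONTROLS - ctrl)
    for name, (case, ctrl) in carriers.items()
}

for name, p in p_values.items():
    print(f"{name}: p = {p:.3f}")
```

Fisher's exact test is chosen here because several cells are small (for example, 2 control carriers), which makes a chi-square approximation less reliable. This is a choice made for the illustration rather than a description of the authors' method.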

Discussion
Exome sequencing has proven to be a powerful tool for uncovering novel causal mutations in Mendelian diseases [17]. Recently, large-scale efforts have applied exome sequencing to the study of rare variants in complex disease [18]. Through targeted sequencing of the coding region of SCARB1, Zanoni et al. showed that P376L carriers have significantly higher HDL-C levels and an increased risk of coronary heart disease [19]. Four rare variants in the coding region of apolipoprotein C3 (APOC3) that disrupt APOC3 function were found to be associated with lower plasma triglyceride levels and a reduced risk of coronary heart disease [20]. Dewey et al. showed that carriers of inactivating mutations in ANGPTL4 had lower triglyceride levels and a lower risk of coronary artery disease than noncarriers [21]. Compound heterozygosity for two distinct nonsense mutations in ANGPTL3 resulted in decreased plasma LDL cholesterol levels and familial combined hypolipidemia [22]. Rare alleles at LDLR and APOA5 confer risk for early-onset myocardial infarction [23]. Rare nonsynonymous variants can facilitate the exploration of disease pathogenesis and provide supportive evidence for putative drug targets for novel therapies.
NGS-based targeted sequencing of known disease genes and important candidate genes can identify not only disease-causing variants but also variants of uncertain significance, which can be challenging for genetic counselling. In the present study, 33 rare nonsynonymous variants were identified using targeted sequencing. Two novel variants were identified in the CAD cohort. One was a missense variant in ANGPTL4 (G47E), predicted to be deleterious by SIFT and probably damaging by PolyPhen-2. The other was a missense variant in SCARB1 (L233F), predicted to be tolerated by SIFT and possibly damaging by PolyPhen-2. Three variants (rs200727689 in LDLR, rs730882109 in LDLR and rs768795323 in PCSK9) have been reported as pathogenic or likely pathogenic in the ClinVar database; all of them were identified in the CAD cohort and are linked to familial hypercholesterolemia in ClinVar.
Rare variants have a population prevalence of <1% and may not be statistically associated with diseases of interest even in large samples. It has been predicted that 27-29% of nonsynonymous mutations are neutral or nearly neutral, 30-42% are moderately deleterious, and the remainder are highly deleterious or lethal [11]. Our results show that CAD patients carried more heterozygous nonsynonymous variants in all mutation sets, but the differences did not reach statistical significance. Larger sample sizes and functional studies are needed to clarify the impact of these variants.
In summary, we described a novel targeted NGS panel covering 12 lipid metabolism genes. This panel identifies rare variants with high accuracy. This study suggests that targeted sequencing approaches can be used to discover rare mutations that contribute to CAD risk, and may lead to the discovery of novel pharmaceutical targets for disease prevention and treatment. However, this study has several limitations. (1) The assay was designed to detect single nucleotide variants and small indels, so larger indels and structural rearrangements will be missed. (2) Whether these variants alter CAD risk remains unclear because the study is statistically underpowered. A larger sample size is needed to increase statistical power. (3) It is difficult to determine the pathogenicity of novel variants by computational methods alone. Functional testing may help to clarify the impact of these variants. (4) Because CAD is an aging-related disease, subjects in the control group might develop CAD in the future, leading to misclassification bias [24]. Further studies are needed to validate these findings and to explore these variants as potential pathogenic mutations for CAD.

Conclusions
A total of 252 subjects were recruited in this study, including 120 CAD cases and 132 healthy controls. The present study identified 33 rare nonsynonymous variants, including two novel variants located in ANGPTL4 (G47E) and SCARB1 (L233F), respectively. Association analysis showed that CAD patients carried more nonsynonymous variants in all mutation sets, but the differences did not reach statistical significance.
Targeted sequencing is a powerful tool for uncovering rare variants in coronary artery disease. The clinical relevance of rare variants in CAD etiology needs to be investigated in larger samples.

Declarations
Conflict of interest statement: The authors declare that they have no competing interests.