Gravida Serum POSTN and PAPPA as the Potential Biomarkers for Fetal Congenital Heart Disease Based on Quantitative and Qualitative Proteomics


Background: Fetal congenital heart disease is the most common congenital defect worldwide. Some cases are missed at diagnosis, and disease biomarkers are lacking. We aimed to identify objective biomarkers for noninvasive prenatal diagnosis of fetal congenital heart disease, to reduce missed diagnoses, and to explore the underlying mechanism. Methods: This study used data-independent acquisition and parallel reaction monitoring to explore potential protein biomarkers that co-exist in gravida serum (GS) and fetal amniotic fluid. Logistic regression and ROC curve analysis were used to establish a diagnostic model based on the potential biomarkers, and molecular biology experiments were performed to validate the proteomics results and explore the mechanism. Results: Proteomics and bioinformatics results showed that 12 proteins co-exist in gravida serum and amniotic fluid. We identified POSTN and PAPPA in GS as candidate biomarkers and established a diagnostic model for congenital heart disease with a sensitivity of 100%, a specificity of 95%, and an AUC of 0.968. In addition, the results of ELISA, IHC, and RT-PCR were consistent with those of proteomics. Moreover, POSTN may be involved in fetal congenital heart disease through the TGF-beta signaling pathway. Conclusion: This is the first report that POSTN and PAPPA in GS are related to fetal congenital heart disease, which may contribute to developing a novel noninvasive prenatal method to diagnose fetal congenital heart disease and reduce congenital disabilities.


Introduction
Congenital heart disease (CHD) is the most common congenital defect worldwide [1,2,3] and accounts for one-third of all major congenital anomalies [4]. Ventricular septal and atrial septal defects are the most common forms of CHD, with an incidence of about 5.29/1000, accounting for 19.6% of all CHD [2].
Prenatal diagnosis of CHD mainly depends on echocardiography, which is easily influenced by gestational week, fetal position, amniotic fluid (AF), and equipment, all of which can lead to missed diagnosis (20% to 30%). Moreover, fetal CHD is usually diagnosed by ultrasound only in the second trimester or later. Therefore, it is necessary to reduce missed diagnoses and to seek methods of early diagnosis based on objective biomarkers.
AF is directly involved in fetal circulation, and its composition can accurately reflect the development of the fetus. However, amniocentesis to obtain AF for diagnosing fetal diseases is invasive and carries some risk for pregnant women and fetuses. Maternal blood, in contrast, intersects with fetal circulation through the placenta and umbilical cord, and maternal peripheral blood collection is safe and easy. Furthermore, proteins are the executors of life activities. We therefore sought protein biomarkers that co-exist in both AF and gravida serum (GS) to aid prenatal diagnosis of fetal CHD.
Proteomics is a crucial method for discovering disease biomarkers. Data-independent acquisition (DIA) collects data from a single sample and enables all peptides in a given m/z window to be fragmented and analyzed, allowing complete recording and highly reproducible quantification of all MS2 scans [5].
Parallel reaction monitoring (PRM) monitors all product ions of a target peptide and filters out the other proteins and peptides in the sample, which enhances the specificity and sensitivity of quantification and allows one or several specific proteins to be validated.

Study design and inclusion criteria
In our research, DIA was used to screen differentially expressed proteins in a training set of 30 samples (GS and AF from 9 cases in the CHD group and 6 cases in the control group), and PRM was used to verify these proteins in another 28 samples (GS and AF from 9 cases in the CHD group and 5 cases in the control group). Furthermore, the potential biomarker proteins were measured by ELISA in another 104 samples (GS and AF from 26 cases in the CHD group and 26 cases in the control group). In addition, we used immunohistochemistry (IHC) (4 cases in the CHD group and 5 cases in the control group) and RT-PCR (5 cases each in the CHD group and the control group) on fetal myocardial tissue to further verify the expression differences of these potential biomarkers and to look for the signaling pathway involved.
Inclusion criteria for the CHD group: (1) fetal CHD was diagnosed by echocardiography; (2) the pregnant women conceived naturally and were between 18 and 45 years old; (3) the couples were not close relatives.
Inclusion criteria for the control group: (1) criteria (2) and (3) above were met; (2) the only indication for prenatal diagnosis was advanced maternal age (≥35 years), a false-positive NIPT/Down syndrome screening result, or NT ≥ 0.25 cm, and the prenatal diagnosis results were normal.
Exclusion criteria: (1) fetus with extracardiac malformation; (2) pregnant women with rheumatic heart disease, myocarditis, other non-structural heart diseases, hepatitis, abnormal renal or liver function, diabetes, or other acute or chronic infectious diseases.
Sample collection and preservation: (1) AF (2 ml) was taken from pregnant women before amniocentesis, and the corresponding maternal peripheral blood (2 ml) was taken at the same time. AF supernatant and GS were collected after centrifugation (4 °C, 1000 rpm, 10 min) and then stored at -80 °C. (2) Fetal heart tissue: heart tissue (2 to 4 mm³) was taken from fetuses after induced labor. One part of the myocardial tissue was fixed in 4% formaldehyde for 48 hours before paraffin embedding, while the other was frozen at -80 °C for RT-PCR.

LC-MS/MS
High-abundance proteins were removed from GS (Thermo Fisher, No. 85165, kit column), and the total protein solution of AF was obtained. Equal amounts of protein from all samples were enzymatically digested and desalted. The peptides were redissolved in mobile phase A and then mixed evenly. An Agilent 1100 HPLC system was used for separation with the mobile phase at pH 10. The peptide mixture was loaded onto an Agilent Zorbax Extend-C18 capture column (2.1 × 150 mm, C18, 5 µm). The flow rate was 2%B (mobile phase A = 2% ACN; B = 90% ACN). Ammonia was used to adjust mobile phases A and B to pH 10. Fractions were collected from 8 to 50 minutes, one tube per minute in turn, and collection continued in this order until the end of the gradient. A total of 10 components were obtained. After collection, the samples were vacuum freeze-dried. Before mass spectrometry detection, each sample and the iRT standard (Biognosys, ThermoFisher) were mixed in the ratio of 1:10, and the coefficient of variation of iRT was less than 10%.
DDA mode database construction: LC-MS/MS analyses were performed on a Q Exactive HF mass spectrometer (Thermo, USA). The peptide mixture was loaded onto a capillary C18 capture column. The analytical column was a C18 reversed-phase column (ChromXP Eksigent). The flow rate was 300 nL/min, and the linear gradient was 90 min. Full mass spectrometry scans were acquired in the mass range of 350-1650 m/z with a mass resolution of 120,000 and an AGC target value of 3E6. All MS/MS spectra were collected using data-dependent higher-energy collisional dissociation (HCD) in positive ion mode, and the collision energy was set to 27. MS/MS resolution was set to 30,000, automatic gain control was set to 2E5, and the maximum injection time was 80 ms. The dynamic exclusion time was set to 40 s.

Spectrum matching and database building
The instrument signal is transformed into peptide and protein sequence information by matching the mass spectrum output against the theoretical spectra generated from the FASTA library. The spectral library is then established by combining the sequence information, peptide retention time, and fragment ion information to facilitate the subsequent DIA analysis. Finally, the original LC-MS/MS files are imported into Spectronaut Pulsar software for matching and library building. Database: UniProt-reviewed-Homo sapiens (Human)-20200817. The specific parameters of DDA and DIA are shown in supplement S4.

PRM
All samples were pre-scanned with the same amount of mix to correct the retention time of peptides. Based on the results of qualitative analysis, the identified target peptides were screened, and the trusted peptides were retained. PRM mode detection: the flow rate was 300 nL/min, and the linear gradient was 90 min. MS/MS resolution was set to 30,000 with an AGC target value of 1E5. The isolation window was 1.2 m/z, and the maximum injection time was 100 ms. The trusted peptides suitable for PRM analysis were imported into the mass spectrometry software Xcalibur. After three rounds of PRM detection, the original PRM files were analyzed with Skyline 3.7.0 software, and the target proteins and target peptides were quantified.

ELISA
POSTN (A102870-96T, Shanghai Fusheng Industrial Co., Ltd, China) and PAPPA (HM11110, Bioswamp, China) in GS and AF were measured using commercially available ELISA kits. The assays were performed according to the manufacturers' instructions.

Immunohistochemistry (IHC)
IHC was performed on paraffin-embedded tissue sections. After being dewaxed, hydrated, subjected to antigen retrieval with EDTA buffer, and blocked with 5% BSA, tissues were incubated overnight at 4 °C with primary antibodies to detect POSTN (ab219057, Abcam, USA). After incubation with HRP Goat Anti-Rabbit IgG (SAB4591, Bioswamp, China), slides were visualized with DAB substrate (Beyotime, China) and photographed with the Aperio ePathology Scanner (Leica, Germany). Protein expression was analyzed with ImageJ software.

RNA isolation, cDNA synthesis, and RT-PCR
Total RNA was isolated with Invitrogen™ TRIzol™ Reagent (Thermo Fisher Scientific, USA), and cDNA was synthesized from the total RNA using PrimeScript™ RT Master Mix (Takara, Japan). RT-PCR was performed on the Bio-Rad CFX96 (Bio-Rad Laboratories, USA) using SYBR Premix Ex Taq™ (Takara, Japan). The primer sequences are given in supplement S4. The relative expression of mRNA was calculated by the 2^(-ΔΔCt) method.
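The 2^(-ΔΔCt) calculation can be illustrated with a minimal Python sketch. The function name, the reference (housekeeping) gene, and all Ct values below are hypothetical and are not taken from this study; the sketch only shows the arithmetic of the method.

```python
def relative_expression(ct_target_case, ct_ref_case, ct_target_ctrl, ct_ref_ctrl):
    """Relative mRNA expression (case vs control) by the 2^(-ddCt) method.

    All inputs are cycle-threshold (Ct) values. The reference gene is an
    assumption of this sketch; the source does not name it here.
    """
    # Normalize the target gene to the reference gene within each group.
    delta_ct_case = ct_target_case - ct_ref_case
    delta_ct_ctrl = ct_target_ctrl - ct_ref_ctrl
    # Compare the normalized case value with the normalized control value.
    delta_delta_ct = delta_ct_case - delta_ct_ctrl
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target amplifies two cycles later in the case
# sample than in the control sample, with an unchanged reference gene.
fold = relative_expression(
    ct_target_case=26.0, ct_ref_case=18.0,
    ct_target_ctrl=24.0, ct_ref_ctrl=18.0,
)
print(fold)  # 0.25, i.e. a four-fold lower relative expression
```

A value below 1 corresponds to lower relative expression in the case sample than in the control sample.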

Statistical analysis
Data analysis and ROC curve analysis were performed with SPSS 22.0 (SPSS Inc, Chicago, IL). Continuous data are presented as means ± SD. An independent-samples t-test was used when the assumptions of normal distribution and homogeneity of variance were met; otherwise, the Mann-Whitney U test was used.
Fisher's exact test was used to compare the distribution of GO categories or KEGG pathways between the target protein set and the overall protein set, and thus to perform enrichment analysis on the GO categories or KEGG pathways of the target protein set. P < 0.05 was considered statistically significant.
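The enrichment test described above can be sketched as follows. This is a minimal illustration that assumes a one-sided (over-representation) Fisher's exact test, which is equivalent to a hypergeometric upper tail; the source does not state whether a one-sided or two-sided test was used, and the function name and counts are hypothetical.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact p-value for over-representation of a term.

    k: proteins in the target set annotated with the term
    n: size of the target set
    K: proteins in the overall (background) set annotated with the term
    N: size of the overall set
    Returns P(X >= k) under the hypergeometric distribution.
    """
    upper = min(n, K)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible configurations contribute nothing to the sum.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


# Hypothetical counts: 2 of 3 target proteins carry a term that annotates
# 4 of 10 background proteins.
p = fisher_enrichment_p(k=2, n=3, K=4, N=10)
print(round(p, 4))  # 0.3333
```

With real GO or KEGG annotations, the same function would be applied once per category, and categories with P < 0.05 would be reported as enriched.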

The basic data
There was no significant difference between the two groups in age, NT, or NIPT/Down syndrome screening results in the DIA, PRM, and ELISA sets. In the ELISA set, gestational age in the CHD group was significantly greater than that in the control group (P < 0.05), because some control group samples were obtained by amniocentesis before the 4D ultrasound examination (at about 20 to 24 weeks), while the CHD samples were often collected after that time.

Candidate differential proteins from DIA
Trusted protein and differential protein expression analysis
A total of 754 proteins and 6945 peptides were identified in serum, including 719 trusted proteins. In AF, 2007 proteins and 13101 peptides were identified, including 1876 trusted proteins. Differential proteins were screened from the trusted proteins using a fold-change threshold of 1.2 and a p-value < 0.05 (t-test) between the two groups. In serum, 115 differential proteins (72 up-regulated and 43 down-regulated) were identified, and 442 differential proteins (237 up-regulated and 205 down-regulated) were found in AF.
Among them, we focused on 27 differential proteins (Figure 1-A) that co-exist in serum and AF, of which 19 (16 up-regulated and 3 down-regulated) showed a consistent expression trend in both sample types. GO and KEGG enrichment analysis of these 19 proteins was performed according to the mean of the two FC values. Most proteins were enriched in biological adhesion and regulation, developmental process, extracellular matrix (or extracellular region), catalytic activity, and enzyme regulator activity (Figure 1-C and 1-D).
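The screening logic described above (a fold-change threshold of 1.2 with p < 0.05, followed by selection of proteins with a consistent trend in serum and AF) can be sketched in Python. The symmetric down-regulation cutoff of 1/1.2 is an assumption of this sketch, since the text gives only the 1.2 threshold; the protein names, fold changes, and p-values are made up for illustration.

```python
def classify_protein(fold_change, p_value, fc_cutoff=1.2, alpha=0.05):
    """Label a protein 'up', 'down', or 'unchanged' (CHD vs control)."""
    if p_value >= alpha:
        return "unchanged"
    if fold_change >= fc_cutoff:
        return "up"
    # Assumption: down-regulation uses the reciprocal cutoff (1 / 1.2).
    if fold_change <= 1.0 / fc_cutoff:
        return "down"
    return "unchanged"


def consistent_shared(serum, af):
    """Proteins that are differential in both sample types with the same trend.

    serum, af: dicts mapping protein name -> (fold_change, p_value).
    """
    shared = {}
    for name in serum.keys() & af.keys():
        trend_serum = classify_protein(*serum[name])
        trend_af = classify_protein(*af[name])
        if trend_serum != "unchanged" and trend_serum == trend_af:
            shared[name] = trend_serum
    return shared


# Hypothetical values, not data from this study.
serum = {"protein_A": (1.5, 0.01), "protein_B": (0.6, 0.02),
         "protein_C": (1.3, 0.20), "protein_D": (1.4, 0.01)}
af = {"protein_A": (1.3, 0.03), "protein_B": (0.7, 0.01),
      "protein_D": (0.5, 0.01), "protein_E": (2.0, 0.001)}

result = consistent_shared(serum, af)
print(sorted(result.items()))  # [('protein_A', 'up'), ('protein_B', 'down')]
```

In this toy input, protein_D is differential in both sample types but in opposite directions, so it is excluded, which mirrors how 27 shared proteins reduce to 19 with a consistent trend.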

PRM validation of the protein expression
We selected 36 differential proteins (18 that co-exist in serum and AF, 13 found only in AF, and 5 found only in serum) for PRM verification based on the DIA results. Of these, 23 differential proteins were verified (Figure 2-A).
Among the verified proteins, 12 co-existed in serum and AF, including PAPPA, which was up-regulated, and POSTN, which was down-regulated (P < 0.05) (Figure 2). These protein expression trends were consistent with the DIA results, which indicated that our proteomic data were reliable.

ELISA validation and ROC analysis
We focused on PAPPA and POSTN according to the results of DIA and PRM and the cardiovascular literature. To evaluate the expression level and diagnostic efficiency of PAPPA and POSTN in prenatal diagnosis of fetal CHD, 104 samples (GS and AF) were further verified by ELISA, and multivariate logistic regression and ROC curve analysis were performed (Figure 3). Compared with the control group, the level of PAPPA in GS and AF of the CHD group was significantly higher (GS: 0.835 ± 0.029 ng/ml vs 0.754 ± 0.033 ng/ml, P < 0.001; AF: 0.811 ± 0.017 ng/ml vs 0.721 ± 0.014 ng/ml, P < 0.001). The level of POSTN in GS and AF of the CHD group was lower than that in the control group (GS: 10.738 ± 8.799 ng/ml vs 17.033 ± 14.057 ng/ml, P = 0.23; AF: 5.556 ± 1.487 ng/ml vs 7.237 ± 2.194 ng/ml, P = 0.002), although the difference was significant only in AF samples. From a noninvasive point of view, we performed ROC curve analysis of PAPPA and POSTN in GS instead of AF. For PAPPA in GS, the sensitivity was 90%, the specificity was 95%, and the area under the curve (AUC) was 0.965 (P < 0.001, 95% CI: 0.913, 1.00). For POSTN in GS, the sensitivity was 80%, the specificity was 55%, and the AUC was 0.634 (P = 0.148, 95% CI: 0.904, 1.00). However, when PAPPA was combined with POSTN in GS, the sensitivity was 100%, the specificity was 95%, and the AUC was 0.968 (P < 0.001, 95% CI: 0.904, 1.00). The diagnostic model combining POSTN with PAPPA in GS was therefore the most suitable for prenatal diagnosis of fetal CHD (purple dotted line in Figure 3-B).
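How sensitivity, specificity, and AUC are derived from a diagnostic model's output can be shown with a short Python sketch. The scores below stand in for predicted probabilities from a fitted logistic regression model; the fitting step is not shown, and the scores, labels, and cutoff are hypothetical rather than values from this study.

```python
def sensitivity_specificity(scores, labels, cutoff):
    """Sensitivity and specificity when score >= cutoff is called positive.

    labels: 1 for CHD, 0 for control.
    """
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores, labels):
    """AUC as the probability that a random case outscores a random control.

    Ties count as one half (the Mann-Whitney formulation of the ROC area).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical predicted probabilities for 4 cases and 4 controls.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

sens, spec = sensitivity_specificity(scores, labels, cutoff=0.5)
area = auc(scores, labels)
print(sens, spec, area)  # 0.75 0.75 0.9375
```

Sensitivity and specificity depend on the chosen cutoff, whereas the AUC summarizes discrimination across all possible cutoffs.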

IHC and RT-PCR
IHC and RT-PCR were performed to analyze the differential expression of POSTN protein and of the mRNA of TGFβ signaling pathway-related molecules between the CHD and control groups. IHC results showed that the expression of POSTN protein in the CHD group was significantly lower than that in the control group (P < 0.05) (Figure 4-A), which is consistent with the expression trend of POSTN in AF and GS. RT-PCR results showed that the relative expression of POSTN mRNA, TGFβ1 mRNA, Smad1 mRNA, Smad2 mRNA, and Smad3 mRNA in the CHD group was significantly lower than that in the control group (P < 0.05) (Figure 4-B). The primers used in RT-PCR are shown in supplement 4.

Discussion
This is the first study to seek potential protein biomarkers of fetal CHD that co-exist in GS and AF by combining DIA with PRM. Compared with the control group, DIA, PRM, and ELISA results showed that POSTN in serum and AF of the CHD group was down-regulated, while PAPPA was significantly up-regulated. Similar results for POSTN were also obtained from IHC and RT-PCR using fetal myocardial tissue. In addition, PPI results showed an interaction between POSTN and PAPPA (Figure 1-E). Studies have also shown that POSTN and PAPPA are related to cardiovascular development. Therefore, we did further research on POSTN and PAPPA as potential fetal CHD biomarkers.
POSTN (periostin) is located in the extracellular matrix and is involved in cell-matrix interaction, cell fate determination, migration, and proliferation during development [7,8]. It is closely related to cardiovascular diseases, such as congenital heart valve disease, ventricular remodeling after myocardial infarction, and myocardial fibrosis [8]. Therefore, it has been considered a new biomarker of cardiovascular diseases [9]. Although periostin is expressed in early embryonic development, in valves, the outflow tract, and atrioventricular tissues, and in heart injury or disease [10][11][12], it is hardly expressed in healthy adult myocardium. Mesenchymal cell lines express periostin during heart development, promoting the formation of the atrioventricular outflow tract, valves, and chordae tendineae [11]. Periostin is necessary for mesenchymal cells to transform into cardiac fibroblasts (CFs). Increased periostin expression was detected during embryonic development in developing atrioventricular tissues [12]. In the absence of periostin, CF chemotaxis is weakened, and the mesenchymal cells transform into other mesoderm cells, such as cardiomyocytes [13]. In our study, POSTN was significantly down-regulated in the CHD group, which may be related to a weakened capacity for migration, differentiation, and proliferation in cardiac mesenchymal cells during cardiac development.
The TGF-β/Smads signaling pathway is involved in cardiac development, and POSTN is a downstream response factor. Exogenous treatment of primary cardiac fibroblasts and vascular smooth muscle cells (VSMCs) with recombinant TGF-β1 promoted the expression of periostin via canonical SMAD-dependent signaling [14,15]. Periostin−/− mice were susceptible to matrix stratification disorder, reduced TGF signal transduction, protein-polysaccharide aggregation, discontinuous valve lobules, and delamination defects [16]. Lack of periostin can lead to low deposition and maturation of collagen fibers in the valve [17], atrial septal defect [13], abnormal structure of the atrioventricular valve and supporting tensors, and abnormal differentiation of atrioventricular cushion stroma [13]. Furthermore, our RT-PCR results showed that the expression of TGFβ1 mRNA, Smad2/3 mRNA, and POSTN mRNA in the CHD group was significantly lower than that in the control group, which suggests that POSTN may participate in the occurrence of CHD through the TGF-β/Smads signaling pathway and that the decreased expression of POSTN might result from inhibition of TGF-β/Smads signaling. However, gain-of-function and loss-of-function experiments are necessary to prove it.
PAPPA, a secreted metalloproteinase, can specifically cleave IGFBP-4 (mainly) and IGFBP-5 and thereby release IGF. IGF (1/2) can promote the proliferation of mesoderm cells, thus promoting the formation of cardiac progenitor cells (which differentiate into cardiomyocytes and vascular cells) [18]. PAPPA is associated with pregnancy-related diseases and is involved in many cardiovascular diseases, for which it is considered a candidate marker [19,20]. Interestingly, PAPPA also interacts with the TGF-β signaling pathway downstream factors Smad2, Smad3, and Smad9 [21], consistent with our PPI results.
Our proteomic results showed an interaction between POSTN and PAPPA (Figure 1-E). Both POSTN and PAPPA are expressed in fibroblasts and mainly located in the extracellular matrix (ECM). The ECM contains many structural and non-structural proteins that interact dynamically with distinct and differentiating cell types; it serves as a reservoir and processing site for signaling molecules and forms communication corridors for protein and genetic information. Therefore, changes in ECM components can disturb signal transmission in the heart and lead to CHD. In our study, POSTN was significantly down-regulated and PAPPA was up-regulated, indicating that this dynamic balance and signal transduction may be disrupted. Eventually, abnormalities in the biological behavior of cells result in abnormal cardiac structural development. In addition, PAPPA interacts with Smad2 and Smad3, which may mean that PAPPA interacts with POSTN through the TGF-β/Smads signaling pathway. PAPPA is also likely to serve as a critical molecular link between IGF and TGF-β signaling. However, this specific interaction mechanism needs further exploration.
DIA avoids the bias of data-dependent acquisition (DDA) toward collecting and fragmenting high-abundance peptides and therefore has significant advantages in the quantification of low-abundance peptides. Furthermore, PRM provides accurate quantification and can replace non-specific, low-throughput immunoassay-based technologies while offering higher resolution and quantification accuracy. We combined DIA with PRM, bioinformatics analysis, and basic molecular biology techniques to seek out potential biomarkers of fetal CHD, such as POSTN and PAPPA.
This study has several advantages. First, potential biomarkers of fetal CHD that co-exist in gravida serum and AF can genuinely reflect fetal development and be applied to clinical noninvasive prenatal diagnosis. Second, our results are representative and reliable: after the potential protein markers were screened by DIA, PRM and ELISA were used to validate these candidate proteins in different sample sets, and molecular biology techniques were then applied to verify them further. Third, the related signaling pathway and the potential mechanism by which POSTN is involved in fetal CHD were explored preliminarily. Nevertheless, this study has some shortcomings. It is challenging to collect large samples in a limited time. The sources of these potential biomarkers in AF and maternal serum need to be explored further at the microscopic level. The specific mechanism of these biomarkers needs to be explored further through cell and animal experiments.

Conclusion
CHD is the most common congenital disability worldwide and lacks objective diagnostic biomarkers. We found, for the first time, that POSTN and PAPPA in maternal serum are related to fetal CHD and that there is an interaction between POSTN and PAPPA, which may be mediated by the TGF-β/Smads signaling pathway.
The sensitivity and specificity of the combined diagnostic model of POSTN and PAPPA in GS were 100% and 95%, respectively; the AUC value was 0.9684. Our results can contribute to developing new methods for noninvasive prenatal diagnosis of fetal CHD and may lay a foundation to reduce congenital disabilities and elucidate the mechanisms.

Figure 1
1-E shows the PPI of the 27 differential proteins, in which dots represent differential proteins/genes, red dots represent up-regulation, and green dots represent down-regulation; dot size represents the degree of connection, with larger dots indicating a higher degree of connection.

Figure 2
The results from PRM. 2-A shows the differential proteins identified by PRM. 2-B and 2-C show the box plots of PAPPA and POSTN, respectively. The x-axis shows the sample grouping, and the y-axis shows the protein expression intensity. *p < 0.05, **p < 0.005, and NS means no significant difference.

Figure 3
The average levels of POSTN and PAPPA from ELISA, and the regression and ROC analysis. 3-A shows the average levels of POSTN and PAPPA in serum and AF (*p < 0.05, **p < 0.01). The model for prenatal diagnosis of fetal CHD with PAPPA combined with POSTN in GS is shown as the purple dotted line in Figure 3-B (sensitivity 100%, specificity 95%, AUC 0.968, cutoff value 0.5156736, P < 0.05, 95% CI: 0.904, 1.00).