2.1 Data sources and cell materials
Gene expression profile (GEP) data, gene mutation data, and clinical information for prostate cancer (PCa) patients were obtained from TCGA (https://portal.gdc.cancer.gov/cart; accessed May 1, 2020). We downloaded RNA-Seq FPKM data for 499 PCa samples and 52 normal prostate samples, together with SNV mutation data for 440 PCa cases. These data sets also include clinical data for 551 patients, including sex, age at diagnosis, T stage, survival status, and survival time in days. The GSE3325 and GSE55945 prostate cancer data sets in GEO are both based on the GPL570 platform and comprise 40 samples in total (26 tumor samples and 14 normal samples). The human genome annotation GTF file was downloaded from GENCODE (http://www.gencodegenes.org/). We downloaded 2498 immune-related genes from the Immunology Database and Analysis Portal (ImmPort; https://www.immport.org/shared/genelists), which groups them into 17 immune categories based on molecular function [7,8]. The TSNAdb database (https://services.healthtech.dtu.dk/) contains somatic mutation data and human leukocyte antigen (HLA) allele information for 7748 tumor samples across 16 tumor types [9].
The main materials for the cell experiments were: PC-3 cells (prostate cancer cells; American Type Culture Collection); fetal bovine serum (Hyclone, Logan, UT, USA); F12K medium and Opti-MEM medium (GIBCO, Grand Island, USA); trypsin (Gibco, Grand Island, NY, USA); penicillin-streptomycin (Sigma-Aldrich, St. Louis, MO); anti-LTF antibody (ab109216), anti-STAT3 antibody (ab119352), anti-GM-CSF antibody (ab193345), goat anti-mouse IgG H&L (HRP) (ab205719), goat anti-rabbit IgG H&L (HRP) (ab205718), and goat anti-rat IgG H&L (HRP) (ab205720) (Abcam, Cambridge, UK); Lipofectamine 2000 transfection reagent (Thermo Fisher Scientific, NY, USA); PrimeSTAR HS DNA polymerase (Takara, Beijing, China); reverse transcription kit (1708843) (Bio-Rad Laboratories, California, USA); pIRES-neo3 vector (youBio, Changsha, China); and Afl II and Not I (New England Biolabs (Beijing) LTD, Beijing, China).
2.2 Analysis of tumor immune cell infiltration based on the CIBERSORT algorithm
The R package CIBERSORT (version 3.6.3) was used to estimate the relative abundance of 22 tumor-infiltrating immune cell (TIIC) types in PCa samples [10], namely: naive B cells (Bn), memory B cells (Bm), plasma cells, CD8+ T cells, naive CD4+ T cells (CD4+ Tn), resting memory CD4+ T cells (CD4+ Tmr), activated memory CD4+ T cells (CD4+ Tma), follicular helper T cells (Tfh), regulatory T cells (Tregs), γδ T cells (γδT), resting natural killer cells (NKr), activated natural killer cells (NKa), monocytes, M0 macrophages (M0), M1 macrophages (M1), M2 macrophages (M2), resting dendritic cells (DCr), activated dendritic cells (DCa), resting mast cells (Mr), activated mast cells (Ma), eosinophils, and neutrophils. Using the CIBERSORT deconvolution algorithm, the PCa GEP matrix (TCGA and GEO data) was compared with the signature matrix of the 22 TIIC types provided by CIBERSORT (https://cibersortx.stanford.edu/), and the relative abundance of the 22 TIIC types in each sample was determined. Samples with a significant deconvolution result (P < 0.05) were selected for subsequent analysis.
2.3 Analysis of tumor microenvironment based on ssGSEA
Single-sample gene set enrichment analysis (ssGSEA; http://software.broadinstitute.org/gsea/msigdb/index.jsp) extends GSEA to cases in which standard GSEA cannot be applied because only a single sample is available. The gene sets it uses group genes by shared biological function, chromosomal location, or physiological regulation [12]. The principle is similar to that of GSEA: ssGSEA ranks the genes within each sample according to the expression profile and then computes an enrichment score for each gene set from those ranks. We used ssGSEA, run with the R package GSVA (version 3.6.3), to compare the normalized PCa GEP data with 29 immune-associated gene sets representing diverse immune cell types, functions, and pathways, and thereby to quantify their activity or enrichment levels in the cancer samples. The 29 immune-associated gene sets were [13]: Mast cells, Type II IFN Response, Macrophages, Neutrophils, Parainflammation, CCR, Treg, APC co-stimulation, APC co-inhibition, T helper cells, T cell co-inhibition, Check-point, T cell co-stimulation, TIL, DCs, HLA, pDCs, aDCs, Type I IFN Response, NK cells, Th2 cells, B cells, Tfh, iDCs, MHC class I, Th1 cells, CD8+ T cells, Cytolytic activity, and Inflammation promoting. The samples were then divided into three groups by clustering on immune activity, using the R package sparcl (version 3.6.2). Finally, the three clusters were scored with the ESTIMATE algorithm [14] and designated the high, medium, and low immunity groups. We then examined the association of the low, medium, and high immune scores with tumor purity, HLA, PD-1, and survival in prostate cancer in order to identify factors that may affect PCa progression.
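As an illustration of the rank-based scoring described above, the following Python sketch computes a simplified single-sample enrichment score for one gene set. It is not the GSVA implementation used in this study: the function name ssgsea_score, the toy gene symbols, and the weighting exponent alpha = 0.25 are assumptions made for illustration, and no normalization across gene sets is applied.

```python
def ssgsea_score(expression, gene_set, alpha=0.25):
    """Simplified ssGSEA-style enrichment score for one sample.

    expression: dict mapping gene symbol -> expression value in one sample.
    gene_set:   set of gene symbols.
    alpha:      exponent applied to the rank weights (assumed value).
    """
    # Order genes from highest to lowest expression; ties keep input order.
    ordered = sorted(expression, key=expression.get, reverse=True)
    n = len(ordered)
    hits = [gene in gene_set for gene in ordered]
    n_hits = sum(hits)
    if n_hits == 0 or n_hits == n:
        raise ValueError("gene set must contain some, but not all, genes")

    # The top gene has rank n and the bottom gene rank 1; only genes in
    # the set carry a weight.
    weights = [(n - i) ** alpha if hit else 0.0 for i, hit in enumerate(hits)]
    total_weight = sum(weights)
    miss_step = 1.0 / (n - n_hits)

    cdf_hit = cdf_miss = 0.0
    score = 0.0
    for weight, hit in zip(weights, hits):
        if hit:
            cdf_hit += weight / total_weight
        else:
            cdf_miss += miss_step
        # Accumulate the gap between the two empirical distributions.
        score += cdf_hit - cdf_miss
    return score


# Toy sample with hypothetical gene symbols.
sample = {"G1": 10.0, "G2": 9.0, "G3": 8.0, "G4": 1.0, "G5": 0.5, "G6": 0.1}
high_score = ssgsea_score(sample, {"G1", "G2"})  # set at the top of the ranking
low_score = ssgsea_score(sample, {"G5", "G6"})   # set at the bottom of the ranking
```

In this toy sample, a gene set concentrated among the most highly expressed genes receives a positive score and a set concentrated among the least expressed genes receives a negative one.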
The tumor microenvironment scores of the TCGA cohort GEP data, comprising the stromal score and the immune score, were calculated with the ESTIMATE algorithm. Samples were divided into a high-score group and a low-score group at the median score, and differential expression between the two groups was analyzed with the Wilcoxon test using the R package limma (version 3.8). Genes meeting the thresholds P < 0.05 and |log fold-change| > 1.0 were retained.
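The median split and threshold screen can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the limma workflow itself: the function names, the base-2 logarithm, the pseudocount of 1, and the assignment of samples exactly at the median to the low group are all assumptions, and the P values are placeholders standing in for the output of a separate Wilcoxon test.

```python
from math import log2
from statistics import mean, median


def split_by_median(scores):
    """Split sample IDs into (high, low) groups at the median score."""
    cutoff = median(scores.values())
    high = [s for s, v in scores.items() if v > cutoff]
    low = [s for s, v in scores.items() if v <= cutoff]  # ties go to "low" (assumption)
    return high, low


def log2_fold_change(expr, high, low, pseudocount=1.0):
    """log2 ratio of mean expression (high vs. low), with an assumed pseudocount."""
    mean_high = mean([expr[s] for s in high]) + pseudocount
    mean_low = mean([expr[s] for s in low]) + pseudocount
    return log2(mean_high / mean_low)


def screen_genes(stats, p_cutoff=0.05, lfc_cutoff=1.0):
    """Keep genes with P < p_cutoff and |log fold-change| > lfc_cutoff."""
    return sorted(
        gene for gene, (p, lfc) in stats.items()
        if p < p_cutoff and abs(lfc) > lfc_cutoff
    )


# Hypothetical immune scores and the expression of one hypothetical gene.
immune_score = {"s1": 100, "s2": 200, "s3": 300, "s4": 400}
high, low = split_by_median(immune_score)
gene_expr = {"s1": 1.0, "s2": 1.0, "s3": 7.0, "s4": 7.0}
lfc = log2_fold_change(gene_expr, high, low)

# Placeholder (P value, log fold-change) pairs for four hypothetical genes.
stats = {
    "GENE_A": (0.01, lfc),
    "GENE_B": (0.20, 3.0),
    "GENE_C": (0.001, -1.5),
    "GENE_D": (0.01, 0.5),
}
selected = screen_genes(stats)
```

Only genes that pass both thresholds are kept, so a gene with a large fold change but a non-significant P value (GENE_B here) is excluded.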
2.4 Analysis of tumor mutation load based on bioinformatics
Highly mutated tumors are reported to be more likely to carry neoantigens [16], which make them targets for activated immune cells, and neoantigen load is associated with the response to immunotherapy [17]. In this study, mutation data for 440 PCa samples (including 187,748 mutant gene records) were extracted from the TCGA database and visualized with the R package maftools (version 3.6.2). The tumor mutation burden (TMB) of each PCa sample was calculated with Perl (https://www.perl.org/; version 5.30.0) [18]. The PCa GEP data were divided into a high-TMB group and a low-TMB group according to TMB value, and the relationship between TMB and lymphatic metastasis was analyzed. The rank-sum test was used to compare PCa GEPs between the high- and low-TMB groups, and genes meeting the thresholds P < 0.05 and |log fold-change| > 1.0 were retained.
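For illustration, a per-sample TMB calculation of this kind can be sketched in Python. The text does not give the formula or the grouping cutoff, so this sketch assumes that TMB is the number of mutation records per sample divided by an exome size of 38 Mb and that the high and low groups are separated at the median. The function names are hypothetical; Tumor_Sample_Barcode is the sample column of the MAF format.

```python
from collections import Counter
from statistics import median

EXOME_SIZE_MB = 38.0  # assumed exome size in megabases; not stated in the text


def tmb_per_sample(maf_rows, exome_size_mb=EXOME_SIZE_MB):
    """Return mutations per megabase for each sample in a list of MAF-like rows."""
    counts = Counter(row["Tumor_Sample_Barcode"] for row in maf_rows)
    return {sample: n / exome_size_mb for sample, n in counts.items()}


def split_high_low(tmb):
    """Label each sample 'high' or 'low' relative to the median TMB (assumed cutoff)."""
    cutoff = median(tmb.values())
    return {s: ("high" if v > cutoff else "low") for s, v in tmb.items()}


# Toy MAF-like records for three hypothetical samples.
rows = (
    [{"Tumor_Sample_Barcode": "S1"}] * 76
    + [{"Tumor_Sample_Barcode": "S2"}] * 38
    + [{"Tumor_Sample_Barcode": "S3"}] * 19
)
tmb = tmb_per_sample(rows)
groups = split_high_low(tmb)
```

A real analysis would typically first restrict the records to the variant classes of interest; that filter is omitted here because the text does not specify it.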
2.5 Acquisition of key genes in prostate cancer
First, the differentially expressed genes obtained from the microenvironment analysis and from the TMB analysis were compared by Venn analysis [19] (http://bioinformatics.psb.ugent.be/webtools/Venn/) to obtain the overlapping genes. Next, these overlapping genes were intersected, again by Venn analysis, with the immune-related genes from the ImmPort database (https://www.immport.org/shared/genelists) to obtain candidate genes related to the PCa immune microenvironment. Finally, univariate logistic regression was performed on each candidate gene against the TNM staging data of prostate cancer to evaluate its association with lymph node metastasis [20,21]. The most strongly associated gene was considered the most likely immune-related target gene in PCa progression.
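The two successive Venn intersections amount to set operations, sketched below in Python. The function name and the gene symbols are hypothetical placeholders, not results from this study.

```python
def candidate_genes(tme_degs, tmb_degs, immune_genes):
    """Return (overlap of the two DEG lists, overlap restricted to immune genes)."""
    overlap = set(tme_degs) & set(tmb_degs)
    immune_overlap = overlap & set(immune_genes)
    return overlap, immune_overlap


# Hypothetical gene lists for illustration only.
tme_degs = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
tmb_degs = ["GENE_B", "GENE_C", "GENE_D", "GENE_E"]
immune_genes = ["GENE_C", "GENE_D", "GENE_F"]

overlap, immune_overlap = candidate_genes(tme_degs, tmb_degs, immune_genes)
```

The second set returned holds the candidate genes that would then enter the univariate logistic regression.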
2.6 Verification and analysis of the LTF gene in pan-cancer data
Transcriptome data (FPKM), mutation data (VarScan), and clinical data for 33 major tumor types in the TCGA database were downloaded from the UCSC Xena website, and Perl was used to extract the LTF expression matrix for the following analyses. The tumor types are listed in Table 1.
The difference in LTF expression between the normal group and the tumor group was analyzed with the Wilcoxon test, and LTF expression in each tumor type was visualized with the R package ggpubr (version 3.6.3).
The progression-free interval (PFI) status and time for each tumor type were extracted from the TCGA clinical information. Using the R packages survival, survminer, and forestplot (version 3.6.3), the LTF expression matrix was merged with the PFI data of each sample. Samples were divided into high- and low-expression groups at the median LTF expression level, and PFI survival curves were plotted. Finally, the effect of LTF expression on survival was estimated with a Cox proportional hazards model and presented as a forest plot.
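To make the grouping and survival-curve step concrete, the sketch below splits samples at the median expression value and computes a Kaplan-Meier estimate for each group in plain Python. It stands in for the R survival and survminer packages only as an illustration: the function names and toy values are hypothetical, samples exactly at the median are assigned to the low group by assumption, and the Cox model is not reproduced.

```python
from statistics import median


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (event time, survival probability).

    events: 1 if the event was observed at that time, 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0
        leaving = 0
        # Gather all samples sharing this time point.
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            leaving += 1
            i += 1
        if observed:
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def km_by_median(expression, times, events):
    """Split samples at the median expression and estimate a curve per group."""
    cutoff = median(expression)
    groups = {"high": ([], []), "low": ([], [])}
    for x, t, e in zip(expression, times, events):
        name = "high" if x > cutoff else "low"  # ties go to "low" (assumption)
        groups[name][0].append(t)
        groups[name][1].append(e)
    return {name: kaplan_meier(t, e) for name, (t, e) in groups.items()}


# Toy data: expression, follow-up time in days, and event indicator per sample.
expression = [1, 2, 3, 4, 5, 6, 7, 8]
times = [1, 2, 3, 4, 5, 6, 7, 8]
events = [1, 0, 1, 1, 0, 1, 0, 0]
curves = km_by_median(expression, times, events)
```

Censored samples lower the number at risk without producing a step in the curve, which is why the high group in this toy example has a single step.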
Perl was used to calculate TMB values from the mutation data of 10,114 samples across the 33 tumor types. The correlation between LTF expression and TMB was tested by Spearman correlation analysis in R (version 3.6.3), and the R package fmsb (version 3.6.3) was used to draw a radar chart of the LTF-TMB correlation in each tumor type. Microsatellite instability (MSI) data were extracted from TCGA for a total of 10,416 samples. The correlation between LTF expression and MSI in each tumor type was then tested and visualized in the same way as for TMB.
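The Spearman coefficient used in these correlation tests is the Pearson correlation of the ranks of the two variables. A minimal pure-Python sketch is given below. The function names and toy values are hypothetical, tied values receive their average rank, and the significance test is omitted.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (sd_x * sd_y)


# Toy expression and TMB values for five hypothetical samples.
ltf_expression = [1.0, 2.0, 3.0, 4.0, 5.0]
tmb_values = [2.0, 4.0, 5.0, 9.0, 20.0]
rho = spearman_rho(ltf_expression, tmb_values)
```

Because only the ordering matters, any strictly increasing relationship between the two variables gives a coefficient of 1, as in this toy example.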
Microenvironment scores were calculated for the samples of the 33 tumor types and matched with the LTF expression matrix. The correlation between LTF expression and microenvironment score was then tested by Spearman correlation analysis in R (version 3.6.3), with a significance threshold of P < 0.05, and visualized with the R packages ggplot2, ggpubr, and ggExtra (version 3.6.3).
2.7 LTF gene enrichment analysis (GSEA)
To study the potential molecular mechanisms of the key gene we screened, we carried out gene set enrichment analysis (GSEA; https://www.gsea-msigdb.org/gsea/index.jsp) covering KEGG pathways (C2), GO terms (C5), and oncogenic signature gene sets (C6). The number of permutations in the GSEA software was set to 1000, and the results for C2, C5, and C6 were filtered at P < 0.05 and q < 0.25. The composite GSEA plots were generated with the R packages "plyr", "ggplot2", and "grid".
2.8 Cell culture
PC-3 cells were subcultured in DMEM (GIBCO, Grand Island, USA) containing penicillin (final concentration 100 U/mL), streptomycin (final concentration 100 μg/mL), and 10% FBS (Hyclone, Logan, UT, USA).
2.9 Plasmid construction
2.9.1 Total RNA extraction and PCR amplification
Total RNA was isolated from tumor tissues and cells with Trizol reagent (Invitrogen, USA). cDNA was obtained by reverse transcription in a 20 μL reaction, and an appropriate amount of cDNA was used as the template for PCR amplification in a 25 μL reaction. All primers were designed according to standard primer design principles. The primers were synthesized by Nanjing Jin si Rui Biotechnology Co., Ltd. (youBio, Changsha, China). The specific primers are shown in Table 2.
2.9.2 Ligation and transformation, colony PCR identification, plasmid extraction, and cell transfection
The vector linearized by double enzyme digestion and the annealed double-stranded DNA were ligated with T4 DNA ligase. The identified positive clones were inoculated into LB liquid medium containing ampicillin (Amp) and cultured at 37 °C for 16 hours, and an appropriate amount of the bacterial culture was sent for sequencing. Plasmid was extracted from cultures with the correct sequence. For transfection, 0.8 μg of plasmid was diluted in 50 μL of Opti-MEM serum-free medium to prepare reagent A, and 2 μL of Lipofectamine 2000 was diluted in 50 μL of Opti-MEM serum-free medium to prepare reagent B. Follow-up experiments were carried out 48 hours after transfection. The vector sequence is shown in Figure 2.
2.9.3 Western blotting
Protein samples were extracted with RIPA cell lysis buffer supplemented with protease and phosphatase inhibitors (Beyotime Biotechnology, Shanghai, China). Total protein content was determined by the BCA method (Beyotime Biotechnology, China). The protein samples were separated by electrophoresis on a 10% sodium dodecyl sulfate-polyacrylamide gel (SDS-PAGE), transferred to a polyvinylidene fluoride (PVDF) membrane, blocked with 5% skimmed milk, and incubated overnight at 4 °C with primary antibody diluted in TBST (LTF and STAT3, 1:5000; GM-CSF, 1:500). The self-sealing bag was then cut open and the membrane was washed three times with TBST. The membrane was then incubated with secondary antibody (1:5000) at room temperature for 1 h. Finally, the bands were imaged on a Tanon 6600 luminescence imaging workstation, and optical density was analyzed with Image-Pro Plus software. Differences between groups were assessed by one-way ANOVA followed by Tukey's test, and P < 0.05 was considered statistically significant.