A fine-grained cell type deconvolution of bulk transcriptome from single-cell RNA-seq data reveals myeloid cells heterogeneity in lung adenocarcinomas

Background: Tumor infiltrating myeloid cells (TIM) constitute a vital element of the tumor microenvironment. The cell-type heterogeneity of TIM has yet to be fully investigated. Methods: We used a time saving approach to generate a single-cell reference matrix, allowing quantification of cell-type proportions and cell-type-specific gene abundances in bulk RNA-seq data. Results: Two distinct clusters, MSC1 and MSC2 (MSC subtype) were newly identified in lung adenocarcinoma (LUAD) patients, both significantly associated with overall survival and immune blockade therapy responses. Twenty myeloid cell types were detected. Thirteen of these had distinct enrichment patterns between MSC1 and MSC2. LAMP3+ dendritic cells, being a mature and transportable subtype of dendritic cell which may migrate to lymph nodes, were noted as associated with non-responsiveness to immunotargeted therapy. High infiltration level of IFIT3+ neutrophils was strongly related to the response to immune targeted therapy and was seen to activate CD8+ T cells, partly through inflammasome activation. The infiltration levels of TIMP1+ macrophages and S100A8+ neutrophils were both significantly associated with poor prognosis. TIMP1+ macrophages were noted to recruit S100A8+ neutrophils via the CXCL5-CXCR2 axes and promote LUAD progression. Conclusion: Altogether, we performed virtual micro-dissection of the bulk transcriptome at single-cell resolution and provided a promising TIM infiltration landscape that may shed new light on the development of immune therapy.


Introduction
Lung adenocarcinomas (LUAD) account for over 40% of lung cancers and represent its most leading and prevalent histological subtype. Despite an improvement in therapeutic strategies, the rates of objective clinical responses remain low, with only 17.4% lung cancer patients surviving more than five years beyond diagnosis. In this, the dynamic tumor immune microenvironment plays an important role in tumor progression and metastasis [1].
Tumor-infiltrating T lymphocytes are now recognized as the key components of the tumor microenvironment (TME). Therapeutic strategies for targeting these cells are being actively developed and have demonstrated remarkable therapeutic effects [2]. Whilst current immunotherapies targeting T lymphocytes benefit only few patients [3], it is important to unravel the exact cellular functions of the remaining cell types within the TME that may be involved in tumor progression.
Recently, there has been much focus on cancer immunology, with primary emphasis on myeloid cells as important components in tumor immune evasion [4,5]. However, the various reports that ascribe macrophages and neutrophils with either pro-or anti-tumor properties, together with acknowledgement that our understanding of tumor infiltrating myeloid cell (TIM) subtypes is quite inadequate [6,7], leads to a high potential for confusion and/or contradiction within this field. Despite this, some strategies targeting myeloid cells have been developed [8,9]. However, the limited understanding of clear mechanistic hypotheses had led to difficulties in interpreting clinical outcomes for such approaches [10]. In particular, the complexity of TIM subtypes and the discrepancies between human and mouse models has deeply impeded the implementation of selective myeloid-targeting immunotherapies.
Single-cell RNA sequencing (scRNA-seq) offers an opportunity to dissect the complexity of the TME, enabling the identification of the cell state in a manner independent of any previous knowledge of cellular markers. Single-cell analysis has been applied to reveal the cellular heterogeneity of TIMs, including tumor associated macrophages (TAMs), dendritic cells (DCs) and neutrophils in different cancer types [11,12]. However, due to high cost and strict requirement for cellular activity, analyses of large patient cohorts have been almost impossible. Computational algorithms, which allow for the estimation of relative cell infiltrate level based on bulk RNA sequencing (RNA-seq) and scRNA-seq data, have now been developed and may compensate for this. CIBERSORTx is a computational framework to accurately infer relative abundance of cell type from bulk RNA-seq according to the signature matrix generated from scRNA-seq by means of a deconvolution algorithm [13]. It has been successfully used and validated for revealing immune cell landscapes in melanomas [14], clear-cell renal cell carcinomas [15] and prostate cancer [16]. Unfortunately, CIBERSORTx does not seem to provide a standard procedure or pipeline describing how to integrate scRNA-seq data to construct a reference matrix, as it just uses all the scRNA-seq data as the input. As such, this procedure requires substantial computational resources and time to handle the huge amount of the data. Although down-sampling can address this problem, scRNA-seq data usually suffers from the problem of extremely high dropout rate, especially for those generated by 10x Genomics. Thus, the reference matrix generated by down-sampling is often unhelpful for such considerations and will result in a poor deconvolution effect.
In this paper, we applied a timesaving approach to create a customized reference matrix for scRNA-seq data for myeloid cells and made a deconvolution of the TCGA LUAD cohort.
As a result, we identified for the first time two different enrichment patterns (called MSC subtypes hereafter). A series of analyses, including survival analysis and multivariable Cox regression analysis with clinical features, revealed MSC subtypes to be robust prognostic factors. We reveal the relationship between MSC subtypes and immune checkpoint blockade (ICB) therapy and identify three TIM subtypes that might contribute to the ICB response.
Finally, we explore the heterogeneity of macrophages and detected a functional relationship between macrophage and neutrophil subtypes.

Data source and pre-processing
The single-cell transcriptome file of five LUAD patients and the validation data for DCs' distribution were downloaded from the Gene Expression Omnibus (GEO) database under the accession number GSE127465 [17] and GSE131907 [18]. The transcriptome expression profiles and corresponding clinical information for LUAD were retrieved from the Genomic Data Commons Data Portal of The Cancer Genome Atlas (TCGA). Expression data were converted from counts type to transcripts per million (TPM). Three transcriptional microarray expression data (GSE matrix files) for LUAD cohorts (GSE31210 [19], and GSE72094 [20]) were obtained from the GEO. The microarray datasets were log-transformed (on a base 2 scale) and genes were detected with more than one probe retaining its maximum value. Figure 1 shows a conceptual view of our fine-grained cell type deconvolution and its application to LUAD analysis.

scRNA-seq data and single-cell data analysis
The Seurat package (version 3.0) was used to perform scRNA-seq analysis [21].
Transcriptomes with more than 300 total counts, less than 10,000 total counts and less than 20% of counts coming from mitochondrial genes, were retained for subsequent analysis. From the remaining cells, gene expression matrices were normalized to the total unique molecular identifier (UMI) counts per cell and were log-transformed (on a base 2 scale). Dimensionality reduction was performed with uniform manifold approximation and projection (UMAP). The marker gene of each cluster was identified using Seurat.

Single cell reference matrix construction
The most characteristic cells in each cell subtype, rather than all tens of thousands of cell's data, were selected to create the custom signature matrix. We defined and calculated a so-called cell-type-specific signature score SigScore to select the candidate cells from each cell subtype. We ranked all cells belonging to a special cell subtype in descending order of their SigScore and chose the top 50 cells to create the custom signature matrix. In addition, we created another custom signature matrix by randomly selecting 50 cells of each cell type, which would then be used as another matrix for comparison.

Infiltration estimation of the myeloid cells
ScRNA-seq data of the top 50 cells was uploaded to CIBERSORTx (http://cibersortx.stanford.edu) to create a customized signature matrix for each myeloid cell subtype. A hundred simulated bulk datasets were created by random sampling of different numbers of each cell types (including non-myeloid cells) and these were used to validate the signature matrix. Since count data were uploaded for creating the signature matrix, count-per-million (CPM) data of the TCGA LUAD cohort within the genes involved in the signature matrix were then generated to estimate the abundance of myeloid cell subtypes in CIBERSORTx.

MSC subtype identification and prediction model building
Based on the infiltration level of myeloid cell subtypes, the optimal number of TCGA LUAD cohort clusters were examined using the mclust package (version 5.4.5) [22]. K-Means consensus clustering was conducted in R to determine distinct clusters of MSC subtypes. The least absolute shrinkage and selection operator (LASSO) algorithm was used to reduce the data dimensions and distinguish the most informative genes for predicting the MSC subtype using the glmnet package (version 3.0-2) [23]. Finally, the MSC score formula was calculated by considering the correlation estimated Cox regression coefficients: where Z denotes the gene number determined by LASSO, [ ] denotes the expression level of gene i, and represents the coefficient of gene i as determined by LASSO.

Potential drug targeted gene set selection
To identify the potentially druggable therapeutic targets for the patients of identified MSC subtypes, we collected two datasets, the genome-scale CRISPR knockout screens dataset in Project Achilles (https://depmap.org/portal/) and drug-induced gene expression profiles from https://commonfund.nih.gov/LINCS/) L1000 dataset. We then performed a two-step analysis to identify candidate drugs. Firstly, we filtered the essential genes for LUAD cell lines based on the Genome-wide CRISPR gene essentiality scores (CERES) from Project Achilles. The genes whose CERES was lower than -0.5 in half of the total LUAD cell lines were retained and then intersected with the up-regulated genes in MSC2 patients. We then ranked the gene expression profiles of LUAD cell lines obtained from LINCS and performed Gene Set Enrichment Analysis (GSEA) using clusterProfiler package based on the above target gene set [24]. Only when the target gene set was significantly enriched in the bottom of ranked gene list, the drug was then considered to have potential.

Calculation of ligand-receptor interaction
For the cell-cell interaction analysis, the expression level was normalized according to the total reads count and converted into a TPM-like scale. The expression values were averaged within each cell subtype. We retrieved the ligand-receptor pairs from a systematic research including known ligand-receptor pairs from the existing databases and predicted the ligand-receptor pairs with high confidence [25]. The threshold of 1 TPM was used as the cut-off for ligand-receptor pairs within each cell subtype for further analysis.

Statistical analysis
The differentially expressed genes (DEGs) were calculated using the DESeq2 package for R [26]. DEGs satisfying |log2FoldChange| > 1.5 and adjusted P-value < 0.05 criteria were considered statistically significant. Clusterprofiler was used to preform GO function enrichment and KEGG pathway annotation. Within a specific cohort, patients were divided into two groups based on the mean value of in all samples. Survival curves were constructed using the Kaplan-Meier (KM) method and compared using the log-rank test provided in the survival package for R [27]. Multivariate Cox proportional hazard regression modeling was used to verify the prognostic significance for OS. Histologic grade, gender, and age were used as variables. To identify the relationship between clinical state and myeloid cells, we queried the clinical data of the TCGA LUAD cohort. In particular, the TMN stages were categorized to a numeric level. The correlation between the infiltration level of myeloid cells and clinical variables was examined using Pearson's correlation coefficient (CC), which was considered statistically significant by FDR < 0.05. Area under the curve (AUC) of the receiver operating characteristic (ROC) curve was used to assess the predictive ability of the predicted signature.

Construction and validation of -based reference matrix
We downloaded the scRNA-seq data from Zilionis et al [17] in which the authors demonstrated the major aspects of the lung tumor immune microenvironment. Twenty subtypes of myeloid cells were identified using Seurat ( Figure S1A). The corresponding scRNA-seq data of the top cells with the highest to CIBERSORTx were then uploaded and the underlying reference matrix obtained.
We examined the -based reference matrix compared to a randomly selected reference matrix within 100 simulated datasets. Our reference matrix showed better performance ( Figure 2A).

Identification of two distinct myeloid cell infiltration subtypes in LUAD patients
We applied the -based reference matrix to investigate the fractions of infiltrated myeloid cells in the TCGA LUAD dataset ( Figure S1B). Among the total samples, 485 tumor samples were eligible for CIBERSORTx under P-value < 0.05 and CC > 0.5. We performed K-means clustering with the optimal number (k = 2) and identified two distinct myeloid cell infiltration clusters, namely MSC1 and MSC2, according to the contextures of the myeloid cells ( Figures S1C and S1D). Thirteen of 20 had distinct enrichment patterns between MSC1 and MSC2 (|log2FoldChange| > 1.25, P-value < 0.05) ( Figure 2B). MSC2 was significantly associated with a shorter OS compared with MSC1 ( Figure 2C, P-value = 0.00074). GO analysis suggested that the MCS1 subtype, with its favorable outcome, has a stronger immune response ability including both innate and adaptive immunity. Conversely, the MCS2 subtype was significantly associated with the cell cycle and negative regulation of cell apoptosis, in line with unfavorable outcomes ( Figure 2D).
We then queried the distribution of distinct TIM cell types within tumors and adjacent tissues. The statistical result is showed in Table S1. Some of those cell types are strongly related with OS. A high fraction of LAMP3+DC3, S100A8+N3, PI3+N4, TIMP1+M3 and TUBB+Mcyl were identified as poor prognostic factors, while high fraction of CLEC9A+DC2 were identified as a protective ( Figure S2).
We then observed that SELENOP+M4, the most enriched cell type in MSC1, was the subtype of macrophages which highly expressed SELENOP and TM4SF1 mRNAs.
TUBB+Mcyl, the most enriched cell type in MSC2, was the subtype of the monocytes which highly expressed TUBB. The infiltration levels of two subtypes between the tumor and paired adjacent tissues of all patients, MSC1 patients and MSC2 patients were further analyzed, respectively. SELENOP+M4 showed preferential enrichment in normal tissues. While considering the MSC subtypes, we found a significant lower infiltration level of SELENOP+M4 in MCS2, while TUBB+Mcyl showed the opposite trends ( Figure 2E). The infiltration levels of other cell subtypes in three states are shown in Figure S3.

Association of predicted MSC subtype with prognostic impact in TCGA and two independent LUAD cohorts
To predict the MSC subtype for bulk RNA-seq or microarray data, we determined the most informative genes and constructed a resulting 14-gene signature (i. e. ) ( Table S2).
The TCGA LUAD cohort were randomly divided into a training set (n = 292) and a test set (n = 193). The LASSO Cox regression model with 20-fold cross validation was performed to train the model in the training set. We then assessed the model performance in the test set.
According to the , LUAD patients were well classified into MSC1 and MSC2 subtypes.
The AUC of the ROC curve achieved 0.91 and 0.89 in the training and test sets respectively, indicating a strong prediction ability to stratify the patients ( Figure 3A).
The prediction capability of 14-gene signature was executed to further examine the prognostic significance in two independent LUAD microarray cohorts (GEO72094 and GEO31210). Similarly, a poor prognostic impact of MSC1 compared with MSC2 was found in both cohorts (P-value < 0.0001 and P-value = 0.0012, respectively) ( Figures 3B and 3C).
We also observed that nearly twice the relapse rate had occurred in MSC2 compared with MSC1 in the GSE31210 dataset. This suggested that the myeloid cell distribution might be associated with cancer relapse ( Figure 3D).
To determine whether MSC subtype is an independent predictor of the prognosis, we performed multivariate analysis. In the TCGA LUAD and GSE72094 cohorts, MSC2 strongly predicted a shorter OS compared with MSC1, independent of known risk factors  Figure 3F and Figure S4A).

MSC subtype significantly associated with tumor stages of LUAD
We analyzed the distribution of in different tumor and TNM stages. The higher was associated with a higher tumor stage ( Figure 3E). We also observed a similar positive correlation in both T, M and N stages, suggesting the distribution of myeloid cells were potentially related to the clinical stage ( Figure S4B-D). To further investigate whether the relative presence of these myeloid cell subtypes was associated with tumor progression, we calculated the correlation between each myeloid cell subtype and cancer stage. Five cell types, TIMP1+M3, S100A8+N3, TUBB+Mcyl, LAMP3+DC3, TCL1A+pDC, all of which were enriched in MSC2, were positively associated with tumor progression (FDR < 0.05).

MSC Subtype associated with immunotherapy response
Myeloid cells have been reported to be associated with ICB [28]. In the TCGA LUAD cohort, we introduced a tumor immune dysfunction and exclusion (TIDE) algorithm to explore the relationship between the MSC subtype and ICB response [29]. TIDE is a computational framework for predicting the clinical response to ICB in patients. A low TIDE prediction score indicates that the patients would potentially exhibit a greater immune therapy response. We observed the TIDE score as significantly lower in MSC1, suggesting the MSC1 group is more likely to respond to ICB therapy (Wilcoxon rank-sum test, P-value < 0.05) ( Figure 4A). This association was verified in two independent cohorts using univariate analysis (Wilcoxon rank-sum test, P-value < 0.05) ( Figure 4B). In the other independent cohort (i.e. GSE126044), which includes the RNA-seq data and response states of 16 patients before anti-programmed cell death protein 1 (PD-1) treatment, we compared the TIDE and to evaluate the prediction performance of ICB response. All five responders of 16 patients were clearly identified as MSC1 ( Figure 4C). showed the best predictive power (AUC = 0.891) as compared to TIDE and PD-1 ( Figure 4D).

Identification of potential drug for MSC2 patients
Since MSC2 patients seem more unlikely to respond to ICB therapy and show a worse survival state compared with MSC1 patients, we next focused on identifying the potential drugs for patients of the MSC2 subtype. The upregulated DEGs in MSC2 patients are supposed to be the therapeutic target. Note that MSC subtypes are classified according to the marker genes of myeloid cells, leading us to query whether the DEGs represent the diversity of tumor microenvironment. Using a hypergeometric test, we found the myeloid cell markers were almost irrelevant with DEGs between MSC1 and MSC2 patients ( Figure S5A). That suggests that the DEGs might reflect differences in tumor cell state.
We screened the data of lung adenocarcinoma cell lines in Project Achilles (https://depmap.org/portal/). The genes with a CERES lower than -0.5 in half of the lung adenocarcinoma cell lines were retained. We then intersected the DEGs which were upregulated in MSC2 patients with survival-related genes and obtained 29 genes which represent the targets for ICB therapy in MSC2 patients ( Figures S5B-C).
Eight of 29 genes were related to the cell cycle, suggesting the tumor cells in MSC2 patients might be in a relatively strong state of cellular proliferation. The remaining genes were involved in a wide range of cancer-related pathways, such as DNA replication, Ras signaling, and mTOR signaling ( Figure S5C). Twenty-eight of 29 genes have been reported to be related with LUAD [30][31][32][33]. However, TOPBP1 interacting checkpoint and replication regulator (TICRR), had only been previously reported to be important in DNA replication [34].
In our study, TICRR seems to be a promising therapeutic target for LUAD, especially in MSC2 patients.
Since the potential drug target gene set was obtained, we employed LINCS L1000 dataset to identify potential drugs. We focused on four LUAD cell lines, DV90, SKLU1, NCIH2073, and NCIH596. Testing was conducted of 361 drugs on the four cell lines, with a total of 1498 gene expression profiles extracted. After computing the robust z-scores for each profile relative to control, we ranked the gene based on the expression levels and performed GSEA analysis. We totally identified 129 drugs showed potential inhibition effect for at least a particular cell line (FDR < 0.05, Figure 4E). To obtain more reliable drugs, we selected the drugs which showed significant suppression effects on all four cell lines and discovered 10 drug candidates ( Figure 4E).

Identification of new myeloid cell subtypes related with ICB response
To further understand which myeloid cell subtype contributed to ICB response, we further investigated the fractions of infiltrated myeloid cells in GSE126044 dataset by applying CIBERSORTx. We observed IFIT3+N5 and PPARG+M7 were enriched in responders, while LAMP3+DC3 was the only subtype enriched in Non-responders ( Figure   5A). CD1C+DC1 highly expressed CD1C, FCER1A, and CLEC10A, corresponding to conventional cDC2, while CLEC9A+DC2 highly expressed CLEC9A, BATF3 and CADM1, corresponding to conventional cDC1 [35,36] ( Figure 4B).
According to existing knowledge, there is no LAMP3+DC3 counterpart in the classic DC subsets, we thus compared transcript profiles among three DC subtypes. We observed that LAMP3+DC3 highly expressed an "activated" DC signature in line with the previous study [12] and showed a higher migration ability, according a gene signature derived from mouse tissue-migratory cDCs ( Figure 5B) [37]. We then compared the ratio of LAMP3+ DCs/Total DCs in GSE131907 dataset [18]. The result showed LAMP3+DCs were enriched in the lymph nodes in LUAD patients, which further demonstrates its migration ability ( Figure 5C).
Interestingly, GO analysis showed LAMP3+DC3 was associated with negative regulation of the immune system ( Figure S6A), which seems to explain its enrichment in Non-responders.
Altogether, LAMP3+DC3 might play an important role in immunosuppression, especially in T cell dysfunction, albeit more mature [12].
The degree of cytotoxic T cell infiltration (CTL) has been reported to influence ICB effectiveness and been used as a parameter of TIDE [29,38]. We used the average expression of PRF1, GZMA, GZMB, CD8A and CD8B to estimate the CTL levels and examined the correlation between and CTL levels. Interestingly, a significant but moderate negative correlation was found (r=-0.26, P-value=9e-9), suggesting myeloid cells might affect the CTL level ( Figure 5D). It could be considered that the genes with high correlation with CTL level not only exist as a biomarkers but also as a clue of potential useful cell subtype. Notably, the unique marker gene of IFIT3+N5 was enriched in the gene set which highly correlated with CTL level ( Figure 5E). Nine of the top 20 highly correlated genes were unique markers of IFIT3+N5. IFIT3+N5 was found to be involved in the viral defense response, response to INF-gamma, and in the type I interferon signaling pathway ( Figure 5F). KEGG pathway analysis showed IFIT3+N5 was strongly associated with the NOD-like receptor signaling pathway and NF-kappa B signaling pathway ( Figure 5F). We noted that three guanylate-binding family proteins (GBP5, GBP4 and GBP1) in the top 20 correlated genes, had been identified to be linked to the inflammasome activation [39,40], and were also highly correlated with CTL level (0.876, 0.818, and 0.806). Altogether, IFIT3+N5 might play a role in activate the CD8+ cytotoxic cell response in LUAD, which act, in part, through inflammasome activation.

TIMP1+M3 macrophages recruit S100A8+ neutrophils via CXCL5-CXCR2 axes to promote LUAD progression
We used predefined ligand-receptor pairs [25] to examine the interactions between the myeloid cells. In terms of cell communications, we observed that macrophages showed higher proportions than other cell types (51% in Ligand; 45% in Receptor), suggesting macrophages may act as a hub for other myeloid cells ( Figure 6A).
We then focused on the subtypes enriched in either MSC1 or MSC2 types. TIMP1+M3 was the unique subtype enriched in MSC2, suggesting its potential ability to promote tumor progression. CXCL9+M2, the subtype also enriched in tumor tissues, showed an opposite enrichment pattern. By further comparing the RNA expression profiles between two TAM subtypes (i.e. TIMP1+M3 and CXCL9+M2), we observed that CXCL9+M2 exhibited a high expression of C1Q family genes, apolipoprotein family genes and antigen presentation related genes ( Figure 6C). In contrast, TIMP1+M3 showed specific expression of TIMP1, VCAN, and CXCL5 ( Figure 6C). GO analysis revealed a strong enrichment of complement activation, immune response, and antigen processing and presentation pathway in CXCL9+M2, while positive regulation of angiogenesis and neutrophil chemotaxis showed significant enrichment in TIMP1+M3 ( Figure 6D). Meanwhile, we found TIMP1+M3 showed a functional relationship with neutrophil, as indicated by GO annotation. By the cell-cell interaction analysis, TIMP1+M3 was predicted to interact with neutrophils (IFITM2+N2 and S100A8+N3) via CXCL5-CXCR2 axes, suggesting TIMP3+M3 attracted the neutrophil via the chemokine ( Figure 6E). S100A8+N3, which was attracted by TIMP1+M3 via the CXCL5-CXCR2 axes, was identified as a risk factor and showed moderate positive correlation with TIMP3+M3, whilst its correlation with IFITM2+N2 was negative ( Figures   S2 and S7B). We also observed the higher value CXCL5 × CXCR2 was associated with an unfavorable OS, whereas no such association was observed for either CXCL5 or CXCR2 ( Figure 6F). Interestingly, TIMP1+M3 and S100A8+N3 were positively correlated with N stage, suggesting their important role in lymph node metastasis ( Figure S4C).
We also investigated other macrophage subtypes. SELENOP+M4 showed high expressions of CCL4L2, CCL3L3, CCL3, and CLL4. GO analysis revealed leukocyte chemotaxis and positive regulation of cytokine production pathway as enriched in SELENOP+M4 ( Figure S7C). SELENOP+M4 also highly expressed antigen presentation and T cell activation related genes, suggesting it might play an important anti-cancer role and involvement in immune activation ( Figure 6B). PPARG+M7 was identified as resident alveolar macrophage with high expressions of PPARG, FABP4, INHBA and ALDH2. We noted that PPARG+M7 was enriched in ICB responders, suggesting it may also play a key role in regulating anti-tumor immunity as a tissue-specific macrophage.

Discussion
We calculated the infiltration of TIM subtypes in the TCGA LUAD cohort by combining the scRNA-seq and deconvolution algorithm. According to intra-tumor TIM heterogeneity, patients were stratified into two groups, MSC1 and MSC2. We found that MSC subtypes were strongly associated with OS and ICB responses. Specific TIM subtypes showed particular functions in tumor progression. In discussing this work, we focused on the results that have promising applications and those which are closely related to clinical treatment.
We proved the MSC subtypes represent the states which either may or may not respond to ICB therapy in multiple datasets. In particular, the MSC subtype was useful for the estimation of differences in TIMs infiltration states. We validated the effectiveness of MSC subtype by comparing it with the TIDE Score, which mainly considered the function of cytotoxic T cells as predictive of ICB response. The TIDE Score shows significant differences within two MSC subtypes in TCGA and in the two GEO datasets. Since both lymphocytes and myeloid cells have been reported to be related with ICB response [41], we believe that MSC subtypes and TIDE Score reflected different aspects of ICB responses of patients and that MSC subtypes could be jointly used with TIDE score to achieve a better estimations in a clinical context.
In our paper, three TIM subtypes were identified as ICB response related, including LAMP3+DC3, IFIT3+N5, and PPARG+M7. LAMP3+DC3 was enriched in ICB non-responders and was identified as a more mature DC subtype. Compared with other two DC subtypes, CD1C+DC1 and CLEC9A+DC2, the negative regulation of immune system process pathways were enriched in LAMP3+DC3. CCR7 is necessary for the migration of tumor-infiltrating DCs into tumor-draining lymph nodes [42]. LAMP3+DC3 highly expressed CCR7 and showed the strongest migration ability, indicting LAMP3+DC3 might migrate to the lymph node and suppress immune activation. Similar DCs were identified in the single cell study of hepatocellular carcinomas and were described to be related with T cell dysfunction by interacting with T lymphocytes [12]. Zhang et al. also suggested that LAMP3+DCs in tumors might originate from cDC1 and cDC2. Thus, LAMP3+ DCs may not only be a predictive factor of ICB response, but also could be considered as a new target for immunotherapy.
As multiple studies have demonstrated that tumor infiltrating neutrophils are related to cytotoxic T cells in various ways [43,44], we confirmed that our identified IFIT3+N5, a subtype of neutrophil, was enriched in ICB responders and showed positive correlation with CTL. Functional annotation indicated that IFIT3+N5 might activate CD8+ T lymphocytes, partly via inflammasome activation. According to a previous study, IFIT3+N5 corresponds to a group of mature neutrophils which are expanded in virus infected tissues [45]. Considering that neutrophils might be converted into different phenotypes, either anti-tumoral or pro-tumoral, the clarification of how the precursor cells are changed to IFIT3+N5 will be an important consideration for future studies.
It has been reported that CXCR2+ neutrophils are recruited by CXCL5 in tumor tissues to promote tumor progression in liver and non-small cell lung cancers [46,47]. However, most of these studies used cell lines, tissue sections and mouse models, which made it difficult to identify the specific cell subtypes involved. In this article, we clearly identified that TIMP3+M3 recruited S100A8+N3 via CXCL5-CXCR2 axes. TIMP3+M3 and S100A8+N3 were both identified as protumoral cell types and related with lymph node metastasis, suggesting those cells might promote the tumor progression in synergy. CXCR2 and CXCR4 were seen as required when neutrophils egress from the bone marrow and are retained in the lungs [48]. Here, we noticed a repulsive expression pattern between CXCR2 and CXCR4 in the neutrophils ( Figure S6D). Thus, a blockade of CXCR2 might lead to decreasing infiltration of S100A8+N3, which might partly explain the high performance of CXCR2 antagonists [49].

Conclusions
We generated the landscape of myeloid cells in LUAD and stratified the patients into two

Declarations Ethics approval and consent to participate
Not applicable

Consent for publication
Not applicable

List of Supporting Information:
Additional supporting information may be found online in the Supporting Information section at the end of the article.