Dissecting tumor antigens and immune subtypes for mRNA vaccine development in breast cancer

Cancer mRNA vaccines are a promising strategy and a hot topic in cancer immunotherapy. However, mRNA vaccines for breast cancer (BRCA) have not yet been developed. This study aimed to identify potential tumor antigens for mRNA vaccine development and a population of BRCA patients suitable for vaccination. Gene expression profiles and clinical information of the TCGA-BRCA (The Cancer Genome Atlas Breast Cancer) and METABRIC (Molecular Taxonomy of Breast Cancer International Consortium) cohorts were downloaded from the TCGA and cBioPortal databases, respectively. cBioPortal was used to identify mutant genes. DEG (differentially expressed gene) identification and survival analysis were performed with the GEPIA2 tool. ssGSEA (single-sample gene set enrichment analysis) was applied to estimate the abundances of 28 immune cell types in each sample. An unsupervised consensus clustering algorithm was used to identify ISs (immune subtypes). A graph learning-based dimensionality reduction algorithm was used to construct an immune landscape. WGCNA (weighted correlation network analysis) was performed to identify immune gene modules. Four potential tumor antigens, i.e., SLC7A5, CHPF, CCNE1, and CENPW, associated with poor prognosis and APC (antigen-presenting cell) infiltration, were identified among the overexpressed and mutated genes in BRCA. Two ISs (IS1-2) characterized by distinct clinical, immune cell infiltration, and molecular features were observed in both the TCGA-BRCA and METABRIC cohorts. BRCA patients with IS2 tumors, which were related to poor prognosis, had an immune "hot" phenotype, while patients with IS1 tumors, which were related to superior prognosis, had an immune "cold" phenotype. The two ISs also differed in their ICD (immunogenic cell death modulator) and ICP (immune checkpoint) expression profiles. The immune landscape showed the immune distribution of individual BRCA patients. Additionally, we identified 2 immune gene modules with different biological functions.
SLC7A5, CHPF, CCNE1, and CENPW are potential tumor antigens for mRNA vaccine development in BRCA. Patients with IS2 tumors are a suitable population for mRNA vaccination. This study provides new insight into mRNA vaccine development, population selection for vaccination, and prognosis prediction.


Introduction
Breast cancer (BRCA) has become the leading cause of cancer incidence, with more than 2.3 million new cases accounting for 11.7% of global cancer cases in 2020 [1]. Traditional treatments, including surgical resection, chemotherapy, and radiotherapy, have greatly prolonged the survival time of early-stage BRCA patients in recent decades. However, the 5-year survival rate of BRCA patients with distant metastasis remains at 30% (https://seer.cancer.gov/statfacts/html/breast.html). Hence, there is an urgent need for new strategies to improve the treatment of BRCA.
Cancer immunotherapies eliminate cancer by boosting the host's antitumor response and reshaping the tumor microenvironment (TME). Among them, cancer vaccines have attracted much attention from oncologists owing to their preventive potential and safety. Cancer vaccines are mainly classified into four categories according to their antigen form: tumor cell, dendritic cell, DNA, and RNA vaccines [2,3]. Vaccines carry tumor-associated antigens (TAAs) or tumor-specific antigens (TSAs) that can be recognized, processed, and presented by antigen-presenting cells (APCs), which then activate autologous immune cells to induce antitumor effects [4,5]. Following the clinical success of mRNA vaccines during the coronavirus disease 2019 (COVID-19) pandemic, cancer mRNA vaccines have again become a hot topic in cancer therapy. Compared with other types of cancer vaccines, mRNA vaccines have the following major advantages: (1) mRNA vaccines do not provoke insertional mutations because they cannot integrate into the genome [3]. (2) They have a short half-life in vivo because they are degraded by cellular RNases, which implies a favorable safety profile [6]. (3) They can be manufactured cost-effectively and rapidly under a standardized process, which allows a rapid response to public health emergencies [7]. To date, over 50 mRNA vaccines have been tested against blood cancers, melanoma, glioblastoma, and prostate cancer in clinical trials (https://clinicaltrials.gov/). However, an mRNA vaccine against BRCA has not been developed, and the population suitable for vaccination remains unclear.
This study aimed to identify potential tumor antigens for mRNA vaccine development and to construct an immune landscape for selecting a suitable population for vaccination. As shown in Fig. 1, we identified 4 potential tumor antigens, i.e., SLC7A5, CHPF, CCNE1, and CENPW, associated with poor prognosis and APC infiltration among the overexpressed and mutated genes in BRCA. Two immune subtypes (ISs) and two immune gene modules were recognized through the consensus clustering algorithm and weighted correlation network analysis (WGCNA), respectively. Each IS had distinct clinical, immune cell infiltration, and molecular characteristics. We also constructed the immune landscape of BRCA by graph learning-based dimensionality reduction analysis to reveal the immune-related gene distribution in individual patients. Overall, our study revealed 4 potential tumor antigens for mRNA vaccine development and found that BRCA patients with IS2 tumors are suitable for vaccination.

Online tool analyses
To identify tumor antigens and their relationships with OS and RFS, we performed mutation analysis, DEG (differentially expressed gene) analysis, and survival analysis through the online tools cBioPortal (cBio Cancer Genomics Portal, https://www.cbioportal.org/) [8] and GEPIA2 (Gene Expression Profiling Interactive Analysis version 2, http://gepia2.cancer-pku.cn/#index) [9]. Data visualization was also performed using these online tools.

cBioPortal tool
In the present study, the cBioPortal tool was used to visualize genetic alterations and to screen mutant genes. The cBioPortal tool integrates multidimensional cancer genomics datasets from multiple cohorts, including the METABRIC (Molecular Taxonomy of Breast Cancer International Consortium) and TCGA-BRCA (The Cancer Genome Atlas Breast Cancer) cohorts. The tool automatically mapped 3367 tumor samples with mutation information for gene mutation analysis and 1068 tumor samples with fraction genome altered (FGA) information for FGA analysis.

GEPIA2 tool
We used the "differential genes" module of the GEPIA2 tool to identify DEGs. The GEPIA2 tool includes RNA-seq data with survival information for 1085 tumor samples and 112 matched normal samples from the TCGA-BRCA cohort. The "differential genes" module uses analysis of variance (ANOVA) with thresholds of |log2FC| > 1 and q < 0.01 to identify DEGs. Additionally, the "survival analysis" module of GEPIA2 was used to evaluate the prognostic value of the DEGs. This module uses Kaplan-Meier curves with log-rank tests, grouping samples at the median expression of each DEG, to evaluate the relationships between DEGs and overall survival (OS) as well as relapse-free survival (RFS). P < 0.05 was regarded as statistically significant.
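The thresholds and the median cutoff described above can be illustrated with a small filter. This is only a sketch under stated assumptions: GEPIA2 performs these steps internally, and the record layout, gene names, sample names, and values below are hypothetical. Assigning samples exactly at the median to the low group is also an assumption.

```python
from statistics import median

def select_degs(records, lfc_cutoff=1.0, q_cutoff=0.01):
    """Keep genes with |log2FC| > lfc_cutoff and q < q_cutoff."""
    return [r["gene"] for r in records
            if abs(r["log2fc"]) > lfc_cutoff and r["q"] < q_cutoff]

def split_by_median(expression):
    """Label each sample 'high' or 'low' relative to the median expression.

    Samples exactly at the median go to 'low' (an assumption of this sketch).
    """
    cutoff = median(expression.values())
    return {s: ("high" if v > cutoff else "low") for s, v in expression.items()}

# Hypothetical toy records, not data from the study
toy = [
    {"gene": "GENE_A", "log2fc": 2.3, "q": 0.001},   # passes both thresholds
    {"gene": "GENE_B", "log2fc": 0.4, "q": 0.0001},  # fold change too small
    {"gene": "GENE_C", "log2fc": -1.8, "q": 0.005},  # downregulated DEG
    {"gene": "GENE_D", "log2fc": 3.0, "q": 0.2},     # not significant
]
degs = select_degs(toy)
groups = split_by_median({"s1": 1.0, "s2": 5.0, "s3": 3.0, "s4": 7.0})
```

The resulting high and low groups are what a Kaplan-Meier comparison with a log-rank test would then be run on.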

Offline analyses
To identify the population suitable for mRNA vaccination, we used R software (version 4.0.2, https://mirrors.tuna.tsinghua.edu.cn/CRAN/) to perform data analyses and visualization offline using the downloaded data.

Data accession and processing
RNA-seq data and clinical information of BRCA patients in the TCGA-BRCA and METABRIC cohorts were downloaded from the TCGA database (https://portal.gdc.cancer.gov) and the cBioPortal website, respectively [10]. For further analysis, we excluded samples with absent or vague survival information. A total of 1110 TCGA-BRCA and 1904 METABRIC samples were included in this study (the clinical characteristics of the cohorts are shown in Additional file 2: Text). Before the expression matrices were merged, the batch effect was removed with the ComBat function of the sva R package [11] (for more details, see Additional file 2: Fig. S1), and the merged cohort was termed the meta-BRCA cohort. The somatic mutation information of TCGA-BRCA samples detected by WES (whole-exome sequencing) was downloaded from the TCGA database for TMB (tumor mutation burden) calculation. The somatic mutation information had been preprocessed by the TCGA database with the VarScan 2 method.

Immune cell abundance estimation
We used single-sample gene set enrichment analysis (ssGSEA) from the GSVA R package [12] to estimate the relative immune cell abundance of each sample in the meta-BRCA cohort based on its mRNA expression profile and a given immune cell gene set [13]. In this study, we systematically searched the literature and adopted the immune cell gene set proposed by Beibei Ru et al. [14]. This gene set consists of 742 genes representing 28 immune cell types (for more details, see Additional file 1: File S1). The estimate R package was used to evaluate tumor purity [15]. Pearson's correlation analysis was used to examine the relationships between mRNA expression profiles and B-cell infiltration, APC (antigen-presenting cell) infiltration, and tumor purity in the meta-BRCA cohort. P < 0.05 was regarded as statistically significant.

Identification of ISs
First, we downloaded immune-related genes from the Immunology Database and Analysis Portal (ImmPort, https://www.immport.org/shared/genelists). These immune-related genes are involved in various immune processes, such as antigen presentation, cytokine production, and interleukin receptor activation. Using R software, we extracted 1203 immune-related genes from the gene expression profiles of the meta-BRCA cohort (for more details, see Additional file 1: File S2). Based on these 1203 immune-related genes, we performed unsupervised consensus clustering with the ConsensusClusterPlus R package [16] to identify ISs (immune subtypes). A total of 1000 bootstraps with 80% item resampling in each bootstrap were used to ensure classification stability. A consensus heatmap and the relative change in the area under the cumulative distribution function curve were used to determine the optimal number of clusters.
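The resampling idea behind consensus clustering can be sketched as follows. This is a minimal illustration, not the ConsensusClusterPlus implementation: it assumes one-dimensional toy data and uses a tiny two-cluster k-means as a stand-in for the real clustering algorithm, which the text does not specify. The function names and toy values are hypothetical. Each entry of the consensus matrix is the fraction of resamples containing both items in which they were assigned to the same cluster.

```python
import random
from itertools import combinations

def two_means(points, idx, iters=20):
    """Tiny 1-D k-means with k = 2 on the subsampled items (stand-in clusterer)."""
    vals = [points[i] for i in idx]
    centers = [min(vals), max(vals)]
    labels = {}
    for _ in range(iters):
        # Assign each item to the nearer of the two centers
        labels = {i: int(abs(points[i] - centers[1]) < abs(points[i] - centers[0]))
                  for i in idx}
        # Recompute each center as the mean of its members
        for k in (0, 1):
            members = [points[i] for i in idx if labels[i] == k]
            if members:
                centers[k] = sum(members) / len(members)
    return labels

def consensus_matrix(points, reps=1000, frac=0.8, seed=0):
    """Fraction of co-sampled resamples in which two items share a cluster."""
    rng = random.Random(seed)
    n = len(points)
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    for _ in range(reps):
        idx = rng.sample(range(n), max(2, int(frac * n)))  # 80% item resampling
        labels = two_means(points, idx)
        for i, j in combinations(idx, 2):
            sampled[i][j] += 1
            sampled[j][i] += 1
            if labels[i] == labels[j]:
                together[i][j] += 1
                together[j][i] += 1
    return [[together[i][j] / sampled[i][j] if sampled[i][j] else 0.0
             for j in range(n)] for i in range(n)]

# Hypothetical toy data: two well-separated groups of five items each
toy_points = [0.0, 0.1, 0.2, 0.05, 0.15, 5.0, 5.1, 5.2, 5.05, 5.15]
cm = consensus_matrix(toy_points, reps=200)
```

For well-separated groups the matrix approaches 1 within groups and 0 between them. Stable values close to 0 or 1 across candidate cluster numbers are what the consensus heatmap and CDF curves are used to assess.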

Somatic genetic variation analysis and CYT score calculation
To determine the TMB, the maftools R package [17] was used to count the total number of nonsynonymous mutations in the TCGA-BRCA cohort, and the total number of nonsynonymous mutations in the METABRIC cohort was downloaded directly from the cBioPortal website. After merging the TMB information of the 2 cohorts, we used the Wilcoxon test to compare the mutation count and TMB between the ISs, and P < 0.05 was considered statistically significant. We used the oncoplot function of the maftools R package to visualize the top 10 most frequently mutated genes in each IS. This step was performed using only the TCGA-BRCA cohort because detailed single-gene mutation information was lacking on the cBioPortal website. To measure the magnitude of the antitumor response in each IS, a cytolytic activity (CYT) score was calculated as the geometric mean of the PRF1 and GZMA mRNA expression levels in the meta-BRCA cohort [18]. The data were compared by Student's t test, and P < 0.05 was considered statistically significant.
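The two per-sample quantities described above are simple to compute. The sketch below is illustrative only: the variant classification labels are assumed to match the default nonsynonymous set used by maftools and should be checked against the installed version, and the toy inputs are hypothetical.

```python
import math

# Assumption: labels mirroring maftools' default nonsynonymous variant classes
NONSYNONYMOUS = {
    "Frame_Shift_Del", "Frame_Shift_Ins", "Splice_Site",
    "Translation_Start_Site", "Nonsense_Mutation", "Nonstop_Mutation",
    "In_Frame_Del", "In_Frame_Ins", "Missense_Mutation",
}

def nonsynonymous_count(variant_classes):
    """Count the variants of one sample that fall in a nonsynonymous class."""
    return sum(1 for v in variant_classes if v in NONSYNONYMOUS)

def cyt_score(prf1, gzma):
    """Cytolytic activity: geometric mean of PRF1 and GZMA expression."""
    return math.sqrt(prf1 * gzma)

# Hypothetical toy inputs for a single sample
toy_variants = ["Missense_Mutation", "Silent", "Nonsense_Mutation", "Intron"]
n_mut = nonsynonymous_count(toy_variants)
cyt = cyt_score(4.0, 9.0)
```

The geometric mean keeps the score low unless both genes are expressed, which is why it is preferred over the arithmetic mean for a two-gene signature.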

Construction of the BRCA immune landscape
To further uncover the IS distribution among individual patients, we performed graph learning-based dimensionality reduction analysis through discriminative dimensionality reduction with tree (DDRTree) based on the 1203 immune-related gene expression profiles in the meta-BRCA cohort. The plot_cell_trajectory function of the monocle R package [19] was used to visualize the immune landscape. We also extended this analysis to reveal the intrinsic distribution within each IS in individual patients. Similarly, the DDRTree and plot_cell_trajectory functions were used to perform dimensionality reduction and to visualize the immune landscape, respectively.

Construction and GO functional annotation of IS-associated gene modules
To recognize IS-associated gene modules, we performed WGCNA based on the 1203 immune-related gene expression profiles of the meta-BRCA cohort via the WGCNA R package [20]. We first converted the expression matrix to an adjacency matrix and then to a topological overlap matrix (TOM). We used a stepwise method with a minimum of 30 genes per module, following the standard dynamic tree-cut procedure, to construct a weighted coexpression network. The soft threshold power was set to 2 using the scale-free topology criterion to build the weighted adjacency matrix. The coexpression modules were recognized by a bottom-up algorithm with the dynamic tree-cut method. Close modules were merged with a height cutoff of 0.25, deep split = 2, and a minimum module size of 30. Module eigengenes (MEs) were calculated to quantify modular similarity. Moreover, Gene Ontology (GO) functional annotation, including biological process (BP), molecular function (MF), and cellular component (CC) analyses, was used to explore the functions of the gene modules via the clusterProfiler R package [21]. A Benjamini-Hochberg (BH) adjusted P < 0.05 was regarded as statistically significant.
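The soft-thresholding step can be written out explicitly. The sketch below assumes an unsigned network, in which the adjacency between genes i and j is |cor(x_i, x_j)| raised to the soft threshold power; the text does not state the network type, so this is an assumption. The gene names and expression values are hypothetical, and the code only illustrates the transformation rather than reproducing the WGCNA package.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def soft_threshold_adjacency(profiles, power=2):
    """Unsigned weighted adjacency: |correlation| raised to the given power."""
    genes = list(profiles)
    return {(g, h): 1.0 if g == h
            else abs(pearson(profiles[g], profiles[h])) ** power
            for g in genes for h in genes}

# Hypothetical expression profiles across three samples
toy_profiles = {
    "G1": [1.0, 2.0, 3.0],
    "G2": [2.0, 4.0, 6.0],  # perfectly correlated with G1
    "G3": [1.0, 3.0, 2.0],  # correlation of 0.5 with G1
}
adj = soft_threshold_adjacency(toy_profiles, power=2)
```

Raising the correlation to a power suppresses weak correlations relative to strong ones (here 0.5 becomes 0.25 while 1.0 stays 1.0), which is how the soft threshold pushes the network toward a scale-free topology.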

Statistical analysis
In the present study, P < 0.05 was considered statistically significant. A log-rank test was used to compare OS and RFS between specific groups with the survminer R package [22]. The chi-square test was used to examine categorical data. The Wilcoxon test was used to compare nonnormally distributed data. Normally distributed data were compared by Student's t test. Pearson's correlation analysis was performed to assess the relationship between two continuous variables.

Identification of potential tumor antigens in BRCA
To identify potential antigens of BRCA, we first performed a DEG analysis between tumor and normal tissues. A total of 1418 upregulated genes among 3556 DEGs were identified (for more details, see Additional file 1: File S3). Figure 2a shows the chromosomal distribution of those DEGs. Second, the mutational analysis screened 16,494 mutated genes that potentially encode TAAs (for more details, see Additional file 1: File S4). Figure 2b and c indicate that most BRCA patients had low mutation counts (number of mutational events per case) and low FGA (% of copy number altered chromosome regions out of measured regions per case), suggesting low immunogenicity for BRCA. Figure 2d and e show the top 10 most frequently mutated genes in terms of mutation counts and altered genome fractions, respectively. ACR, ADM2, CHKB, CHKB-CPT1B, CHKB-DT, CPT1B, DENND6B, DNAJB6, DNAJB6-AS1, and HDAC10 were the top 10 most frequently mutated genes in terms of mutation count (Fig. 2d). TP53, SCFD2, ARAP3, PCDHB12, HK3, OC90, PIK3CA, LINC02584, LPO, and ANO1 were the top 10 most frequently mutated genes in the altered genome fraction analysis (Fig. 2e). In total, we identified 993 upregulated and frequently mutated tumor-associated genes (for more details, see Additional file 1: File S5).

Relationships between ISs and TMB, CYT score, and potential tumor antigens
Higher TMB and CYT scores are related to stronger anticancer immunity [18]. Therefore, we compared TMB and CYT scores between the ISs in the meta-BRCA cohort. As depicted in Fig. 6a, b, both the mutation counts (Wilcoxon test; P < 0.0001, Fig. 6a) and the TMB (Wilcoxon test; P < 0.0001, Fig. 6b) of patients with IS2 tumors were markedly higher than those of patients with IS1 tumors. Interestingly, BRCA patients with IS2 tumors also had significantly higher CYT scores than patients with IS1 tumors (Student's t test; P < 0.0001, Fig. 6c). Figure 6d shows the top 10 most frequently mutated genes in each IS. In addition, the mRNA expression levels of the 4 potential tumor antigens were higher in patients with IS2 tumors than in patients with IS1 tumors (Student's t test; Fig. 6e). These findings indicate that the ISs we identified can be used to predict the mutation status and cytolytic activity of BRCA.

Immune cellular characteristics of ISs
Tumor immune cell infiltration status largely determines the response to mRNA vaccines. Therefore, we further characterized the immune cell components of the IS1-2 tumors by estimating the relative abundances of 28 immune cell types in both the TCGA-BRCA and METABRIC cohorts. Figure 8a depicts the overall landscape of the relative abundances of the 28 immune cell types in IS1-2 tumors in both cohorts. IS2 tumors showed greater immune cell infiltration than IS1 tumors in both cohorts. In the TCGA-BRCA cohort, almost all of the 28 immune cell types, except for CD56 bright natural killer cells, eosinophils, mast cells, memory B cells, and plasmacytoid dendritic cells, were significantly differentially infiltrated between the ISs. Among these 28 immune cell types, only central memory CD8 cells infiltrated the IS2 tumors more sparsely than the IS1 tumors (Wilcoxon test; Fig. 8b). In the METABRIC cohort, only eosinophils, mast cells, memory B cells, and plasmacytoid dendritic cells were not significantly differentially infiltrated between the ISs (Wilcoxon test; Fig. 8c). In line with the TCGA-BRCA cohort results, only CD56 bright natural killer cells and central memory CD8 cells infiltrated the IS2 tumors more sparsely than the IS1 tumors. Overall, the IS2 tumors were infiltrated by more immune cells, characterizing an immune "hot" phenotype. In contrast, IS1 tumors were sparsely infiltrated by immune cells, indicating an immune "cold" phenotype.

The immune landscape of BRCA
To further visualize the immune distribution of individual patients, we constructed an immune landscape of BRCA through graph learning-based dimensionality reduction analysis. Patients with IS1 tumors and IS2 tumors were distributed in opposite directions on the immune landscape (Fig. 9a). We also observed that PC (principal component) 1 correlated negatively with all immune cells, especially type 1 T helper cells, T follicular helper cells, and natural killer T cells, but that PC2 was negatively linked to only 9 immune cell types, including type 17 T helper cells, neutrophils, monocytes, immature B cells, activated dendritic cells, CD56 dim natural killer cells, activated CD8 T cells, activated CD4 cells, and activated B cells. In addition, PC2 was positively related to type 1 T helper cells, T follicular helper cells, regulatory T cells, plasmacytoid dendritic cells, natural killer cells, memory B cells, mast cells, macrophages, immature dendritic cells, eosinophils, effector memory CD8 T cells, central memory CD4 T cells, central memory CD8 T cells, and CD56 bright natural killer cells (Pearson's correlation analysis; Fig. 9b). Furthermore, we found intracluster heterogeneity of the ISs in the immune landscape. We further divided IS1 into IS1A-B and IS2 into IS2A-B according to the patients' locations on the immune landscape (Fig. 9c). These intraclusters also exhibited distinct immune cell infiltration characteristics. Almost all immune cells, except for CD56 dim natural killer cells, showed significantly different abundances between IS1A and IS1B tumors. Of note, the IS1A tumors were relatively sparsely infiltrated by immune cells, except for monocytes and neutrophils (Wilcoxon test; Fig. 9d). Similarly, most immune cells, except for CD56 bright natural killer cells and central memory CD8 T cells, were significantly differentially infiltrated between the IS2A and IS2B tumors. All immune cells infiltrated the IS2B tumors relatively sparsely compared with the IS2A tumors (Wilcoxon test; Fig. 9e). Moreover, we observed prolonged survival times in Cluster 3 patients compared with Cluster 1 and Cluster 2 patients in the immune landscape (log-rank test; P = 0.047, Fig. 9f, g). These findings suggest that the location of patients in the immune landscape can predict their prognosis. Overall, the immune landscape not only reflects the immune cell infiltration status of BRCA patients but can also predict their prognosis.

Construction of IS-associated gene modules of BRCA
IS-associated gene modules were constructed by the WGCNA algorithm with a soft threshold of 2 for a scale-free network (Fig. 10a, b). We obtained 2 coexpression modules with a total of 458 genes, of which 132 clustered into the blue module and 326 into the turquoise module. The remaining 745 genes were not clustered into any module and, following the convention of the WGCNA R package, were marked in gray; they are not displayed in Fig. 10c (for more details, see Additional file 1: File S6). We also analyzed the distribution of the module eigengenes of those 2 modules across the ISs. The module eigengenes of the IS1 tumors were significantly higher than those of the IS2 tumors in the blue module, while the trend was opposite in the turquoise module (Wilcoxon test; P < 0.0001, Fig. 10d). Thus, we hypothesized that these 2 modules may be related to distinct biological functions. To test this hypothesis, we performed GO functional annotation for these 2 modules. We found that the blue module was more related to signal transduction, for example, positive regulation of phosphatidylinositol 3-kinase signaling in the GO BP analysis and transforming growth factor beta-activated receptor activity in the GO MF analysis (Fig. 11a). The turquoise module was related to immune-related functions, such as positive regulation of T-cell proliferation in the GO BP analysis and MHC class II protein complex in the GO CC analysis (Fig. 11b). Moreover, the turquoise module (Pearson's correlation analysis; correlation coefficient = −0.92, P = 2.2e−16, Fig. 11d) was more closely related to the immune landscape than the blue module (Pearson's correlation analysis; correlation coefficient = −0.34, P = 2.2e−16, Fig. 11c). In addition, prognostic analyses of those 2 modules revealed that only the turquoise module was related to OS (log-rank test; blue module: P = 0.22, Fig. 11e; turquoise module: P = 0.011, Fig. 11f). The therapeutic potential of mRNA vaccines in BRCA patients was largely determined by the immune functions of highly expressed genes. Therefore, an mRNA vaccine might be suitable for BRCA

Discussion
BRCA is one of the most commonly diagnosed malignancies, with a high level of biological and prognostic heterogeneity. Although various treatments can prolong survival time, the prognoses of patients with relapse or metastasis remain unsatisfactory. Proteins encoded by mutated genes differ from wild-type proteins and can act as tumor antigens that are recognized by immune cells, leading to the removal of cancer cells [25]. In general, however, the immune response can lead to reduction or loss of those tumor antigens and induce immune escape [26]. Moreover, the immunogenicity of those tumor antigens can be reduced by coating substances, such as salivary mucopolysaccharides [27]. Therefore, amplifying tumor antigens with exogenous mRNA vaccines to reactivate the host antitumor response is a good strategy.
In this study, we identified 4 potential tumor antigens, i.e., SLC7A5, CHPF, CCNE1, and CENPW, for mRNA vaccine development in BRCA. They were not only negatively related to survival time and tumor purity but also positively related to APC and B-cell infiltration. Although no experiments have directly shown that these molecules can serve as tumor antigens in vitro or in vivo, an increasing number of studies have reported their vital roles in BRCA progression, prognosis, and treatment. For example, SLC7A5 is an amino acid transporter for the cellular uptake of leucine, which is critical for metabolic activation and cellular functions [28]. Engineering chimeric antigen receptor (CAR)-T cells to overexpress SLC7A5 can enhance CAR-T-cell recognition of tumor cells [29]. CHPF promotes breast carcinoma cell proliferation, invasion, and migration via upstream TGF-β1/SMAD3 and JNK axis activation [30]. CHPF can also promote malignancy in BRCA by reshaping the TME [31]. CCNE1 is an important cyclin protein that is related to worse clinical outcomes in BRCA patients [32]. Targeting CCNE1, a cyclin-dependent kinase (CDK) 4/6 inhibitor (palbociclib) was developed to block cell cycle progression and reduce tumor cell proliferation in BRCA [33]. CENPW is associated with worse prognosis in BRCA, and knocking down CENPW can inhibit the proliferation and migration of BRCA cells [34]. Liu et al. [35] also identified three potential tumor antigens, i.e., CD74, IRF1, and PSME2, for mRNA vaccine development in BRCA. In contrast to our findings, all of those tumor antigens were related to better prognosis. The immune response triggered by those tumor antigens may downregulate them to their original levels and in turn shorten the survival time of BRCA patients, which is the opposite of the aim of vaccination. Therefore, the 4 poor-prognosis potential tumor antigens we identified can provide a credible guide to mRNA vaccine development.
Since only a limited portion of BRCA patients with specific ISs can benefit from mRNA vaccines [36,37], we divided BRCA patients into 2 ISs via an unsupervised consensus clustering algorithm based on immune-related genes to identify the appropriate population for vaccination. BRCA patients with IS1-2 tumors in both the TCGA-BRCA and METABRIC cohorts had different prognostic, clinical, molecular, and immune cell infiltration characteristics, which indicates a different clinical response to an mRNA vaccine. In the present study, BRCA patients with IS1 tumors characterized by sparse immune cell infiltration had better prognosis than those with IS2 tumors characterized by abundant immune cell infiltration, in line with our previous study [38]. Therefore, the dominance of the immune-suppressive environment or the stimulatory environment is the decisive factor for prognosis [39]. IS2 tumors also had higher TMB and mutation numbers, which implies greater heterogeneity and easier recognition by immune cells. IS2 tumors were characterized by more infrequent TP53 and less infrequent PIK3CA mutations than IS1 tumors. BRCA patients with mutant TP53 and PIK3CA have poor prognosis, consistent with our study [40][41][42]. Additionally, a higher CYT score for the IS2 tumors implies that more tumor cells were attacked by immune cells. IS2 tumors also displayed higher expression of ICPs and ICDs than IS1 tumors in both the TCGA-BRCA and METABRIC cohorts, which suggests that patients with IS2 tumors will benefit more from both immune checkpoint inhibitor and mRNA vaccine treatments. Therefore, we propose that BRCA patients with IS2 tumors are more suitable for mRNA vaccination than those with IS1 tumors. Patients with IS2 tumors who receive treatments combining immune checkpoint inhibitors with mRNA vaccination may obtain better curative effects.
To date, many studies have identified tumor categories based on various aspects of tumor biology. BRCA is commonly classified into three main subtypes according to the status of estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2), which guides the choice of medical treatments, including endocrine and anti-HER2 therapies [43]. However, those categories do not guide mRNA vaccination since they cannot represent the complex TME in BRCA. Therefore, we did not investigate the distribution of ISs in those categories. Thorsson et al. divided 33 cancer types into six ISs (C1-6) based on the TME and prognosis [44]. We found that the major C1 (wound healing) and C2 (IFN-γ dominant) categories, characterized by abundant immune cell infiltration and favorable prognoses, were clustered as IS2 tumors but that the C4 (lymphocyte depleted) and C6 (TGF-β dominant) categories, with sparse immune cell infiltration and worse prognoses, were clustered as IS1 tumors. Contrary to their prognostic roles, BRCA patients with IS1 tumors tended to have better prognosis than patients with IS2 tumors. Therefore, our division of the TME differs from previous classifications and provides a useful complement to them. To further reveal the intracluster heterogeneity of ISs, we performed graph learning-based dimensionality reduction of immune-related gene profiles according to previous studies [45,46]. We observed that patients distributed in three different directions on the graph had different prognoses. Comparison of intraclusters revealed that almost all immune cells infiltrated IS1B and IS2A more abundantly than IS1A and IS2B, which indicates that BRCA patients in distinct intraclusters of ISs who receive mRNA vaccine treatment may still have different responses and clinical outcomes. Thus, a more precise methodology for defining the TME is still needed in future studies.
Our study also has several limitations. First, we identified the potential tumor antigens based only on RNA-seq data without validating their response, because no clinical data for mRNA vaccine use in BRCA have been published thus far. Second, applying the ISs in clinical practice is difficult since the gene expression profile of each patient is required. Further in vivo and in vitro studies are needed in this field.

Conclusions
In conclusion, SLC7A5, CHPF, CCNE1, and CENPW are potential tumor antigens for mRNA vaccine development. BRCA patients with IS2 tumors may benefit from vaccination. Our study provides a foundation for mRNA vaccine development, population selection for vaccination, and prognosis prediction.

Fig. 1 Workflow of this study

Fig. 2 Identification of potential tumor antigens in BRCA. a Chromosomal distribution of differentially expressed genes in BRCA. Sample counts in mutation count (number of mutational events per case) groups (b) and genome fraction altered (% of copy number altered chromosome regions out of measured regions per case) groups (c). Genes with the top 10 highest frequencies in mutation count groups (d) and in altered genome fraction groups (e). The colors of the bars represent the group terms, and the heights of the bars represent the frequency of each group

Fig. S2c), ZMYND10 (zinc finger MYND domain-containing protein 10; log-rank test; P = 0.049, Additional file 2: Fig. S2d), RSPH1 (radial spoke head 1 homolog; log-rank test; P = 0.021, Additional file 2: Fig. S2e), and TMEM119B (transmembrane protein 119; log-rank test; P = 0.012, Additional file 2: Fig. S2f). In the present study, proteins encoded by the 4 poor-prognosis genes above were considered to be potential tumor antigens for mRNA vaccine development. Furthermore, the expression profiles of CENPW, SLC7A5, CHPF, and CCNE1 were positively associated with the abundance of activated B cells and APCs (activated dendritic cells and macrophages) but negatively associated with tumor purity (Pearson's correlation analysis; Fig. 4a-d). These findings indicate that these 4 potential tumor antigens (for more details, see Additional file 2: Table S1) may be directly processed and presented by APCs to T cells and recognized by B cells to trigger an antitumor response, reducing the tumor purity and prolonging the survival time of BRCA patients. Hence, these 4 potential tumor antigens are promising candidates for developing mRNA vaccines against BRCA.

Fig. 3 Identification of prognosis-related tumor antigens in BRCA. a Of 993 TAAs in total, 10 antigens were related to OS and RFS. Kaplan-Meier curves showing the relationships between OS and b SLC7A5, c CHPF, d CCNE1, and e CENPW expression

Fig. 4 Identification of APCs and tumor purity-associated tumor antigens in BRCA. Correlations between a CENPW, b SLC7A5, c CHPF, and d CCNE1 expression and activated B cells, activated dendritic cells, macrophage infiltration, and tumor purity

Fig. 5 Identification of ISs in BRCA. a Tracking plot showing the stability of classification with subtype numbers. b Scree plot showing the relative change in the area under the cumulative distribution function (CDF) curve with subtype numbers. c CDF plot showing the CDF with different subtype numbers. d Sample clustering heatmap with 2 divisions of the meta-BRCA cohort. e Kaplan-Meier curve with log-rank test showing OS of ISs in the meta-BRCA cohort. f Kaplan-Meier curve with log-rank test showing RFS of ISs in the METABRIC cohort. g Association of ISs with pathological M, T, N, S stage in the TCGA-BRCA cohort. NA not applicable, * p < 0.05, ** p < 0.01

Fig. 6 Associations between ISs and the tumor mutation count, TMB, and CYT score. a The mutation number in IS1-2 BRCA. b The TMB in IS1-2 BRCA. c The CYT score in IS1-2 BRCA. d Top 10 most frequently mutated genes in IS1-2 BRCA. e Relationships between ISs and expression of the 4 potential tumor antigens

Fig. 9 Construction of the immune cell-infiltrating landscape of BRCA. a The immune cell-infiltrating landscape of BRCA. Each point represents a sample, and the specific colors represent specific ISs. The X-axis represents principal component 1 (PC1), and the Y-axis represents principal component 2 (PC2). b Association between 28 immune cell densities and PC1 and PC2. c Immune cell-infiltrating subset landscape of BRCA. d Comparisons of 28 immune cell densities between IS1A and IS1B. e Comparisons of 28 immune cell densities between IS2A and IS2B. f Immune landscape of samples from three extreme locations and g their prognoses. ns not significant, * p < 0.05, ** p < 0.01, **** p < 0.0001

Fig. 10 Construction of immune gene coexpression modules of BRCA. a The scale-free fit index for soft threshold powers. b The mean connectivity for soft threshold powers. c A dendrogram of the 1203 immune genes clustered based on a dissimilarity measure (1-TOM). d Comparisons of the feature vectors of each module between ISs in BRCA

Fig. 11 Identification of immune hub genes with immune gene modules in BRCA. a Dot plot showing GO functional annotations of the blue module. b Dot plot showing GO functional annotations of the turquoise module. c Correlation between the blue module feature vector and PC1 in the immune landscape. d Correlation between the turquoise module feature vector and PC1 in the immune landscape. e Kaplan-Meier curve showing the prognostic value of the blue module feature vector. f Kaplan-Meier curve showing the prognostic value of the turquoise module feature vector

In the present study, we selected the Breast Cancer (METABRIC, Nature 2012 & Nat Commun 2016; 2509 tumor samples with 548 matched normal samples) and Breast Invasive Carcinoma (TCGA, PanCancer Atlas; 1084 tumor samples) datasets in cBioPortal for analyses.