Dissecting tumor antigens and immune subtypes for mRNA vaccine development in breast cancer
Journal of Big Data volume 10, Article number: 149 (2023)
Cancer mRNA vaccines are a promising strategy and a hot topic in cancer immunotherapy. However, mRNA vaccines for breast cancer (BRCA) remain undeveloped. This study aimed to identify potential tumor antigens for mRNA vaccine development and a population with BRCA suitable for vaccination.
Gene expression profiles and the clinical information of the TCGA-BRCA (the Cancer Genome Atlas Breast Cancer) and METABRIC (Molecular Taxonomy of Breast Cancer International Consortium) cohorts were downloaded from the TCGA and cBioPortal databases, respectively. cBioPortal was used to identify mutant genes. DEG (differentially expressed gene) identification and survival analysis were performed with the GEPIA2 tool. ssGSEA (single-sample gene set enrichment analysis) was applied to estimate abundances of 28 immune cells for each sample. An unsupervised consensus clustering algorithm was used to identify ISs (immune subtypes). A graph learning-based dimensionality reduction analysis algorithm was utilized to construct an immune landscape. WGCNA (weighted correlation network analysis) was performed to identify immune gene modules.
Among the overexpressed and mutated genes in BRCA, four potential tumor antigens, i.e., SLC7A5, CHPF, CCNE1, and CENPW, were identified that were associated with poor prognosis and with APC (antigen-presenting cell) infiltration. Two ISs (IS1-2) characterized by distinct clinical, immune cell infiltration, and molecular features were observed in both the TCGA-BRCA and METABRIC cohorts. BRCA patients with IS2 tumors, which were related to poor prognosis, had an immune "hot" phenotype, while patients with IS1 tumors, which were related to superior prognosis, had an immune "cold" phenotype. The two ISs also differed in their ICD (immunogenic cell death modulator) and ICP (immune checkpoint) expression profiles. The immune landscape showed the immune distribution across BRCA patients. Additionally, we identified 2 immune gene modules with different biological functions.
SLC7A5, CHPF, CCNE1, and CENPW are potential tumor antigens for mRNA vaccine development in BRCA. Patients with IS2 tumors are a suitable population for mRNA vaccination. This study provides new insight into mRNA vaccine development, population selection for vaccination, and prognosis prediction.
Breast cancer (BRCA) has become the leading cause of cancer incidence, with more than 2.3 million new cases accounting for 11.7% of global cancer cases in 2020. Traditional treatments, including surgical resection, chemotherapy, and radiotherapy, have greatly prolonged the survival time of early-stage BRCA patients in recent decades. However, the 5-year survival rate of BRCA patients with distant metastasis remains at 30% (https://seer.cancer.gov/statfacts/html/breast.html). Hence, there is an urgent need for new strategies to improve therapeutic outcomes in BRCA.
Cancer immunotherapies eliminate cancer by boosting the host's antitumor response and reshaping the tumor microenvironment (TME). Among them, cancer vaccines have attracted much attention from oncologists owing to their preventive potential and safety. According to their antigen form, cancer vaccines are mainly classified into four categories: tumor cell, dendritic cell, DNA, and RNA vaccines [2, 3]. Vaccines carry tumor-associated antigens (TAAs) or tumor-specific antigens (TSAs) that are recognized, processed, and presented by antigen-presenting cells (APCs), which then activate autologous immune cells to induce antitumor effects [4, 5]. Following the clinical success of mRNA vaccines against coronavirus disease 2019 (COVID-19), cancer mRNA vaccines have again become a hot topic in cancer therapy. Compared with other types of cancer vaccines, mRNA vaccines have the following major advantages: (1) They do not provoke insertional mutations because they cannot integrate into the genome. (2) They have a short half-life in vivo since they can be degraded by cellular RNases, which means a favorable safety profile. (3) They can be manufactured cost-effectively and rapidly under a standardized process, which implies strong responsiveness to public health emergencies. To date, over 50 mRNA vaccines have been tested against blood cancers, melanoma, glioblastoma, and prostate cancer in clinical trials (https://clinicaltrials.gov/). However, an mRNA vaccine against BRCA has not been developed, and the population suitable for vaccination remains unclear.
This study aimed to identify potential tumor antigens for mRNA vaccine development and to construct an immune landscape for selecting a suitable population for vaccination. As shown in Fig. 1, we identified 4 potential tumor antigens, i.e., SLC7A5, CHPF, CCNE1, and CENPW, associated with poor prognosis and APCs among overexpressed and mutated genes in BRCA. Two immune subtypes (ISs) and two immune gene modules were recognized through the consensus clustering algorithm and weighted correlation network analysis (WGCNA), respectively. Each IS had distinct clinical, immune cell infiltration, and molecular characteristics. We also constructed the immune landscape for BRCA by graph learning-based dimensionality reduction analysis to reveal the immune-related gene distribution in individual patients. Overall, our study revealed 4 potential tumor antigens for mRNA vaccine development and found that BRCA patients with IS2 tumors are suitable for vaccination.
Materials and methods
Online tool analyses
To identify tumor antigens and their relationships with overall survival (OS) and relapse-free survival (RFS), we performed mutation analysis, DEG (differentially expressed gene) analysis, and survival analysis with two online tools: cBioPortal (cBio Cancer Genomics Portal, https://www.cbioportal.org/) for mutation analysis and GEPIA2 (Gene Expression Profiling Interactive Analysis version 2, http://gepia2.cancer-pku.cn/#index) for DEG and survival analyses. Data visualization was also performed using those online tools.
In the present study, the cBioPortal tool was used to visualize genetic alterations and to screen mutant genes. It integrates multidimensional cancer genomics datasets from multiple cohorts, including the METABRIC (Molecular Taxonomy of Breast Cancer International Consortium) and TCGA-BRCA (the Cancer Genome Atlas Breast Cancer) cohorts. We selected the Breast Cancer (METABRIC, Nature 2012 & Nat Commun 2016; 2509 tumor samples with 548 matched normal samples) and Breast Invasive Carcinoma (TCGA, PanCancer Atlas; 1084 tumor samples) datasets in cBioPortal for analysis. The cBioPortal tool automatically mapped 3367 tumor samples with mutation information for gene mutation analysis and 1068 tumor samples with fraction genome altered (FGA) information for FGA analysis.
We used the "differential genes" module of the GEPIA2 tool to identify DEGs. The GEPIA2 tool included RNA-seq data with survival information for 1085 tumor samples and 112 matched normal samples from the TCGA-BRCA cohort. The "differential genes" module applies analysis of variance (ANOVA); DEGs were defined by |log2FC| > 1 and q < 0.01. Additionally, the "survival analysis" module of GEPIA2 was used to evaluate the prognostic value of the DEGs. This module uses Kaplan-Meier curves with log-rank tests, with patients split at the median expression of each DEG, to evaluate the relationships between DEGs and overall survival (OS) as well as relapse-free survival (RFS). P < 0.05 was regarded as statistically significant.
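The DEG thresholds described above can be illustrated with a short filter. This is a minimal sketch in Python rather than the GEPIA2 implementation; the GeneStat record, the select_degs helper, and the example values are hypothetical and only show how |log2FC| > 1 and q < 0.01 partition genes into upregulated and downregulated sets.

```python
from typing import NamedTuple


class GeneStat(NamedTuple):
    """Hypothetical per-gene summary (not a GEPIA2 data structure)."""
    gene: str
    log2fc: float  # log2 fold change, tumor versus normal
    q: float       # multiple-testing-adjusted P value


def select_degs(stats, lfc_cutoff=1.0, q_cutoff=0.01):
    """Return (upregulated, downregulated) gene names passing both cutoffs."""
    up, down = [], []
    for s in stats:
        if abs(s.log2fc) > lfc_cutoff and s.q < q_cutoff:
            (up if s.log2fc > 0 else down).append(s.gene)
    return up, down


# Made-up values for illustration only.
example = [
    GeneStat("GENE_A", 2.3, 1e-5),   # passes both cutoffs, upregulated
    GeneStat("GENE_B", -1.8, 2e-4),  # passes both cutoffs, downregulated
    GeneStat("GENE_C", 0.4, 1e-6),   # fold change too small
    GeneStat("GENE_D", 1.5, 0.2),    # q value too large
]
up, down = select_degs(example)
print(up, down)
```

Only the upregulated list would feed the antigen screen described in this study, since candidate antigens were sought among overexpressed genes.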
To identify the population suitable for mRNA vaccination, we used the R software (version 4.0.2, https://mirrors.tuna.tsinghua.edu.cn/CRAN/) to perform data analyses and visualization using the downloaded data offline.
Data accession and processing
RNA-seq data and clinical information of BRCA patients in the TCGA-BRCA cohort and METABRIC cohort were downloaded from the TCGA database (https://portal.gdc.cancer.gov) and the cBioPortal website, respectively. For further analysis, we excluded samples with absent or ambiguous survival information. A total of 1110 TCGA-BRCA and 1904 METABRIC samples were included in this study (the clinical characteristics of the cohorts are shown in Additional file 2: Text). Before the expression matrices were merged, the batch effect was removed with the ComBat function of the sva R package (for more details, see Additional file 2: Fig. S1), and the merged cohort was termed the meta-BRCA cohort. The somatic mutation information of TCGA-BRCA samples detected by WES (whole-exome sequencing) was downloaded from the TCGA database for TMB (tumor mutation burden) calculation. The somatic mutation information had been preprocessed by the TCGA database with the VarScan 2 method.
Immune cell abundance estimation
We used single-sample gene set enrichment analysis (ssGSEA) in the GSVA R package to estimate the relative immune cell abundance of each sample in the meta-BRCA cohort based on its mRNA expression profile and a given immune cell gene set. In this study, we systematically reviewed the literature and adopted an immune cell gene set proposed by Beibei Ru et al. This gene set consists of 742 genes representing 28 immune cells (for more details, see Additional file 1: File S1). The estimate R package was used to evaluate tumor purity. Pearson's correlation analysis was used to examine the relationships between mRNA expression profiles and B-cell infiltration, APC (antigen-presenting cell) infiltration, and tumor purity in the meta-BRCA cohort. P < 0.05 was regarded as statistically significant.
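To make the idea of a single-sample enrichment score concrete, the sketch below computes a simplified rank-weighted running-sum statistic for one sample and one gene set. It is an illustration in Python under stated assumptions, not the GSVA code used in the study: the function name ssgsea_score, the weighting exponent alpha = 0.25, the absence of tie handling and score normalization, and the toy expression values are all assumptions.

```python
def ssgsea_score(expression, gene_set, alpha=0.25):
    """Simplified single-sample enrichment score for one sample.

    expression: dict mapping gene name -> expression value in this sample
    gene_set:   set of gene names (for example, markers of one immune cell type)
    """
    # Order genes from highest to lowest expression; the top gene gets rank N.
    ordered = sorted(expression, key=expression.get, reverse=True)
    n = len(ordered)
    rank = {g: n - i for i, g in enumerate(ordered)}
    hits = [g in gene_set for g in ordered]
    n_in = sum(hits)
    if n_in == 0 or n_in == n:
        raise ValueError("gene set must overlap the profile without covering it")

    total_weight = sum(rank[g] ** alpha for g, hit in zip(ordered, hits) if hit)
    p_in = p_out = score = 0.0
    for g, hit in zip(ordered, hits):
        if hit:
            p_in += rank[g] ** alpha / total_weight  # weighted step for set genes
        else:
            p_out += 1.0 / (n - n_in)                # uniform step for other genes
        score += p_in - p_out                        # accumulate the ECDF difference
    return score


# Toy profile: g1 and g2 are highly expressed, g5 and g6 are barely expressed.
profile = {"g1": 10.0, "g2": 9.0, "g3": 8.0, "g4": 2.0, "g5": 1.0, "g6": 0.5}
print(ssgsea_score(profile, {"g1", "g2"}))  # positive: set genes rank near the top
print(ssgsea_score(profile, {"g5", "g6"}))  # negative: set genes rank near the bottom
```

A gene set whose members sit near the top of the sample's ranking yields a positive score, which is why such scores are read as relative abundance of the corresponding cell type.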
Identification of ISs
First, we downloaded immune-related genes from the Immunology Database and Analysis Portal (ImmPort, https://www.immport.org/shared/genelists) database. These immune-related genes are involved in various immune processes, such as antigen presentation, cytokine production, and interleukin receptor activation. Using R software, we extracted 1203 immune-related genes from the gene expression profiles of the meta-BRCA cohort (for more details, see Additional file 1: File S2). Based on those 1203 immune-related genes, we performed unsupervised consensus clustering to identify ISs (immune subtypes) with the ConsensusClusterPlus R package. A total of 1000 bootstraps with 80% item resampling in each bootstrap were used to ensure classification stability. A consensus heatmap and the relative change in area under the cumulative distribution function were used to determine the optimal number of clusters.
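The resampling scheme above can be sketched as follows. This is a minimal illustration in Python, not the ConsensusClusterPlus implementation: a tiny deterministic k-means stands in for whatever base clustering algorithm the authors configured, and the function names, the toy data, and the reduced number of resamples are assumptions made for brevity.

```python
import random
from itertools import combinations


def kmeans(points, k, iters=20):
    """Tiny Lloyd's k-means with farthest-point initialisation (stand-in clusterer)."""
    def d2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(d2(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: d2(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the previous centre if a cluster becomes empty
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def consensus_matrix(data, k, n_resamples=1000, frac=0.8, seed=0):
    """Fraction of resamples in which each pair of items shared a cluster."""
    rng = random.Random(seed)
    n = len(data)
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    m = max(k, round(frac * n))  # 80% item resampling by default
    for _ in range(n_resamples):
        idx = rng.sample(range(n), m)
        labels = kmeans([data[i] for i in idx], k)
        for (a, la), (b, lb) in combinations(zip(idx, labels), 2):
            sampled[a][b] += 1
            sampled[b][a] += 1
            if la == lb:
                together[a][b] += 1
                together[b][a] += 1
    return [[1.0 if i == j
             else (together[i][j] / sampled[i][j] if sampled[i][j] else 0.0)
             for j in range(n)] for i in range(n)]


# Toy data: two well-separated groups of five "samples" with two features each.
data = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.3, 0.2), (0.2, 0.2),
        (5.0, 5.1), (5.2, 4.9), (4.8, 5.0), (5.1, 5.3), (4.9, 4.8)]
cm = consensus_matrix(data, k=2, n_resamples=200)
print(cm[0][1], cm[0][5])
```

Consensus values near 1 or 0 for most pairs indicate a stable partition; comparing the cumulative distribution of these values across candidate k is the basis of the delta area criterion mentioned above.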
Somatic genetic variation analysis and CYT score calculation
To determine the TMB, the maftools R package was used to count the total number of nonsynonymous mutations in the TCGA-BRCA cohort, and the total number of nonsynonymous mutations in the METABRIC cohort was downloaded directly from the cBioPortal website. After we merged the TMB information of those 2 cohorts, the Wilcoxon test was used to compare the mutation count and TMB between distinct ISs, and P < 0.05 was considered statistically significant. We used the oncoplot function of the maftools R package to visualize the 10 most frequently mutated genes in each IS. This step was performed using the TCGA-BRCA cohort only because detailed per-gene mutation information was not available for METABRIC on the cBioPortal website. To measure the magnitude of the antitumor response in distinct ISs, a cytolytic activity (CYT) score was calculated as the geometric mean of PRF1 and GZMA mRNA expression for the meta-BRCA cohort. The data were compared by Student's t test, and P < 0.05 was considered statistically significant.
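The CYT score defined above is simply the geometric mean of two expression values. The following Python sketch shows the arithmetic; the function name, the assumption that expression values are non-negative (for example, on a linear rather than log scale), and the sample values are illustrative and not taken from the study.

```python
import math


def cyt_score(prf1, gzma):
    """Cytolytic activity score: geometric mean of PRF1 and GZMA expression."""
    if prf1 < 0 or gzma < 0:
        raise ValueError("expression values must be non-negative")
    return math.sqrt(prf1 * gzma)


# Made-up expression values for two hypothetical samples.
samples = {
    "sample_1": {"PRF1": 4.0, "GZMA": 9.0},
    "sample_2": {"PRF1": 1.0, "GZMA": 16.0},
}
scores = {name: cyt_score(v["PRF1"], v["GZMA"]) for name, v in samples.items()}
print(scores)
```

Because the geometric mean is pulled down when either gene is low, a sample scores highly only when both cytolytic effectors are expressed.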
Construction of the BRCA immune landscape
To further uncover the IS distribution among individual patients, we performed graph learning-based dimensionality reduction analysis through discriminative dimensionality reduction with tree (DDRTree) based on the 1203 immune-related gene expression profiles in the meta-BRCA cohort. The plot_cell_trajectory function of the monocle R package was used to visualize the immune landscape. We also extended this analysis to reveal the intrinsic distribution within each IS in individual patients. Similarly, DDRTree and the plot_cell_trajectory function were used to perform dimensionality reduction and to visualize the immune landscape, respectively.
Construction and GO functional annotation of IS-associated gene modules
To recognize IS-associated gene modules, we performed WGCNA on the 1203 immune-related gene expression profiles of the meta-BRCA cohort via the WGCNA R package. We first converted the expression matrix to an adjacency matrix and then to a topological overlap matrix. A weighted coexpression network was constructed stepwise, with a minimum of 30 genes per module, following the standard dynamic tree-cut procedure. The soft threshold power was set to 2 using the scale-free topology criterion to develop a weighted adjacency matrix. The coexpression modules were recognized by a bottom-up algorithm with a dynamic tree-cut method. Similar modules were merged using height = 0.25, deep split = 2, and minimum module size = 30. Module eigengenes (MEs) were calculated to quantify modular similarity. Moreover, Gene Ontology (GO) functional annotation, including biological process (BP), molecular function (MF), and cellular component (CC) analyses, was performed with the clusterProfiler R package to explore the functions of the gene modules. A Benjamini-Hochberg (BH) adjusted P < 0.05 was regarded as statistically significant.
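The first two steps described above, soft-thresholded adjacency followed by topological overlap, can be sketched in a few lines. This Python illustration is not the WGCNA package code; it assumes an unsigned network (adjacency equal to the absolute Pearson correlation raised to the soft threshold power, here 2 as in the text) and the standard topological overlap formula, and the gene names and expression values are made up.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def soft_adjacency(expr, beta=2):
    """Unsigned adjacency: a_ij = |cor(x_i, x_j)| ** beta (assumed network type)."""
    genes = list(expr)
    return {g: {h: 1.0 if g == h else abs(pearson(expr[g], expr[h])) ** beta
                for h in genes} for g in genes}


def topological_overlap(adj):
    """TOM_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij)."""
    genes = list(adj)
    k = {g: sum(adj[g][u] for u in genes if u != g) for g in genes}
    tom = {}
    for i in genes:
        tom[i] = {}
        for j in genes:
            if i == j:
                tom[i][j] = 1.0
                continue
            shared = sum(adj[i][u] * adj[u][j] for u in genes if u not in (i, j))
            tom[i][j] = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
    return tom


# Made-up expression of four genes across four samples.
expr = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8],
        "g3": [4, 3, 2, 1], "g4": [1, 3, 2, 4]}
adj = soft_adjacency(expr, beta=2)
tom = topological_overlap(adj)
print(adj["g1"]["g4"], tom["g1"]["g4"])
```

Module detection would then cluster genes on the dissimilarity 1 - TOM; that step, and the choice of power by the scale-free topology criterion, are handled by the R package and are not reproduced here.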
In the present study, P < 0.05 was considered statistically significant. A log-rank test was used to compare OS and RFS between groups with the survminer R package. The chi-square test was used to examine categorical data. The Wilcoxon test was used to compare nonnormally distributed data. Normally distributed data were compared by Student's t test. Pearson's correlation analysis was performed to assess the relationship between two continuous variables.
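For readers unfamiliar with the log-rank test used throughout the survival comparisons, the sketch below computes the two-group statistic from first principles. It is a plain Python illustration, not the routine called by the R packages named above; the function name and the toy follow-up times are assumptions, and the P value uses the chi-square distribution with one degree of freedom.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; events are 1 (event observed) or 0 (censored)."""
    data = ([(t, e, 0) for t, e in zip(times_a, events_a)] +
            [(t, e, 1) for t, e in zip(times_b, events_b)])
    event_times = sorted({t for t, e, _ in data if e})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)  # at risk, group A
        n_b = sum(1 for u, _, g in data if u >= t and g == 1)  # at risk, group B
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        d_b = sum(1 for u, e, g in data if u == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        raise ValueError("log-rank statistic is undefined for these data")
    stat = (observed_a - expected_a) ** 2 / variance
    p_value = math.erfc(math.sqrt(stat / 2))  # chi-square survival function, 1 df
    return stat, p_value


# Toy example: group A has uniformly shorter follow-up times than group B.
stat, p = logrank_test([1, 2, 3, 4, 5], [1, 1, 1, 1, 1],
                       [6, 7, 8, 9, 10], [1, 1, 1, 1, 1])
print(round(stat, 2), p < 0.05)
```

In a median-cutoff analysis, the two groups would be the patients above and below the median expression of the gene of interest.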
Identification of potential tumor antigens in BRCA
To identify potential antigens of BRCA, we first performed a DEG analysis between tumor and normal tissues. A total of 1418 upregulated genes among 3556 DEGs were identified (for more details, see Additional file 1: File S3). Figure 2a shows the chromosomal distribution of those DEGs. Second, the mutational analysis screened 16,494 mutated genes that potentially encode TAAs (for more details, see Additional file 1: File S4). Figure 2b, c indicate that most BRCA patients had low mutation counts (number of mutational events per case) and low FGA (% of copy number altered chromosome regions out of measured regions per case), suggesting low immunogenicity for BRCA. Figure 2d, e show the top 10 most frequently mutated genes in terms of mutation counts and altered genome fractions, respectively. ACR, ADM2, CHKB, CHKB-CPT1B, CHKB-DT, CPT1B, DENND6B, DNAJB6, DNAJB6-AS1, and HDAC10 were the top 10 most frequently mutated genes in terms of mutation count (Fig. 2d). TP53, SCFD2, ARAP3, PCDHB12, HK3, OC90, PIK3CA, LINC02584, LPO, and ANO1 were the top 10 most frequently mutated genes in the altered genome fraction analysis (Fig. 2e). In total, we identified 993 upregulated and frequently mutated tumor-associated genes (for more details, see Additional file 1: File S5).
Relationships among potential tumor antigens with BRCA prognosis, APC abundance and tumor purity
We filtered the prognostic genes among the 993 tumor-associated genes as potential TAAs for developing mRNA vaccines. As depicted in Fig. 3a, 10 of the 27 OS-related genes were also correlated with RFS and were regarded as potential TAAs. Among them, 4 TAAs were related to worse prognoses, i.e., SLC7A5 (solute carrier family 7 member 5; log-rank test; P = 0.0053, Fig. 3b), CHPF (chondroitin polymerizing factor; log-rank test; P = 0.0032, Fig. 3c), CCNE1 (G1/S-specific cyclin-E1; log-rank test; P = 0.0029, Fig. 3d), and CENPW (centromere protein W; log-rank test; P = 0.02, Fig. 3e). The other 6 genes were related to better prognoses, i.e., CXCL9 (C-X-C motif chemokine 9; log-rank test; P = 0.0054, Additional file 2: Fig. S2a), SKAP1 (src kinase-associated phosphoprotein 1; log-rank test; P = 0.0049, Additional file 2: Fig. S2b), SERPINA1 (alpha-1-antiproteinase; log-rank test; P = 0.0017, Additional file 2: Fig. S2c), ZMYND10 (zinc finger MYND domain-containing protein 10; log-rank test; P = 0.049, Additional file 2: Fig. S2d), RSPH1 (radial spoke head 1 homolog; log-rank test; P = 0.021, Additional file 2: Fig. S2e), and TMEM119B (transmembrane protein 119; log-rank test; P = 0.012, Additional file 2: Fig. S2f). In the present study, proteins encoded by the 4 poor-prognosis genes above were considered to be potential tumor antigens for mRNA vaccine development. Furthermore, the expression profiles of CENPW, SLC7A5, CHPF, and CCNE1 were positively associated with the abundance of activated B cells and APCs (activated dendritic cells and macrophages) but negatively associated with tumor purity (Pearson's correlation analysis; Fig. 4a–d). These findings indicate that these 4 potential tumor antigens (for more details, see Additional file 2: Table S1) may be directly processed and presented by APCs to T cells and recognized by B cells to trigger an antitumor response, reducing the tumor purity to prolong the survival time of BRCA patients. Hence, these 4 potential tumor antigens are promising candidates for developing mRNA vaccines against BRCA.
Identification of potential ISs in BRCA
To identify a suitable population for vaccination, we defined potential ISs that mirror the immune status and microenvironment of BRCA. Based on the 1203 immune-related gene expression profiles in the meta-BRCA cohort, an unsupervised consensus clustering method was used to identify ISs. According to the tracking plot (Fig. 5a), delta area (Fig. 5b), cumulative distribution function (Fig. 5c), and consensus matrix (Fig. 5d) plots, k = 2 was the best option for ensuring stable clustering. We thus obtained 2 ISs, termed IS1 and IS2. IS1 tumors were associated with better OS (log-rank test; P = 0.0073, Fig. 5e) and RFS (log-rank test; P < 0.0001, Fig. 5f). These 2 ISs showed no consistent pattern across most tumor-node-metastasis-stage (TNMS) categories. However, IS1 accounted for a significantly larger share of T1 tumors than of T2 tumors (71% IS1 versus 29% IS2 in T1; 62% IS1 versus 38% IS2 in T2; Wilcoxon test; P = 0.012, Fig. 5g). Likewise, IS1 accounted for a significantly larger share of N1 tumors than of N0 tumors (62% IS1 versus 38% IS2 in N0; 72% IS1 versus 28% IS2 in N1; Wilcoxon test; P = 0.007, Fig. 5g). Overall, the ISs we identified can be used to predict the prognosis of BRCA patients independently of TNMS stage.
Relationships between ISs and TMB, CYT score, and potential tumor antigens
Higher TMB and CYT scores are related to stronger anticancer immunity. Therefore, we compared the TMB and CYT scores of patients with distinct ISs in the meta-BRCA cohort. As depicted in Fig. 6a, b, both the mutation counts (Wilcoxon test; P < 0.0001, Fig. 6a) and the TMB (Wilcoxon test; P < 0.0001, Fig. 6b) of patients with IS2 tumors were markedly higher than those of patients with IS1 tumors. Interestingly, BRCA patients with IS2 tumors also had significantly higher CYT scores than patients with IS1 tumors (Student's t test; P < 0.0001, Fig. 6c). Figure 6d shows the top 10 most frequently mutated genes in each IS. In addition, the mRNA expression levels of the 4 potential tumor antigens were higher in patients with IS2 tumors than in patients with IS1 tumors (Student's t test; Fig. 6e). These findings indicate that the ISs we identified can be used to predict the mutation status and cytolytic activity of BRCA.
Associations between ISs and immune modulators
Immunogenic cell death modulators (ICDs) and immune checkpoints (ICPs) play vital roles in cancer immunity [23, 24]. To study the associations between ISs and immune modulators, we analyzed the mRNA expression profiles of the immune modulators in distinct ISs in the TCGA-BRCA and METABRIC cohorts. Nineteen ICD-related genes, including ANXA1, CALR, CXCL10, EIF2A, EIF2AK1, EIF2AK2, EIF2AK3, FPR1, HGF, HMGB1, IFNAR1, IFNAR2, IFNE, LRP1, MET, P2RX7, P2RY2, PANX1, and TLR3, were compared in both cohorts, of which 18 (95%) genes in the TCGA-BRCA cohort (Fig. 7a) and 13 (68%) genes in the METABRIC cohort (Fig. 7b) were differentially expressed between IS1 and IS2 tumors. Among those genes, ANXA1, CALR, CXCL10, EIF2AK2, FPR1, HMGB1, IFNAR2, MET, and PANX1 were significantly overexpressed in IS2 tumors, but EIF2A, EIF2AK1, and TLR3 were significantly underexpressed in IS2 tumors (Wilcoxon test; Fig. 7a, b). Forty ICP-related genes, including ADORA2A, BTLA, CD160, CD200, CD200R1, CD244, CD27, CD274, CD276, CD28, CD40, CD40LG, CD44, CD48, CD70, CD80, CD86, CTLA4, ICOS, ICOSLG, LAG3, LAIR1, LGALS9, NRP1, PDCD1, PDCD1LG2, TIGIT, TMIGD2, TNFRSF14, TNFRSF18, TNFRSF25, TNFRSF4, TNFRSF8, TNFRSF9, TNFSF14, TNFSF15, TNFSF18, TNFSF4, TNFSF9, and VTCN1, were detected in both cohorts, of which 27 (67.5%) genes in the TCGA-BRCA cohort (Fig. 7c) and 35 (87.5%) genes in the METABRIC cohort (Fig. 7d) were differentially expressed between IS1 and IS2 tumors. Among those genes, BTLA, CD200, CD200R1, CD244, CD27, CD28, CD40, CD40LG, CD48, CD70, CD80, CD86, CTLA4, ICOS, ICOSLG, LAG3, LAIR1, LGALS9, PDCD1, PDCD1LG2, TIGIT, TMIGD2, TNFRSF25, TNFRSF4, TNFRSF9, TNFSF14, TNFSF18, and TNFSF9 were significantly upregulated in IS2 tumors in both cohorts, while NRP1 was downregulated in IS2 tumors in the TCGA-BRCA cohort (Wilcoxon test; Fig. 7c, d). Therefore, the ISs we identified can mirror the mRNA expression profiles of ICDs and ICPs.
Immune cellular characteristics of ISs
Tumor immune cell infiltration status determines the response to mRNA vaccines to a large extent. Therefore, we further characterized the tumor immune cell components of IS1 and IS2 tumors by estimating the relative abundances of 28 immune cell types in both the TCGA-BRCA and METABRIC cohorts. Figure 8a depicts the overall landscape of the relative abundances of the 28 immune cells in IS1 and IS2 tumors in both cohorts. IS2 tumors showed greater immune cell infiltration than IS1 tumors in both cohorts. In the TCGA-BRCA cohort, almost all of the 28 immune cells, except for CD56 bright natural killer cells, eosinophils, mast cells, memory B cells, and plasmacytoid dendritic cells, were significantly differentially infiltrated between the ISs. Among these 28 immune cells, only central memory CD8 T cells infiltrated IS2 tumors more sparsely than IS1 tumors (Wilcoxon test; Fig. 8b). In the METABRIC cohort, only eosinophils, mast cells, memory B cells, and plasmacytoid dendritic cells were not significantly differentially infiltrated between the ISs (Wilcoxon test; Fig. 8c). In line with the TCGA-BRCA cohort results, only CD56 bright natural killer cells and central memory CD8 T cells infiltrated IS2 tumors more sparsely than IS1 tumors. Overall, IS2 tumors were infiltrated by more immune cells, characterizing an immune "hot" phenotype. In contrast, IS1 tumors were sparsely infiltrated by immune cells, indicating an immune "cold" phenotype.
The immune landscape of BRCA
To further visualize the immune distribution of individual patients, we constructed an immune landscape of BRCA through graph learning-based dimensionality reduction analysis. Patients with IS1 tumors and IS2 tumors were distributed in opposite directions on the immune landscape (Fig. 9a). We also observed that PC (principal component) 1 correlated negatively with all immune cells, especially type 1 T helper cells, T follicular helper cells, and natural killer T cells, but that PC2 was negatively linked to only 9 immune cells, namely type 17 T helper cells, neutrophils, monocytes, immature B cells, activated dendritic cells, CD56 dim natural killer cells, activated CD8 T cells, activated CD4 T cells, and activated B cells. In addition, PC2 was positively related to type 1 T helper cells, T follicular helper cells, regulatory T cells, plasmacytoid dendritic cells, natural killer cells, memory B cells, mast cells, macrophages, immature dendritic cells, eosinophils, effector memory CD8 T cells, central memory CD4 T cells, central memory CD8 T cells, and CD56 bright natural killer cells (Pearson's correlation analysis; Fig. 9b). Furthermore, we found intracluster heterogeneity of the ISs in the immune landscape. We further divided IS1 into IS1A-B and IS2 into IS2A-B according to the patients' locations on the immune landscape (Fig. 9c). These intraclusters also exhibited distinct immune cell infiltration characteristics. Almost all immune cells, except for CD56 dim natural killer cells, showed remarkably different abundances between IS1A and IS1B tumors. Of note, IS1A tumors were relatively sparsely infiltrated by immune cells, except for monocytes and neutrophils (Wilcoxon test; Fig. 9d). Similarly, most immune cells, except for CD56 bright natural killer cells and central memory CD8 T cells, were significantly differentially infiltrated between IS2A and IS2B tumors. All immune cells infiltrated IS2B tumors relatively sparsely compared with IS2A tumors (Wilcoxon test; Fig. 9e). Moreover, we observed prolonged survival times in Cluster 3 patients compared with Cluster 1 and Cluster 2 patients in the immune landscape (log-rank test; P = 0.047, Fig. 9f, g). These findings suggest that the location of patients in the immune landscape can predict their prognosis. Overall, the immune landscape not only reflects the immune cell infiltration status of BRCA patients but can also predict their prognosis.
Construction of IS-associated gene modules in BRCA
IS-associated gene modules were constructed by the WGCNA algorithm with a soft threshold of 2 for a scale-free network (Fig. 10a, b). We obtained 2 coexpression modules with a total of 458 genes, of which 132 clustered into the blue module and 326 into the turquoise module. The remaining 745 genes were not clustered into any module; following the documentation of the WGCNA R package, they are marked in gray and are not displayed in Fig. 10c (for more details, see Additional file 1: File S6). We also analyzed the distribution of the module eigengenes of those 2 modules across the ISs. In the blue module, the module eigengenes of IS1 tumors were significantly higher than those of IS2 tumors, while the trend was the opposite in the turquoise module (Wilcoxon test; P < 0.0001, Fig. 10d). Thus, we hypothesized that these 2 modules may be related to distinct biological functions. To test this hypothesis, we performed GO functional annotation for these 2 modules. We found that the blue module was mainly related to signal transduction, for example, positive regulation of phosphatidylinositol 3-kinase signaling in GO BP analysis and transforming growth factor beta-activated receptor activity in GO MF analysis (Fig. 11a). The turquoise module was related to immune-related functions, such as positive regulation of T-cell proliferation in GO BP analysis and MHC class II protein complex in GO CC analysis (Fig. 11b). Moreover, the turquoise module (Pearson's correlation analysis; correlation coefficient = −0.92, P = 2.2e−16, Fig. 11d) was more closely related to the immune landscape than the blue module (Pearson's correlation analysis; correlation coefficient = −0.34, P = 2.2e−16, Fig. 11c). In addition, prognostic analyses of those 2 modules revealed that only the turquoise module was related to OS (log-rank test; blue module, P = 0.22, Fig. 11e; turquoise module, P = 0.011, Fig. 11f). The therapeutic potential of mRNA vaccines in BRCA patients is largely determined by the immune functions of highly expressed genes. Therefore, an mRNA vaccine might be suitable for BRCA patients with highly expressed genes in the turquoise module. The turquoise module can also be used to predict the prognosis of BRCA patients.
BRCA is one of the most commonly diagnosed malignancies, with a high level of biological and prognostic heterogeneity. Although various treatments can prolong survival time, the prognoses of patients with relapsed or metastatic disease remain unsatisfactory. Proteins encoded by mutated genes differ from wild-type proteins and can act as tumor antigens that immune cells recognize in order to eliminate cancer cells. In general, the immune response can lead to reduction or loss of those tumor antigens and induce immune escape. Moreover, the immunogenicity of those tumor antigens can be reduced by coating substances, such as salivary mucopolysaccharides. Therefore, amplifying tumor antigens with exogenous mRNA vaccines to reactivate the host antitumor response is a promising strategy.
In this study, we identified 4 potential tumor antigens, i.e., SLC7A5, CHPF, CCNE1, and CENPW, for mRNA vaccine development in BRCA. They were not only negatively related to survival time and tumor purity but also positively related to APC and B-cell infiltration. Although no experiments have directly shown that these molecules can serve as tumor antigens in vitro or in vivo, an increasing number of studies have reported their vital roles in BRCA progression, prognosis, and treatment. For example, SLC7A5 is an amino acid transporter for the uptake of leucine in cells, which is critical for metabolic activation and cellular functions. Engineering chimeric antigen receptor (CAR)-T cells to overexpress SLC7A5 can enhance CAR-T cell recognition of tumor cells. CHPF promotes breast carcinoma cell proliferation, invasion, and migration via upstream TGF-β1/SMAD3 and JNK axis activation. CHPF can also promote malignancy in BRCA by reshaping the TME. CCNE1 is an important cyclin protein that is related to worse clinical outcomes in BRCA patients. Targeting CCNE1, a cyclin-dependent kinase (CDK) 4/6 inhibitor (palbociclib) was developed to block cell cycle progression and reduce tumor cell proliferation in BRCA. CENPW is associated with worse prognosis in BRCA, and knocking down CENPW can inhibit the proliferation and migration of BRCA cells. Liu et al. also identified three potential tumor antigens, i.e., CD74, IRF1, and PSME2, for mRNA vaccine development in BRCA. In contrast to our findings, all of those tumor antigens were related to better prognosis. The immune response triggered by those tumor antigens may downregulate them to their original levels and in turn shorten the survival time of BRCA patients, which is the opposite of the aim of vaccination. Therefore, the 4 potential tumor antigens associated with worse prognosis that we identified can provide a credible guide to mRNA vaccine development.
Since only a limited portion of BRCA patients with specific ISs can benefit from mRNA vaccines [36, 37], we divided BRCA patients into 2 ISs via an unsupervised consensus clustering algorithm based on immune-related genes to identify the appropriate population for vaccination. BRCA patients with IS1 and IS2 tumors in both the TCGA-BRCA and METABRIC cohorts had different prognostic, clinical, molecular, and immune cell infiltration characteristics, which indicates a different clinical response to an mRNA vaccine. In the present study, BRCA patients with IS1 tumors characterized by sparse immune cell infiltration had better prognosis than those with IS2 tumors characterized by abundant immune cell infiltration, in line with our previous study. Therefore, the dominance of the immune-suppressive environment or stimulatory environment is the decisive factor for prognosis. IS2 tumors also had higher TMB and mutation numbers, which implies greater heterogeneity and easier recognition by immune cells. IS2 tumors were characterized by more infrequent TP53 and less infrequent PIK3CA mutations than IS1 tumors. BRCA patients with mutant TP53 and PIK3CA have poor prognosis, consistent with our study [40,41,42]. Additionally, a higher CYT score for the IS2 tumors implies that more tumor cells were attacked by immune cells. IS2 tumors also displayed higher expression of ICPs and ICDs than IS1 tumors in both the TCGA-BRCA and METABRIC cohorts, which suggests that patients with IS2 tumors will benefit more from both immune checkpoint inhibitor and mRNA vaccine treatments. Therefore, we propose that BRCA patients with IS2 tumors are more suitable for mRNA vaccination than those with IS1 tumors. Patients with IS2 tumors who receive treatments combining immune checkpoint inhibitors with mRNA vaccination may obtain better curative effects.
To date, many studies have identified tumor categories based on various aspects of tumor biology. BRCA is commonly classified into three main subtypes according to the status of estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2), guiding the choice of medical treatments, including endocrine and anti-HER2 therapies. However, those categories do not guide mRNA vaccination since they cannot represent the complex TME in BRCA. Therefore, we did not investigate the distribution of ISs in those categories. Thorsson et al. divided 33 cancer types into six ISs (C1-6) based on the TME and prognosis. We found that the major C1 (wound healing) and C2 (IFN-γ dominant) categories characterized by abundant immune cell infiltration and favorable prognoses were clustered as IS2 tumors but that the C4 (lymphocyte depleted) and C6 (TGF-β dominant) categories with sparse immune cell infiltration and worse prognoses were clustered as IS1 tumors. Contrary to their prognostic roles, BRCA patients with IS1 tumors tended to have better prognosis than patients with IS2 tumors. Therefore, our division of the TME is different from previous classifications and provides a useful complement to classification of the TME. To further reveal the intracluster heterogeneity of ISs, we performed graph learning-based dimensionality reduction of immune-related gene profiles according to previous studies [45, 46]. We observed that patients distributed in three different directions on the graph have different prognoses. Comparison of intraclusters revealed that almost all immune cells infiltrated IS1B and IS2A more abundantly than IS1A and IS2B, which indicates that BRCA patients in distinct intraclusters of ISs who receive mRNA vaccine treatment may still have different responses and clinical outcomes. Thus, a more precise methodology for defining the TME is still needed in future studies.
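The comparison between the two ISs and the pan-cancer C1-6 categories amounts to a cross-tabulation of two label vectors over the same patients. A minimal sketch of that bookkeeping follows; the `crosstab` helper and the toy labels are hypothetical and do not reproduce the study's data.

```python
from collections import Counter


def crosstab(labels_a, labels_b):
    """Count how often each pair of labels co-occurs across the same samples."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label vectors must have the same length")
    return Counter(zip(labels_a, labels_b))


# Toy labels for five hypothetical patients; not data from the study.
is_labels = ["IS1", "IS1", "IS2", "IS2", "IS2"]
pan_cancer_labels = ["C4", "C6", "C1", "C2", "C2"]

table = crosstab(is_labels, pan_cancer_labels)
for (immune_subtype, category), count in sorted(table.items()):
    print(immune_subtype, category, count)
```

A `Counter` returns 0 for pairs that never occur, which makes empty cells of the table easy to query without special handling.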
Our study also has several limitations. First, we only identified the potential tumor antigens based on RNA-seq data without validation of their response because no clinical data for mRNA vaccine use in BRCA have been published thus far. Second, further application of ISs in clinical practice is difficult since a gene expression profile is required for each patient. Further in vivo and in vitro studies are needed in this field.
In conclusion, SLC7A5, CHPF, CCNE1, and CENPW are potential tumor antigens for mRNA vaccine development. BRCA patients with IS2 tumors may benefit from vaccination. Our study provides a foundation for mRNA vaccine development, population selection for vaccination, and prognosis prediction.
Availability of data and materials
All data generated and described in this study are available from the corresponding websites and are freely available to any scientist for noncommercial purposes. Further information is available from the corresponding author upon reasonable request.
Abbreviations
APCs: Antigen-presenting cells
WGCNA: Weighted correlation network analysis
FPKM: Fragments per kilobase million
TCGA: The Cancer Genome Atlas
METABRIC: Molecular Taxonomy of Breast Cancer International Consortium
cBioPortal: cBio Cancer Genomics Portal website
GEPIA2: Gene Expression Profiling Interactive Analysis version 2
DEGs: Differentially expressed genes
ANOVA: Analysis of variance
ssGSEA: Single-sample gene set enrichment analysis
TMB: Tumor mutation burden
DDRTree: Discriminative dimensionality reduction with tree
ICD: Immunogenic cell death modulator
ImmPort: Immunology database and analysis portal database
ER: Estrogen receptor
PR: Progesterone receptor
HER2: Human epidermal growth factor receptor 2
WES: Whole exon sequence
FGA: Fraction gene altered
Sung H, Ferlay J, Siegel RL, et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2021;71(3):209–49. https://doi.org/10.3322/caac.21660.
Faghfuri E, Pourfarzi F, Faghfouri AH, et al. Recent developments of RNA-based vaccines in cancer immunotherapy. Expert Opin Biol Ther. 2021;21(2):201–18. https://doi.org/10.1080/14712598.2020.1815704.
Mockey M, Bourseau E, Chandrashekhar V, et al. mRNA-based cancer vaccine: prevention of B16 melanoma progression and metastasis by systemic injection of MART1 mRNA histidylated lipopolyplexes. Cancer Gene Ther. 2007;14(9):802–14. https://doi.org/10.1038/sj.cgt.7701072.
Coulie PG, Van den Eynde BJ, van der Bruggen P, et al. Tumour antigens recognized by T lymphocytes: at the core of cancer immunotherapy. Nat Rev Cancer. 2014;14(2):135–46. https://doi.org/10.1038/nrc3670.
van der Burg SH. Correlates of immune and clinical activity of novel cancer vaccines. Semin Immunol. 2018;39:119–36. https://doi.org/10.1016/j.smim.2018.04.001.
Pardi N, Hogan MJ, Porter FW, et al. mRNA vaccines—a new era in vaccinology. Nat Rev Drug Discov. 2018;17(4):261–79. https://doi.org/10.1038/nrd.2017.243.
Luo W, Yang G, Luo W, et al. Novel therapeutic strategies and perspectives for metastatic pancreatic cancer: vaccine therapy is more than just a theory. Cancer Cell Int. 2020;20:66. https://doi.org/10.1186/s12935-020-1147-9.
Cerami E, Gao J, Dogrusoz U, et al. The cBio cancer genomics portal: an open platform for exploring multidimensional cancer genomics data. Cancer Discov. 2012;2(5):401–4. https://doi.org/10.1158/2159-8290.Cd-12-0095.
Tang Z, Kang B, Li C, et al. GEPIA2: an enhanced web server for large-scale expression profiling and interactive analysis. Nucleic Acids Res. 2019;47(W1):W556–60. https://doi.org/10.1093/nar/gkz430.
Pereira B, Chin SF, Rueda OM, et al. The somatic mutation profiles of 2433 breast cancers refines their genomic and transcriptomic landscapes. Nat Commun. 2016;7:11479. https://doi.org/10.1038/ncomms11479.
Leek JT, Johnson WE, Parker HS, et al. The sva package for removing batch effects and other unwanted variation in high-throughput experiments. Bioinformatics. 2012;28(6):882–3. https://doi.org/10.1093/bioinformatics/bts034.
Hänzelmann S, Castelo R, Guinney J. GSVA: gene set variation analysis for microarray and RNA-seq data. BMC Bioinform. 2013;14:7. https://doi.org/10.1186/1471-2105-14-7.
Barbie DA, Tamayo P, Boehm JS, et al. Systematic RNA interference reveals that oncogenic KRAS-driven cancers require TBK1. Nature. 2009;462(7269):108–12. https://doi.org/10.1038/nature08460.
Ru B, Wong CN, Tong Y, et al. TISIDB: an integrated repository portal for tumor-immune system interactions. Bioinformatics. 2019;35(20):4200–2. https://doi.org/10.1093/bioinformatics/btz210.
Becht E, Giraldo NA, Lacroix L, et al. Estimating the population abundance of tissue-infiltrating immune and stromal cell populations using gene expression. Genome Biol. 2016;17(1):218. https://doi.org/10.1186/s13059-016-1070-5.
Wilkerson MD, Hayes DN. ConsensusClusterPlus: a class discovery tool with confidence assessments and item tracking. Bioinformatics. 2010;26(12):1572–3. https://doi.org/10.1093/bioinformatics/btq170.
Mayakonda A, Lin DC, Assenov Y, et al. Maftools: efficient and comprehensive analysis of somatic variants in cancer. Genome Res. 2018;28(11):1747–56. https://doi.org/10.1101/gr.239244.118.
Rooney MS, Shukla SA, Wu CJ, et al. Molecular and genetic properties of tumors associated with local immune cytolytic activity. Cell. 2015;160(1–2):48–61. https://doi.org/10.1016/j.cell.2014.12.033.
Qiu X, Hill A, Packer J, et al. Single-cell mRNA quantification and differential analysis with Census. Nat Methods. 2017;14(3):309–15. https://doi.org/10.1038/nmeth.4150.
Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinform. 2008;9:559. https://doi.org/10.1186/1471-2105-9-559.
Yu G, Wang LG, Han Y, et al. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS. 2012;16(5):284–7. https://doi.org/10.1089/omi.2011.0118.
Therneau TM. A Package for Survival Analysis in R. 2021.
Morad G, Helmink BA, Sharma P, et al. Hallmarks of response, resistance, and toxicity to immune checkpoint blockade. Cell. 2021;184(21):5309–37. https://doi.org/10.1016/j.cell.2021.09.020.
Kroemer G, Galluzzi L, Kepp O, et al. Immunogenic cell death in cancer therapy. Annu Rev Immunol. 2013;31:51–72. https://doi.org/10.1146/annurev-immunol-032712-100008.
Leko V, Rosenberg SA. Identifying and targeting human tumor antigens for T cell-based immunotherapy of solid tumors. Cancer Cell. 2020;38(4):454–72. https://doi.org/10.1016/j.ccell.2020.07.013.
Girard-Pierce KR, Stowell SR, Smith NH, et al. A novel role for C3 in antibody-induced red blood cell clearance and antigen modulation. Blood. 2013;122(10):1793–801. https://doi.org/10.1182/blood-2013-06-508952.
Zhou Q, Yan X, Zhu H, et al. Identification of three tumor antigens and immune subtypes for mRNA vaccine development in diffuse glioma. Theranostics. 2021;11(20):9775–90. https://doi.org/10.7150/thno.61677.
Nachef M, Ali AK, Almutairi SM, et al. Targeting SLC1A5 and SLC3A2/SLC7A5 as a potential strategy to strengthen anti-tumor immunity in the tumor microenvironment. Front Immunol. 2021;12: 624324. https://doi.org/10.3389/fimmu.2021.624324.
Panetti S, McJannett NJ, Fultang L, et al. Engineering amino acid uptake or catabolism promotes CAR-T cell adaption to the tumour environment. Blood Adv. 2022. https://doi.org/10.1182/bloodadvances.2022008272.
Pan QF, Ouyang WW, Zhang MQ, et al. Chondroitin polymerizing factor predicts a poor prognosis and promotes breast cancer progression via the upstream TGF-β1/SMAD3 and JNK axis activation. J Cell Commun Signal. 2022. https://doi.org/10.1007/s12079-022-00684-0.
Liao WC, Yen HR, Chen CH, et al. CHPF promotes malignancy of breast cancer cells by modifying syndecan-4 and the tumor microenvironment. Am J Cancer Res. 2021;11(3):812–26.
Sutherland RL, Musgrove EA. Cyclins and breast cancer. J Mammary Gland Biol Neoplasia. 2004;9(1):95–104. https://doi.org/10.1023/b:Jomg.0000023591.45568.77.
Turner NC, Liu Y, Zhu Z, et al. Cyclin E1 expression and palbociclib efficacy in previously treated hormone receptor-positive metastatic breast cancer. J Clin Oncol. 2019;37(14):1169–78. https://doi.org/10.1200/jco.18.00925.
Wang L, Wang H, Yang C, et al. Investigating CENPW as a novel biomarker correlated with the development and poor prognosis of breast carcinoma. Front Genet. 2022;13: 900111. https://doi.org/10.3389/fgene.2022.900111.
Li RQ, Wang W, Yan L, et al. Identification of tumor antigens and immune subtypes in breast cancer for mRNA vaccine development. Front Oncol. 2022;12: 973712. https://doi.org/10.3389/fonc.2022.973712.
Han S, Lee SY, Wang WW, et al. A perspective on cell therapy and cancer vaccine in biliary tract cancers (BTCs). Cancers (Basel). 2020. https://doi.org/10.3390/cancers12113404.
Xu JL, Guo Y. FCGR1A serves as a novel biomarker and correlates with immune infiltration in four cancer types. Front Mol Biosci. 2020;7: 581615. https://doi.org/10.3389/fmolb.2020.581615.
Li L. Tumor microenvironment characterization in breast cancer and an immune cell infiltration score development, validation, and application. Front Oncol. 2022;12: 844082. https://doi.org/10.3389/fonc.2022.844082.
Huang X, Tang T, Zhang G, et al. Identification of tumor antigens and immune subtypes of cholangiocarcinoma for mRNA vaccine development. Mol Cancer. 2021;20(1):50. https://doi.org/10.1186/s12943-021-01342-6.
Bergh J, Norberg T, Sjögren S, et al. Complete sequencing of the p53 gene provides prognostic information in breast cancer patients, particularly in relation to adjuvant systemic therapy and radiotherapy. Nat Med. 1995;1(10):1029–34. https://doi.org/10.1038/nm1095-1029.
Iwaya K, Tsuda H, Hiraide H, et al. Nuclear p53 immunoreaction associated with poor prognosis of breast cancer. Jpn J Cancer Res. 1991;82(7):835–40. https://doi.org/10.1111/j.1349-7006.1991.tb02710.x.
Mosele F, Stefanovska B, Lusque A, et al. Outcome and molecular landscape of patients with PIK3CA-mutated metastatic breast cancer. Ann Oncol. 2020;31(3):377–86. https://doi.org/10.1016/j.annonc.2019.11.006.
Zhu SY, Yu KD. Breast cancer vaccines: disappointing or promising? Front Immunol. 2022;13: 828386. https://doi.org/10.3389/fimmu.2022.828386.
Thorsson V, Gibbs DL, Brown SD, et al. The immune landscape of cancer. Immunity. 2018;48(4):812-30.e14. https://doi.org/10.1016/j.immuni.2018.03.023.
Xu H, Zheng X, Zhang S, et al. Tumor antigens and immune subtypes guided mRNA vaccine development for kidney renal clear cell carcinoma. Mol Cancer. 2021;20(1):159. https://doi.org/10.1186/s12943-021-01465-w.
Huang X, Zhang G, Tang T, et al. Identification of tumor antigens and immune subtypes of pancreatic adenocarcinoma for mRNA vaccine development. Mol Cancer. 2021;20(1):44. https://doi.org/10.1186/s12943-021-01310-0.
We thank those researchers who shared their datasets, online tools, and R packages used in this study.
The authors have declared that no competing interests exist.
File S1. The immune cell gene set. File S2. List of 1203 immune-related genes. File S3. List of differentially expressed genes. File S4. List of mutated genes. File S5. List of over-expressed and frequently mutated tumor-associated genes. File S6. The gene components of distinct gene modules.
The clinical characteristics of cohorts used in the present study. Fig. S1. The 3D scatter plots showing the gene expression distribution PCA before (a) and after (b) batch effect correction. Fig. S2. Kaplan‒Meier curves showing the other 6 favorable prognostic genes, i.e., (a) CXCL9, (b) SKAP1, (c) SERPINA1, (d) ZMYN10, (e) RSPH1, and (f) TMEM229B. Table. S1. The parameters of 4 potential tumor antigens.
Li, L., He, L. & Zhu, Y. Dissecting tumor antigens and immune subtypes for mRNA vaccine development in breast cancer. J Big Data 10, 149 (2023). https://doi.org/10.1186/s40537-023-00826-0