Global Correlation and Overlaps of Protein and Transcript Level Expression
Abundance of our three key proteins INPP4B, CDK1, and ERBB2 across tumors of different ER status, tumor grade, and HER2 status correlated well with their respective transcript levels. However, when looking at all differentially expressed proteins and transcripts in our dataset, the overlap and correlation of fold changes were modest. Although it has been shown that protein levels are chiefly determined by transcript levels, particularly in steady state (Schwanhäusser et al., 2011), and that fold changes of transcript and protein levels between different human cell lines can show correlations as high as R = 0.63 (Lundberg et al., 2010), our comparisons of transcript and protein data suggest that this correlation is relatively low in human breast cancer tissues (R = 0.29). In general, the limited correlation between protein and transcript levels provides a substantial reason to focus on the analysis of proteins instead of transcripts, as proteins represent the true molecular effectors in cells.
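As an illustration of the kind of comparison described above, the following minimal sketch correlates protein and transcript log2 fold changes over the genes quantified at both levels. It assumes a Pearson coefficient and a simple intersection on gene identifiers; the function names, the placeholder gene labels, and the toy values are hypothetical and are not taken from this study's data or analysis scripts.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def fold_change_correlation(protein_log2fc, transcript_log2fc):
    """Correlate log2 fold changes for genes present in both dictionaries.

    Both arguments map a gene identifier to a log2 fold change for the same
    group comparison. Returns (r, shared_genes).
    """
    shared = sorted(set(protein_log2fc) & set(transcript_log2fc))
    r = pearson_r(
        [protein_log2fc[g] for g in shared],
        [transcript_log2fc[g] for g in shared],
    )
    return r, shared


if __name__ == "__main__":
    # Toy values for illustration only (hypothetical, not measured data).
    protein = {"GENE_A": 1.2, "GENE_B": -0.4, "GENE_C": 0.1, "GENE_D": 0.9}
    transcript = {"GENE_A": 0.8, "GENE_B": 0.3, "GENE_C": -0.2, "GENE_E": 1.5}
    r, genes = fold_change_correlation(protein, transcript)
    print(f"{len(genes)} shared genes, R = {r:.2f}")
```

Restricting the calculation to shared identifiers matters here, because genes detected at only one level would otherwise have to be imputed and could distort the coefficient. A rank-based coefficient could be substituted if the fold changes contain strong outliers.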
This study explored and confirmed the suitability of SWATH-MS for proteotyping of human tumor samples at relatively high throughput. Although larger patient cohorts are needed for validation of the classifier, our results indicate that proteotype-based classification resolves more breast cancer subtypes than are apparent from conventional subtyping and potentially improves current classification. Furthermore, the potential of data-independent approaches, such as SWATH-MS, for tissue classification is not limited to breast cancer but extends to other diseases and clinical specimens. Although we are not yet at a point to make clinical decisions based on proteotype data, our study may motivate further research that in turn may result in more adequate treatment and better clinical outcomes. The breast cancer SWATH assay library and the high-quality proteomics dataset of 96 breast tumors will provide a valuable resource for future protein marker studies.
Detailed methods are provided in the online version of this paper and include the following:
● KEY RESOURCES TABLE
● LEAD CONTACT AND MATERIALS AVAILABILITY
● EXPERIMENTAL MODEL AND SUBJECT DETAILS
  ○ Study design
  ○ Clinical tissue samples
● METHOD DETAILS
  ○ Tissue quality control via RNA integrity measurement
  ○ Proteomics sample preparation
  ○ LC-MS analyses for spectral library generation
  ○ LC-MS analyses in SWATH-MS mode
  ○ TP53 sequencing
  ○ ERBB2 immunohistochemistry
  ○ Validation of SWATH-MS quantitation through selected reaction monitoring
● QUANTIFICATION AND STATISTICAL ANALYSIS
  ○ SWATH-MS assay library generation
  ○ SWATH-MS data processing in OpenSWATH
  ○ Statistical analysis
  ○ Relative quantification with MSstats and differential protein expression analysis between subtypes and related clinical-pathological variables
  ○ KEGG pathway analysis
  ○ Gene set enrichment analysis
  ○ Correlation analysis of breast cancer tissue proteomes
  ○ Construction of the decision tree
  ○ Analysis of ERBB2 gene expression in the same sample set
  ○ Analysis of gene expression in independent microarray and RNA-Seq sets of samples
  ○ Analysis of patient survival
  ○ Statistical analysis of the IHC data
  ○ Correlation analysis of SWATH-MS and SRM quantita-
● DATA AND CODE AVAILABILITY
Supplemental Information can be found online at https://doi.org/10.1016/j.
We thank the women who provided their tissue for this research and all of the clinically related staff involved in their treatment. We are grateful to Dr. Hannes Röst and Dr. George Rosenberger for their help with SWATH-MS data analysis, to Dr. Ben Collins for the script to generate Data S3C, to Anna Pacinkova for preparing Figure 4D based upon data generated by the TCGA Research Network (http://www.cancer.gov/about-nci/organization/ccg/research/structural-genomics/tcga?redirect=true), to Anna Najbrtova for data processing in Data S5A–S5C, to Dr. Josef Planeta for nano-LC column preparation, and to Dr. Philip J. Coates for critical reading of the manuscript. This work was supported by the Czech Science Foundation (project no. 17-05957S). Publication fee was covered by Grant Agency of Masaryk University. J.F. was supported by MEYS - NPS I - LO1413, R.N. by MH CZ - DRO (MMCI; 00209805), and E.B. by the CETOCOEN PLUS project and the RECETOX Research Infrastructure (LM2015051). R.A. was supported by the Swiss National Science Foundation (SNSF) (31003A_166435) and European Research Council grant 20140AdG 670821.