NCI Hub - Group: Physical Sciences in Oncology ~ Wiki: PSON0008 Proteomic Characterization of Cell Lines

Study Title Proteomic Characterization of Cell Lines

This page: https://nciphub.org/groups/nci_physci/wiki/PSON0008

Download the dataset at ftp://caftpd.nci.nih.gov/psondcc/PhysicalCharacterization/Proteomics
The project’s entry at Synapse.org https://www.synapse.org/#!Synapse:syn9697791

Study Contact Parag Mallick; email: paragm@stanford.edu

OVERVIEW

The Janmey lab at the University of Pennsylvania grew each of nine cell lines from the Physical Sciences-Oncology Network Bioresource Core Facility (PBCF; https://physics.cancer.gov/bioresources/) on each of seven growth matrices, for a total of 63 different samples.
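As a minimal sketch of this design, the 63 samples can be enumerated as the cross product of cell lines and growth matrices. The labels below are placeholders (assumptions for illustration only); this overview does not list the actual cell-line or matrix names.

```python
from itertools import product

# Placeholder labels: the real cell-line and matrix names are not given here.
cell_lines = [f"cell_line_{i}" for i in range(1, 10)]  # 9 cell lines
matrices = [f"matrix_{j}" for j in range(1, 8)]        # 7 growth matrices

# Every cell line is grown on every matrix, so each pair is one sample.
conditions = list(product(cell_lines, matrices))

print(len(conditions))  # 63
```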

Approximately nine million cells were initially plated (across 6–9 plates) for each condition. Cells were grown for 24 hours, then isolated, pelleted, flash frozen, and transferred to Stanford for proteomics analysis. Proteins were extracted from the pellets, and a small portion of each extract was precipitated and used for QC analysis: protein amount was estimated by MicroBCA, and a Coomassie-stained gel was run to verify that there was no significant contamination or degradation. Next, allocations for each of the downstream assays were determined and the protein was aliquoted. Each of these aliquots was precipitated and then directed to its subsequent assay (iBAQ, TMT and Phospho), as detailed below.

Figure 1. Growth and transfer of cell line extracts

iBAQ OVERVIEW

The goal of iBAQ analysis was twofold. First, we wanted to verify that each sample was of sufficient quality to support broader proteomics analysis. Second, we wanted to generate data that might be used for either spectral count or iBAQ analysis to support absolute quantification.

Sample Preparation and Mass Spectrometric Analysis Workflow

Our overall iBAQ process begins with the precipitated protein generated as described above. This precipitate was re-suspended and then split into three aliquots. To facilitate iBAQ analysis, UPS2, a standard protein mixture (Sigma), was spiked into one of the aliquots. We denote this aliquot plusUPS2. Because downstream analysis of these aliquots was performed independently, they serve as sample preparation triplicates, which we refer to as preplicates. As shown in Figure 2, each preplicate was processed using our iBAQ protocol to generate tryptic peptides. Duplicate injections of these peptides were analyzed by a Thermo LTQ Orbitrap Velos, generating a family of approximately 6 raw files per condition (3 preplicates × 2 injections).
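To illustrate how a spiked-in standard can support absolute quantification, here is a minimal sketch under stated assumptions. It uses the commonly cited iBAQ definition (summed peptide intensity divided by the number of theoretically observable peptides) and a log-log linear fit of known standard amounts against iBAQ values. The function names and all numbers are invented for illustration; this is not necessarily the calculation used for this dataset.

```python
import math

def ibaq(peptide_intensities, n_observable_peptides):
    """Summed peptide intensity divided by the number of observable peptides."""
    if n_observable_peptides <= 0:
        raise ValueError("n_observable_peptides must be positive")
    return sum(peptide_intensities) / n_observable_peptides

def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def calibrate(standards):
    """Fit log10(known amount) against log10(iBAQ) for spiked-in standards.

    standards: list of (known_amount, ibaq_value) pairs.
    """
    xs = [math.log10(value) for _, value in standards]
    ys = [math.log10(amount) for amount, _ in standards]
    return fit_line(xs, ys)

def estimate_amount(ibaq_value, slope, intercept):
    """Map an iBAQ value to an estimated amount using the fitted line."""
    return 10 ** (slope * math.log10(ibaq_value) + intercept)

# Invented, perfectly linear example values (not real UPS2 measurements).
standards = [(50000.0, 1e9), (5000.0, 1e8), (500.0, 1e7), (50.0, 1e6)]
slope, intercept = calibrate(standards)
estimate = estimate_amount(ibaq([6e7, 4e7], 1), slope, intercept)
```

With real data, the fit would be made on the standard proteins detected in the plusUPS2 aliquot and then applied to the endogenous proteins.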

Figure 2. iBAQ Sample Preparation Overview

iBAQ ANALYSIS

Raw instrument files (*.raw) are converted to an intermediate format (mzXML) using ProteoWizard msConvert (http://proteowizard.sourceforge.net). Next, the files are uploaded to LabKey Server (http://www.labkey.org). LabKey Server uses a typical Trans-Proteomic Pipeline (TPP) process to match MS data to peptides and proteins from the Human2015.fasta FASTA database. Results are stored as open-format pepXML and protXML files. We have additionally used LabKey Server to convert these results into a simple .tsv file.
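The conversion step can be sketched as follows. This assumes the ProteoWizard command-line tool is installed and available as msconvert, and uses its --mzXML and -o (output directory) options. The file and directory names are hypothetical, and the exact options used for this dataset are not stated on this page.

```python
def msconvert_command(raw_file, out_dir):
    """Build an msconvert call that writes mzXML output into out_dir."""
    return ["msconvert", str(raw_file), "--mzXML", "-o", str(out_dir)]

# Hypothetical file names, for illustration only.
raw_files = ["sample_prep1_inj1.raw", "sample_prep1_inj2.raw"]
commands = [msconvert_command(f, "mzxml_out") for f in raw_files]

for cmd in commands:
    print(" ".join(cmd))
    # To actually run the conversion (requires ProteoWizard on the PATH):
    # import subprocess; subprocess.run(cmd, check=True)
```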

Figure 3. iBAQ Data Analysis Overview

PHOSPHO OVERVIEW

Our overall phosphopeptide process begins with the precipitated protein generated as described above. This precipitate was re-suspended and then split into aliquots. Between zero and three aliquots were generated, depending upon sample amount. As above, downstream analysis of these aliquots was performed independently. Consequently, they serve as sample preparation replicates (when possible), which we refer to as preplicates. Each preplicate was processed using our phospho-enrichment protocol to generate an enriched pool of phosphorylated tryptic peptides. Briefly, between 250 and 750 µg of protein was digested with trypsin into peptides. These peptides were then run through a TiO2 column to enrich for phosphopeptides and then a graphite column to clean and desalt the peptides. Enriched phosphopeptides were analyzed by a Thermo Orbitrap Fusion to generate a family of up to 3 preplicate raw files per condition.

Figure 4. Phospho Sample Preparation Overview

PHOSPHO ANALYSIS

Phospho data were analyzed with Proteome Discoverer software (http://www.thermoscientific.com/en/product/proteome-discoverer-software.html) to match MS data to peptides and proteins from the Uniprot_HUM_031914_cRAP.fasta FASTA file. Proteome Discoverer produces an .msf file, which is converted to an easily readable tsv file. The .msf (Magellan Storage File) files are SQLite files generated by Proteome Discoverer. Like pep.xml and prot.xml, the .msf format captures all the relevant output information from the Proteome Discoverer search results.
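Because .msf files are SQLite files, they can be inspected with any SQLite client. The sketch below uses Python's standard sqlite3 module to list the tables in such a file, opened read-only. The demo builds a small stand-in database with a made-up table name; the real .msf schema is not described on this page and is not assumed here.

```python
import sqlite3
import tempfile
from pathlib import Path

def list_tables(sqlite_path):
    """Return the table names in an SQLite file, opened read-only."""
    uri = Path(sqlite_path).resolve().as_uri() + "?mode=ro"
    con = sqlite3.connect(uri, uri=True)
    try:
        rows = con.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        con.close()
    return [name for (name,) in rows]

# Demo on a stand-in file; "demo_peptides" is a made-up table name.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo.msf"
    con = sqlite3.connect(demo_path)
    con.execute("CREATE TABLE demo_peptides (sequence TEXT)")
    con.commit()
    con.close()
    demo_tables = list_tables(demo_path)

print(demo_tables)  # ['demo_peptides']
```

Calling list_tables on a downloaded .msf file would be the first step toward finding which tables hold the search results.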

Figure 5. Phospho Analysis Overview

TMT OVERVIEW

The goal of Tandem-Mass-Tag (TMT) analysis is to provide deep relative quantification across conditions. Unlike the phospho and iBAQ studies, where there are separate RAW files for each condition, each TMT RAW file spans multiple substrates.

TMT Sample Preparation and Mass Spectrometric Analysis Workflow

Our overall TMT workflow begins with the precipitated protein generated as described above. The precipitate for each condition was re-suspended and then split into 3 preplicate aliquots. For each preplicate, the samples from the 7 substrate conditions were digested into peptides and then each was labeled with a different TMT reagent. For example, in the figure below, substrate S1 is labeled with TMT-127, substrate S2 is labeled with TMT-128, etc.

Next, the 7 vials of labeled peptides were combined with a reference lysate (a pool of 4 cell lines) and two internal replicates to make a TMT-10-plex mixture. To achieve greater depth, the TMT-10-plex was fractionated by high-pH reverse phase fractionation into 8 reverse-phase fractions (RP fractions). For cost and time reasons, these 8 fractions were subsequently re-combined into 3 pooled samples as follows: RP fractions 1, 4 and 7 were pooled to make the first sample; RP fractions 2, 5 and 8 were pooled to make the second sample; RP fractions 3 and 6 were pooled to make the third sample. These samples were analyzed by a Thermo Orbitrap Fusion to generate three raw files per cell line preplicate.
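The recombination scheme above amounts to assigning every third RP fraction to the same pool. A minimal sketch (the function name is an assumption for illustration):

```python
def pool_for_fraction(rp_fraction, n_pools=3):
    """Map a 1-based RP fraction number to a 1-based pooled-sample number."""
    return (rp_fraction - 1) % n_pools + 1

# Group the 8 RP fractions by the pooled sample they end up in.
pools = {}
for fraction in range(1, 9):
    pools.setdefault(pool_for_fraction(fraction), []).append(fraction)

print(pools)  # {1: [1, 4, 7], 2: [2, 5, 8], 3: [3, 6]}
```

Pooling fractions that are spaced apart in the gradient, rather than adjacent ones, is a common way to keep each pooled sample chemically diverse.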

Figure 6. TMT Sample Preparation Overview

TMT ANALYSIS

TMT data were analyzed with Proteome Discoverer software (http://www.thermoscientific.com/en/product/proteome-discoverer-software.html) to match MS data to peptides and proteins from the Uniprot_HUM_031914_cRAP.fasta FASTA database. Proteome Discoverer produces an .msf file, which is converted to an easily readable tsv file. The .msf (Magellan Storage File) files are SQLite files generated by Proteome Discoverer. Like pep.xml and prot.xml, the .msf format captures all the relevant output information from the Proteome Discoverer search results.
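One common way to use a reference channel for relative quantification is to express each channel's reporter intensity as a log2 ratio to the reference. The sketch below illustrates that idea only; the channel keys and intensity values are hypothetical, and this page does not state how ratios were computed for this dataset.

```python
import math

def log2_ratios_to_reference(intensities, reference_channel="reference"):
    """Return log2(channel / reference) for every non-reference channel.

    Channels with non-positive intensity are skipped.
    """
    ref = intensities[reference_channel]
    if ref <= 0:
        raise ValueError("reference intensity must be positive")
    return {
        channel: math.log2(value / ref)
        for channel, value in intensities.items()
        if channel != reference_channel and value > 0
    }

# Hypothetical reporter intensities for one protein.
example = {"S1": 200.0, "S2": 50.0, "reference": 100.0}
ratios = log2_ratios_to_reference(example)
print(ratios)  # {'S1': 1.0, 'S2': -1.0}
```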

Figure 7. TMT Analysis Overview

OTHER REFERENCES

Proteome Discoverer User Guide https://tools.thermofisher.com/content/sfs/manuals/Man-XCALI-97506-Proteome-Discoverer-14-User-ManXCALI97506-A-EN.pdf

Quantification with Proteome Discoverer https://tools.thermofisher.com/content/sfs/manuals/Quantification-with-Proteome-Discoverer-1-2.pdf

Descriptions of the Proteome Discoverer result columns for phospho samples can be found at: https://sites.psu.edu/msproteomics/tag/psm/

Data usage policy

The data contained within the PS-ON DCC is based on several research projects and is intended to be rapidly and constantly updated for the research community to access and use. The NCI requests that any data users:

Inform the data submitters about the intention to submit a publication that uses PS-ON DCC data.

Include the following statement in any publications resulting from the use of PS-ON DCC data: "Data used in this publication were generated by projects sponsored by the NCI Physical Sciences in Oncology Initiative."

Created on 29 Jun 2017, Last modified on 01 Oct 2017