'''Study Title''' Proteomic Characterization of Cell Lines

This page: [https://nciphub.org/groups/nci_physci/wiki/PSON0008]

'''Study Contact''' Parag Mallick; email: paragm@stanford.edu

'''Overview'''

The Janmey lab at the University of Pennsylvania grew each of the 9 cell lines on each of the 7 growth matrices, for a total of 63 different conditions. Approximately 9 million cells were initially plated (across 6-9 plates) for each condition and grown for 24 hours. The cells were then isolated, pelleted, flash frozen, and transferred to Stanford for proteomics analysis.

Protein was extracted from the pellets. A small portion of the extract was precipitated and used for QC analysis: protein amount was estimated by !MicroBCA, and a Coomassie-stained gel was run to verify that there was no significant contamination or degradation. Allocations for each of the downstream assays were then determined and the protein was aliquoted. Each aliquot was precipitated and then directed to its assay (iBAQ, !Phospho, TMT), as detailed below.

'''iBAQ Analysis Description'''

The goal of iBAQ analysis was twofold. First, we wanted to verify that the sample was of sufficient quality to support broader proteomics analysis. Second, we wanted to generate data that might be used for either spectral count or iBAQ analysis to support absolute quantification.

'''Sample Preparation and Mass Spectrometric Analysis Workflow'''

Our overall iBAQ process begins with the precipitated protein generated as described above. This precipitate was re-suspended and then split into three aliquots. To facilitate iBAQ analysis, UPS2, a standard protein mixture (Sigma), was spiked into one of the aliquots; we denote this aliquot plusUPS2. Because downstream analysis of these aliquots was performed independently, they serve as sample preparation triplicates, which we refer to as preplicates.
As shown on Page 3, each preplicate was processed using our iBAQ protocol to generate tryptic peptides. Duplicate injections of these peptides were analyzed on a Thermo LTQ Orbitrap Velos, generating a family of approximately 6 .RAW files per condition (3 preplicates, 2 injections each).

'''iBAQ Computational Analysis Workflow'''

Raw instrument files (*.raw) are converted to an intermediary file (mzXML format) using !ProteoWizard msConvert [http://proteowizard.sourceforge.net]. Next, the files are uploaded to Labkey Server (http://www.labkey.org), which uses a typical Trans-Proteomic Pipeline process to match MS data to peptides and proteins from the Human2015.fasta FASTA database. Results are stored as open-format pepXML and protXML files. We have additionally used Labkey Server to convert these results into a simple .tsv file. Brief descriptions of the pepXML and protXML files are given below.

NOTE: !PepXML and !ProtXML files are only generated for the iBAQ analysis workflow.

'''Phospho Sample Preparation and Mass Spectrometric Analysis Workflow'''

Our overall phosphopeptide process begins with the precipitated protein generated as described above. This precipitate was re-suspended and then split into between 0 and 3 aliquots, depending upon sample amount. As above, downstream analysis of these aliquots was performed independently, so they serve (when possible) as sample preparation replicates, which we refer to as preplicates.

Each preplicate was processed using our phospho-enrichment protocol to generate an enriched pool of phosphorylated tryptic peptides. Briefly, between 250 and 750 µg of protein was digested with trypsin into peptides. These peptides were then run through a !TiO2 column to enrich for phosphopeptides and then through a graphite column to clean and desalt them. Enriched phosphopeptides were analyzed on a Thermo Orbitrap Fusion to generate a family of up to 3 preplicate raw files per condition.
'''Phospho Computational Analysis Workflow'''

Phospho data was analyzed with Proteome Discoverer software [http://www.thermoscientific.com/en/product/proteome-discoverer-software.html] to match MS data to peptides and proteins from the Uniprot_HUM_031914_cRAP.fasta FASTA file. Proteome Discoverer produces an .msf file, which is converted to an easily readable .tsv file. Files with the .msf (Magellan Storage File) extension are SQLite files generated by Proteome Discoverer. Like pep.xml and prot.xml, the .msf format captures all the relevant output information from the Proteome Discoverer search results.

'''TMT Analysis Description'''

The goal of Tandem-Mass-Tag (TMT) analysis is to provide deep relative quantification across conditions. Unlike the phospho and iBAQ studies, where there are RAW files for each condition, !TMT data spans multiple substrates.

'''TMT Sample Preparation and Mass Spectrometric Analysis Workflow'''

Our overall TMT workflow begins with the precipitated protein generated as described above. The precipitate for each condition was re-suspended and then split into 3 preplicate aliquots. For each preplicate, the 7 substrate conditions are digested into peptides and then labeled with a TMT reagent. For example, in the figure, substrate S1 is labeled with TMT-127, substrate S2 is labeled with TMT-128, etc. Next, the 7 vials of labeled peptides are combined with a reference lysate (a pool of 4 cell lines) and two internal replicates to make a TMT-10-plex mixture.

To achieve greater depth, this 10-plex mixture is fractionated by high-pH reverse-phase fractionation into 8 reverse-phase fractions (RP fractions). For cost and time reasons, these 8 fractions are subsequently re-combined into 3 fractions.
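The pooling described in the next paragraph assigns the 8 RP fractions to the 3 pooled samples in round-robin order. A minimal sketch of that assignment, assuming fractions are numbered 1 through 8 and pools 1 through 3 (the function name and its parameters are hypothetical, chosen only for illustration):

```python
def pool_rp_fractions(n_fractions=8, n_pools=3):
    """Assign RP fractions to pools in round-robin order.

    Fraction n goes to pool ((n - 1) mod n_pools) + 1, so with the
    defaults pool 1 receives fractions 1, 4, 7; pool 2 receives 2, 5, 8;
    and pool 3 receives 3, 6.
    """
    pools = {pool: [] for pool in range(1, n_pools + 1)}
    for fraction in range(1, n_fractions + 1):
        pools[(fraction - 1) % n_pools + 1].append(fraction)
    return pools
```

Spacing the members of each pool three fractions apart means every pooled sample draws from across the fractionation range rather than from neighboring fractions.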
Recombination was done as follows: RP fractions 1, 4, and 7 were pooled to make the first pooled sample; RP fractions 2, 5, and 8 were pooled to make the second; and RP fractions 3 and 6 were pooled to make the third. These samples were analyzed on a Thermo Orbitrap Fusion to generate three raw files per cell line preplicate.

'''TMT Computational Analysis Workflow'''

TMT data was analyzed with Proteome Discoverer software [http://www.thermoscientific.com/en/product/proteome-discoverer-software.html] to match MS data to peptides and proteins from the Uniprot_HUM_031914_cRAP.fasta FASTA database. Proteome Discoverer produces an .msf file, which is converted to an easily readable .tsv file. Files with the .msf (Magellan Storage File) extension are SQLite files generated by Proteome Discoverer. Like pep.xml and prot.xml, the .msf format captures all the relevant output information from the Proteome Discoverer search results.

'''OTHER REFERENCES'''

Proteome Discoverer User Guide [https://tools.thermofisher.com/content/sfs/manuals/Man-XCALI-97506-Proteome-Discoverer-14-User-ManXCALI97506-A-EN.pdf]

Quantification with Proteome Discoverer [https://tools.thermofisher.com/content/sfs/manuals/Quantification-with-Proteome-Discoverer-1-2.pdf]

Descriptions of the Proteome Discoverer result columns for phospho samples can be found at: [https://sites.psu.edu/msproteomics/tag/psm/]
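'''Example: Listing the Tables in an .msf File'''

Because the .msf files described above are SQLite files, it should be possible to inspect them with any SQLite client. The sketch below uses Python's standard `sqlite3` module to list the tables in a file. It makes no assumption about the Proteome Discoverer schema itself; the function name is hypothetical, and the table names returned will depend on the Proteome Discoverer version that wrote the file.

```python
import sqlite3
from pathlib import Path


def list_msf_tables(msf_path):
    """Return the sorted names of all tables in an .msf (SQLite) file."""
    path = Path(msf_path).resolve()
    if not path.is_file():
        raise FileNotFoundError(path)
    # Open read-only so the search results cannot be modified by accident.
    conn = sqlite3.connect(path.as_uri() + "?mode=ro", uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

For example, `list_msf_tables("example.msf")` (a placeholder file name) returns a list of table names that can then be explored with ordinary `SELECT` queries.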