This video covers de novo sequencing and database searching, including automated summary reporting, de novo only findings, and the additional features of homology searching, multi-engine reporting, and quantification.
In a typical proteomics mass spectrometry analysis, a researcher needs to identify all peptides that produce good quality MS/MS spectra, whether or not the peptides are in a protein sequence database.
PEAKS is designed to facilitate such analyses. To enable accurate and sensitive peptide identifications, PEAKS provides an integrated tool set that features:
- de novo sequencing: to identify novel peptides
- database search: to identify database peptides
- SPIDER: to find peptides with PTMs and mutations
- inChorus: to increase coverage by combining multiple database search engines
- PEAKS Q module for protein quantification (optional)
In this short video, let’s focus on a typical result that has been prepared with the newly improved PEAKS DB workflow that combines de novo sequencing and database search to increase accuracy and sensitivity.
PEAKS DB & De Novo
The result consists of a result summary, the identified proteins, the identified database peptides, and a list of peptides identified exclusively by de novo sequencing. In the first tab of the results, PEAKS summarizes the results in an easy to read summary view. The summary view is the central place for result filtration and validation. The specification of a filter is as simple as clicking this FDR button, selecting the false discovery rate on this FDR curve, and clicking the apply filter button. The result statistics are then updated dynamically.
First, this bar graph shows the score distribution of the target and decoy hits. PEAKS uses an enhanced target decoy method to estimate the false discovery rate. Here we observe that there are very few decoy hits above the score threshold, indicating highly accurate results.
Second, this scatter-plot compares the precursor mass error with the peptide score. If you are using a high-resolution instrument, you should see that the error is small for the high-scoring peptides and starts to scatter for peptides below the score threshold.
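As a rough illustration of what the scatter plot's error axis represents, the precursor mass error is usually expressed in parts per million (ppm) relative to the theoretical value. The sketch below assumes that convention; the function name and the example m/z values are hypothetical and are not taken from PEAKS.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error of an observed m/z relative to the theoretical m/z, in ppm."""
    if theoretical_mz <= 0:
        raise ValueError("theoretical m/z must be positive")
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


# Hypothetical example: an observed precursor 0.005 m/z units above a
# theoretical value of 500.0 is off by about 10 ppm.
example_error = ppm_error(500.005, 500.0)
```

On a high-resolution instrument, confident matches would be expected to cluster near zero on this scale.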
To see the proteins inferred from the identified peptides, we simply select the protein tab located on the left. Take a look at the proteins discovered, along with the peptides associated with each protein and each protein’s sequence coverage.
To take a closer look at the peptides confidently identified from the sequence database, we select the peptide tab from the left navigation. Reported here are each peptide’s -10lgP score, annotated spectrum match, ion table, and error map.
Detailed Examination and Navigation
PEAKS also provides very user-friendly ways for you to examine an individual peptide that has been identified. Let’s say I want to check a peptide that contains an oxidized methionine, which has a mass shift of 15.99 daltons. Go to the peptide view, sort by the PTM type, and those peptides with the oxidized methionine will appear at the top of the list. Select an interesting peptide and the spectrum annotation will appear at the bottom. You can conveniently zoom and navigate the spectrum by using your mouse wheel, or, you may focus on a particular area by dragging the mouse and selecting a smaller area.
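To make the 15.99 dalton shift concrete, here is a minimal sketch that computes a peptide's monoisotopic mass with and without methionine oxidation. The residue masses are standard monoisotopic values; the function name and the example peptide are hypothetical, and this is not how PEAKS itself is implemented.

```python
# Standard monoisotopic residue masses in daltons.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565      # added once per peptide for the free termini
OXIDATION = 15.994915  # one oxygen atom added to methionine


def peptide_mass(sequence, oxidized_positions=()):
    """Neutral monoisotopic mass; oxidized_positions are 1-based indices of M."""
    mass = sum(MONO[aa] for aa in sequence) + WATER
    for pos in oxidized_positions:
        if sequence[pos - 1] != "M":
            raise ValueError(f"position {pos} is not a methionine")
        mass += OXIDATION
    return mass


# Hypothetical peptide: the oxidized form is heavier by about 15.99 Da.
shift = peptide_mass("AMK", [2]) - peptide_mass("AMK")
```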
De Novo Only
The final tab labelled “De Novo Only” displays the results that are identified exclusively by de novo sequencing. Since these spectra do not match any significant peptide hits in the database, they are particularly interesting novel peptides that no other software can find.
As mentioned before, PEAKS Studio is an integrated proteomics toolset. In addition to the renowned de novo sequencing algorithm and the newly improved PEAKS DB module, PEAKS Studio also includes:
- SPIDER – A homology search tool that is designed to match your de novo results directly to the database. This allows you to identify peptides even when you are working with an unsequenced organism or highly variable proteins.
- inChorus – A tool that combines the results of multiple database search engines to increase coverage, while comparing the engines’ performance side by side. inChorus is able to simultaneously run and compare MASCOT, X!Tandem, Sequest, OMSSA, and of course PEAKS.
- PEAKS Q (optional) – A quantification tool that quantifies proteins using all common MS and MS/MS labelling methods, including iTRAQ, ICAT, SILAC, label-free, and other user-defined labels.
PEAKS Identification Walkthrough
A quick walkthrough of how to set up and interpret a PEAKS Studio identification run. It shows how to import FASTA databases from several different public sources, then gives descriptions and tips for all identification parameters. You’ll also learn how to interpret PEAKS DB, PEAKS PTM, SPIDER, and de novo sequencing results.
Welcome, this video will give a quick overview of PEAKS Studio Software. PEAKS provides you with an intuitive way to identify and quantify proteins using tandem mass spectrometry data. It combines clever algorithms that accurately identify and quantify proteins with an easy to use interface. When it comes to protein identification PEAKS has 4 main algorithms: de novo sequencing, PEAKS DB, PEAKS PTM, and SPIDER. In this video I will show you how to run these 4 algorithms from one search page, to quickly identify and filter protein identifications.
PEAKS can be used without a sequence database to perform de novo sequencing, but it works best when configured with a good sequence database. So first, I’ll show how to easily configure a database. We have a configuration wizard that will guide you through the steps necessary to load a public database. To access it, click window and then config wizard. The second page will list all the instrument vendors that PEAKS supports. Click the vendors that apply to you. The instructions for how to load that vendor’s raw data automatically will be given. Click next to continue to database configuration. Four public databases are listed here. Once you’ve selected your databases, click next and the databases will begin downloading automatically. Once the download is complete, click ‘install’, then finish. The databases will then be ready for you when you set up a database search. If you already have a database you would like to search, click this configuration button, and then select the databases tab to configure your database. Use the FASTA format drop down to select the database format. If none of the options match your format, the MSDB option works well with most databases. Click the validate database button and PEAKS will scan the database to see if it can read it.
You’re now ready to get started with your first identification run. To load data, go to the top left hand corner and select the new project button. This will bring up a page where you can add data files by clicking the add data button. One of the best ways to use PEAKS is to add multiple digests to increase coverage. In this case we’ll add a Glu C and Trypsin digest of the same sample. Add each file to a new sample with this button. Then specify the enzyme, instrument, and fragmentation type for each data file. Doing this helps PEAKS suggest some appropriate parameters for your instrument. Also, PEAKS uses machine learning based on your instrument and fragmentation settings to determine the most likely ions that appear in the spectra. This makes PEAKS more accurate and sensitive.
From here, you have two options. First, you can continue through the setup wizard by clicking data refinement. This will take you through PEAKS DB and quantification set up as well. By doing this you can set up all your parameters at once, leave the computer, and come back to your results. Or, you can click the ‘finish’ button to begin loading your data. In this example I will click the finish button because there are some important points I’d like to show you in the project tree.
Once your data is loaded, up in the top left corner you will see the project tree. Data that has successfully loaded into the project will appear with this green symbol. You can set up searches from here too. Click the project level then the PEAKS DB button to search all the data in the project together. This will give you one combined result for all data files in the project. Click a sample then the PEAKS DB button to search one individual sample. This will allow you to compare results from separate samples. Or, click individual data files then the PEAKS DB button. In this example we will select the project level because we want to combine our digests to get the highest coverage possible.
The first parameter window to come up will be data refinement. Here there are some optional parameters to select. You can merge scans with similar mass and retention times, correct precursor mass and charge states, or filter based on retention time and mass. Most of these are optional but we do recommend that you turn on precursor mass correction. Data refinement will also deconvolute and centroid the MS/MS scans and detect peptide features from the LC-MS information.
Next is the PEAKS DB parameters page. If it is your first time using PEAKS, the suggested parameters for your instrument will be given. If you are returning to PEAKS, your previous search parameters will come up by default. To select the default parameters for the instrument, select the drop down menu in the top right hand corner. First, put in your precursor mass error tolerance. Since this is Orbitrap data, we will enter 10 ppm. Next, enter the fragment mass error tolerance. We’ll select 0.5 Da in this case because the MS/MS scans were collected in the linear ion trap. Then set the enzyme rules. Since we specified the enzymes when setting up the project, we can select ‘specified by each sample’. In this example, this will ensure that trypsin rules are used for the trypsin sample and Glu C rules are used for the Glu C sample. The next enzyme options allow you to control the efficiency of the digest. Allow for an incomplete digest by letting one or both ends of the peptide disobey the enzyme rules with this drop down, or allow for missed cleavages within the peptide here. Next, set the PTMs with this button. The built-in PTMs are separated into recent, common, uncommon, and artificial lists. Using PEAKS DB, it’s best to allow up to 10 variable modifications in a search. If you are interested in looking at more PTMs, it is best to enter these into PEAKS PTM. In this case we will set carbamidomethylation as fixed, because iodoacetamide was used to alkylate the cysteines, and oxidation as a variable modification. You can also specify the maximum number of variable modifications per peptide.
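To illustrate what the enzyme rules and the missed cleavage setting mean, here is a minimal in-silico digestion sketch. It assumes simplified rules (trypsin cleaves after K or R unless the next residue is P; Glu C is modelled as cleaving after E only). The function name, its parameters, and the example sequences are hypothetical and do not reflect PEAKS internals.

```python
def digest(protein, cleave_after="KR", no_cleave_before="P", max_missed=1):
    """Return peptides from a fully specific digest with up to max_missed missed cleavages."""
    # Cleavage sites are expressed as indices where a new peptide starts.
    sites = [0]
    for i in range(len(protein) - 1):
        if protein[i] in cleave_after and protein[i + 1] not in no_cleave_before:
            sites.append(i + 1)
    sites.append(len(protein))

    peptides = []
    for start in range(len(sites) - 1):
        # Joining (missed + 1) adjacent fragments gives a peptide with `missed` missed cleavages.
        last_end = min(start + 1 + max_missed, len(sites) - 1)
        for end in range(start + 1, last_end + 1):
            peptides.append(protein[sites[start]:sites[end]])
    return peptides


# Hypothetical sequences, for illustration only.
tryptic = digest("AKRPEK", max_missed=1)
glu_c = digest("AEKEG", cleave_after="E", no_cleave_before="", max_missed=0)
```

In the first example the R followed by P is not treated as a cleavage site, which is why the peptide containing "RP" stays intact.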
Next, let’s talk about the checkboxes at the bottom, because some very important features can be enabled here. The ‘estimate FDR’ checkbox will create a decoy-fusion database. This means all proteins in the database will be shuffled and fused to target proteins. This allows PEAKS to accurately remove false positives by estimating the false discovery rate. The ‘find unspecified PTMs’ checkbox will activate PEAKS PTM. By default it will search the 485 naturally occurring modifications in the Unimod database. However, if you are only interested in a subset of those PTMs, or in custom PTMs, click the ‘advanced settings’ button and select the PTMs you’re interested in. This is a very powerful feature of PEAKS: in this screen there is no limit to the number of modifications you can search. The third checkbox will activate SPIDER. SPIDER is a homology search tailored to de novo sequencing. It will find peptides that are similar to what’s found in your database but have one amino acid difference. With that in mind, you now know all you need to know to run an identification search with PEAKS. After these parameters are set, click ok, and four different algorithms will be used to search your data: de novo sequencing, PEAKS DB, PEAKS PTM, and SPIDER.
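The decoy-fusion idea can be sketched in a few lines. This is only an illustration of "shuffle each protein and fuse it to its target", under the assumption that the decoy is a simple random shuffle appended to the target sequence. How PEAKS actually generates and joins its decoys is not described here, and the function name and seed parameter are hypothetical.

```python
import random


def decoy_fusion(target, seed=0):
    """Return the target sequence with a shuffled copy of itself appended."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    residues = list(target)
    rng.shuffle(residues)
    return target + "".join(residues)


# Hypothetical sequence, for illustration only.
fused = decoy_fusion("MKTAYIAKQR")
```

Because the decoy half has the same length and amino acid composition as the target half, matches to it give a like-for-like estimate of how often random matches occur.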
Now, let’s talk about interpreting the results. Notice how four result nodes have appeared in the top left hand corner below the project level. This indicates that all four searches are complete and your results are ready. If you click the SPIDER node, it will contain results from all four algorithms from all data sets in the project, so click here to get the most information. The first thing you’ll see is the summary view. This is where you can set your filters. We recommend clicking this FDR button first; it will bring up an interactive FDR curve. The x-axis shows the number of peptide spectrum matches sorted by -10lgP score. The y-axis gives the false discovery rate. Scroll along the graph to see the score and false discovery rate at each point. Most importantly, you can select one of the common cut-offs along the right hand side. A 1% FDR is usually considered to be acceptable. Once you do this, the score at which a 1% false discovery rate is achieved will be shown in the -10lgP score checkbox. For protein -10lgP, a threshold of 20 is accurate, and at least one or two unique peptides per protein indicates an accurately identified protein. Once you click apply filters, all results you see will be within these thresholds.
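The idea behind picking a score cut-off from the FDR curve can be sketched as follows. This uses the plain target-decoy estimate (decoy hits divided by target hits above the cut-off), which is an assumption made for illustration; it is not the enhanced decoy-fusion calculation that PEAKS performs. The function name and input layout are hypothetical.

```python
def score_cutoff_for_fdr(psms, max_fdr=0.01):
    """psms: iterable of (score, is_decoy) pairs.

    Walks down the matches from best to worst score and returns the lowest
    score at which the estimated FDR is still within max_fdr, or None if no
    cut-off qualifies. Tied scores are not treated specially in this sketch.
    """
    ranked = sorted(psms, key=lambda psm: psm[0], reverse=True)
    targets = decoys = 0
    cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets > 0 and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff
```

Each step of the loop corresponds to one point on the curve: the number of matches accepted so far on the x-axis and the estimated false discovery rate on the y-axis.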
Next I’ll show you the protein tab. This is the most important page in your identification results. From here you can see all the proteins that were identified, their coverage and description. In the coverage view you can see the details of the peptides that support the protein identification. Each blue bar represents a distinct peptide different from all others matched to the protein. Click one to see the best spectrum that identified that peptide. Notice how multiple spectra can be grouped into the same peptide hit. The x, y, or z ions are shown in red and the a, b, or c ions are shown in blue. Scroll over the peptide sequence to see which fragment ions are associated with which amino acid.
This can be very useful, especially when considering modified peptides. Modifications are shown with a unique letter and colour, for example, this deamidation. If you click on the peptide, the modified amino acid will be shown with a lower case n. Scroll over it to see that high-intensity fragment ions support this modification assignment.
This modification is considered to be confidently localized based on the fragment ions observed. If there is a pair of b or y ions showing fragmentation before and after the modification, the modification will appear above the protein sequence. Modifications without a confident localization will only appear below. We call this direct fragment ion proof. The threshold for this localization can be controlled to the right of the coverage view. Also, each modification is assigned an AScore, and this can be used as a cut off instead of fragment ion proof. The cut off we recommend for AScore is 20.
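The flanking-ion rule described above can be written down as a small check. The sketch assumes the usual numbering, where a modification at residue i of an n-residue peptide is bracketed by b(i-1) and b(i), or equivalently by y(n-i) and y(n-i+1). The function name and inputs are hypothetical, and the intensity threshold that PEAKS also applies is left out.

```python
def has_flanking_pair(length, mod_pos, b_ions, y_ions):
    """True if observed b or y ions bracket the modified residue.

    length  : number of residues in the peptide
    mod_pos : 1-based position of the modified residue
    b_ions, y_ions : collections of observed ion indices (e.g. {11, 12})
    """
    b_pair = {mod_pos - 1, mod_pos} <= set(b_ions)
    y_pair = {length - mod_pos, length - mod_pos + 1} <= set(y_ions)
    return b_pair or y_pair
```

For a modification on the first or last residue, one of the required indices is 0, which is never observed, so that ion series cannot provide the proof on its own in this sketch.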
PEAKS is great at finding single amino acid variants with SPIDER. For example here, the T in the white background represents threonine replacing alanine, the amino acid in the database sequence. If you would like to manually validate this assignment, scroll over the fragment ions that indicate the presence of a threonine; the strong y-ion signal shows that this variant is highly likely.
So far all the results we’ve seen were identified with PEAKS DB, PEAKS PTM, or SPIDER. Still, a current limitation of any protein identification search is that most LC-MS/MS data sets can only match a fraction of the MS/MS scans in a file to database peptides. So the question is, what is the source of the unidentified scans? This is where PEAKS shows its true power. Using de novo sequencing, it is able to give the most likely peptide sequence for every spectrum in the data file. The de novo only tab gives the de novo sequences for all spectra that could not be matched with PEAKS DB, PEAKS PTM, or SPIDER. You can tell if a de novo result is confident based on the colour. Amino acids above 90% confidence are displayed in red, above 80% confidence are displayed in purple, and above 60% confidence are displayed in blue. Use this button to set a confidence threshold; we suggest 80%. Then, sort by tag length to see the results with the longest string of confident amino acids. These will be your best de novo results. If they match a protein in the database with 6 amino acids in a row or more, an accession number will be given in the accession column. Click the proteins button to see where the de novo result aligns to the full protein sequence.
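The "tag length" idea, the longest run of consecutive amino acids at or above the confidence threshold, can be sketched like this. The function name, the per-residue confidence list, and the example values are hypothetical; whether PEAKS treats the threshold as inclusive is an assumption.

```python
def longest_confident_tag(sequence, confidences, threshold=80.0):
    """Return the longest run of residues whose confidence is >= threshold."""
    if len(sequence) != len(confidences):
        raise ValueError("one confidence value is needed per residue")
    best = ""
    current = ""
    for residue, confidence in zip(sequence, confidences):
        if confidence >= threshold:
            current += residue
            if len(current) > len(best):
                best = current
        else:
            current = ""  # a low-confidence residue breaks the run
    return best


# Hypothetical de novo result with per-residue confidences in percent.
tag = longest_confident_tag("PEPTIDEK", [95, 90, 50, 85, 88, 99, 81, 30])
```

Sorting results by the length of this tag puts the sequences with the longest confident stretches first.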
Now that you’ve reviewed your results, you’re ready to export them. Go back to the summary view and click the export button to share with your colleagues. HTML options are available, as well as text exports that can be opened in programs such as Excel. Third party export options are also available for uploading PEAKS results to post-processing software.
Thanks for taking the time to check out this overview. For more detailed videos about specific features of PEAKS, check out our website, www.bioinfor.com.
Webinar Video: PEAKS PTM Analysis
This is the video recording of the webinar hosted on March 16th, 2017.
Welcome to today’s PEAKS webinar. Previously my colleagues have introduced qualitative and quantitative proteome analysis in PEAKS. In addition, the study of post-translational modifications (or PTMs) via mass spectrometry also constitutes a wide field of research and applications. Today I will focus on PEAKS PTM analysis by liquid chromatography coupled to tandem mass spectrometry.
Compared to complete proteome analysis, the major goals and challenges of PTM studies are quite different. The primary task is to identify modified peptides and proteins in a purified sample or a complex mixture. Each peptide representing a modification site of interest needs to stand on its own. Since modified peptides can be of lower abundance, they are more difficult to identify from their fragmentation spectra than their non-modified forms. To overcome this problem, enrichment steps can be added to enrich for modified peptides or proteins. Secondly, the tandem mass spectrum must contain sufficient fragmentation information to localize the modified amino acids in a sequence. Various fragmentation methods can be employed to generate such informative spectra. PEAKS supports different fragmentation methods such as CID, HCD, ETD, and EThcD. Besides identification, researchers often want to measure and compare the relative abundance of a modification site of interest across different biological samples or conditions. This can be achieved via label-based or label-free quantification methods, which are supported in the PEAKS Q module. Lastly, it is also very useful to determine the site’s occupancy or stoichiometry (that is, the proportion of proteins that are modified). As a data analysis tool, PEAKS provides PTM functions that try to address these challenges and help take your PTM studies further.
Keeping these challenges in mind, here is an outline of the topics that I will cover in my talk today. First, I will show you some common practices for PTM identification and modification site determination with PEAKS database search. Secondly, I am going to introduce the PEAKS PTM function, which is specifically designed to discover more unspecified or hidden modifications by integrating PEAKS database searching and de novo sequencing. Thirdly, I will demonstrate quantification of PTMs using PEAKS PTM Profile. Lastly, I will touch a little bit on some new features that are going to be integrated in the new version of PEAKS, which we hope can make your study easier.
Ok. Let’s start with the first topic: using PEAKS database search to identify post-translational modifications. Specifically, I will use 3 examples to demonstrate how to perform PTM analysis in PEAKS database search, with an emphasis on how the software addresses the two main challenges in PTM analysis, which are accuracy and sensitivity.
Let’s start with a simple case: modification analysis of a purified monoclonal antibody protein sample. For this experiment, we used hydrogen peroxide to treat a mAb sample and analyzed the digested peptides using LC-MS/MS. The purpose of the study was to determine the sensitivity of the mAb protein to peroxide. We uploaded the data for the control and peroxide-treated samples into PEAKS, then selected trypsin as the enzyme, along with the instrument and fragmentation type that were used in this experiment.
In the database search settings window, we entered the precursor and fragment ion tolerances based on the instrument that we used, and added methionine oxidation as an additional variable modification, since we knew that hydrogen peroxide could oxidize methionine residues. We then clicked the “finish” button to let the DB search run.
These are the identified protein results. In this protein coverage view, supporting peptides, represented by these blue bars, are mapped to the sequence of the selected protein. You can click a peptide and its MS2 spectrum will pop up, where you can easily examine the annotated raw data. Here is a peptide containing a confident methionine oxidation modification. The M in the peptide sequence is lower-case, meaning that this amino acid is modified. The unmodified counterpart was also identified.
Here I put two spectra of modified and unmodified peptides together, so that you can easily see the mass difference of y3 and y4 ions between the modified vs unmodified peptide spectra.
The second case I am going to show you is a bit more complex. Here I used a published data set. Phosphopeptides were enriched using titanium-IMAC beads from a tryptic human cell line digest. Protein phosphorylation plays important roles in cell signaling systems and is a widely studied research area. Phosphopeptides, like other modified peptides, can be of lower abundance, as I mentioned before. Therefore, the authors used one of the common enrichment methods to eliminate unmodified peptides in the sample. In addition, they tested a novel fragmentation scheme combining electron-transfer and higher-energy collision dissociation (termed EThcD), and found that EThcD generated richer and more confidently identified spectra. The reference paper information is provided below. We have now added the EThcD fragmentation method to PEAKS, and it will be available to users in the new version.
In the database search settings, we added serine, threonine and tyrosine phosphorylation as variable modifications. We used the Swiss-Prot human database.
After the DB search is done, here is the peptide result view. The selected peptide contains two phosphorylated serines identified by this EThcD tandem mass spectrum. The b/y and c/z ion series generated by EThcD help to determine the location of the phosphorylation sites, as you can see from the tandem mass spectrum and the ion match table. Since there was a phosphopeptide enrichment step, unmodified peptides were largely washed away and were not identified for this sequence. Although an enrichment step can facilitate the identification of modified peptides, there are always cases where an enrichment method is not yet available and low-abundance peptides bearing PTMs are present in a complex peptide background. So how does PEAKS perform in those difficult situations? Can we still identify these PTMs with high sensitivity?
In 2012, we participated in a study hosted by the iPRG of the ABRF, which aimed to assess the ability of various software tools to handle these situations. They used a sample comprising a mixture of synthetic peptides containing known PTMs, spiked into a yeast whole cell lysate as background. This dataset is the third example that I will show you. There were two major challenges in analyzing this dataset. The first is sensitivity: without enrichment strategies, samples are dominated by unmodified peptides, so can software still identify modified peptides of low abundance? The second is localization: if there are multiple residues within a given peptide sequence that could bear a particular modification type, can software correctly identify the fragment ions needed to localize the exact site of modification? PEAKS demonstrated superior performance in terms of both the total number of correctly identified peptides and the number of correctly identified modified peptides that were spiked in. Today we can do better than that older version of PEAKS.
Here is an example showing a low-abundance peptide with an MS1 peak area of roughly 2E5. It has three potential phosphorylation sites in total. PEAKS successfully identified fragment ions that supported phosphorylation on the serine at position 3 and the tyrosine at position 7 in this peptide.
Additionally, different combinations of PTMs occurred on this peptide sequence. PEAKS also identified the other forms and correctly localized the exact amino acids where the PTMs occurred. Speaking of the localization of modification sites, which is another challenge in PTM analysis, I will briefly explain how PEAKS addresses this issue on my next slide.
The precise site of modification can be determined by the presence of site-determining fragment ions. In this example, the presence of the b11 and b12 ions in the tandem mass spectrum determines that the deamidation occurs on asparagine 12. PEAKS provides two options for users to determine confident modification sites. One is to use minimal ion intensity, which requires that the relative intensity of the fragment ions before and after the modification site in an MS/MS spectrum be higher than the number you enter here in order to call a confident PTM site. The other option is called Ascore, which calculates an ambiguity score as -10 times the log (base 10) of a p-value, where the p-value indicates the likelihood that the peptide is matched by chance. Therefore, the higher the Ascore, the better. Both methods provide measures of the confidence that can be placed in the site localization. If the threshold of the method you select is met, the PTM will be annotated in a colored box above the residue.
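The score transform mentioned here, -10 times the base-10 logarithm of a p-value, is easy to work through. The sketch below only shows that arithmetic; how PEAKS derives the p-value itself is not covered, and the function names are hypothetical.

```python
import math


def minus_10_log10(p_value):
    """Convert a p-value in (0, 1] to a -10*log10(p) style score."""
    if not 0 < p_value <= 1:
        raise ValueError("p-value must be in the interval (0, 1]")
    return -10 * math.log10(p_value)


def p_value_from_score(score):
    """Inverse transform: recover the p-value that gives this score."""
    return 10 ** (-score / 10)
```

Under this transform, a score of 20 corresponds to a p-value of 0.01, and every additional 10 points means the p-value is ten times smaller.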
To summarize, in section 1 I explained how PTM analysis is performed with PEAKS database search. However, the number of modifications specified by users in a database search is often limited. Otherwise, it takes too long for the algorithm to try all possible modifications for each peptide in the database, and the number of false positive hits increases dramatically. To solve this problem, we designed the PEAKS PTM function.
The PEAKS PTM function uses results derived from the powerful de novo sequencing algorithm and database searching. This is a typical data analysis workflow in PEAKS. First, de novo sequencing is carried out for each spectrum. Secondly, PEAKS database (or DB) searching is used to identify proteins. A few highly frequent PTMs can be specified at this step. Next, the spectra that have high-confidence de novo scores but are not assigned by database searching are mapped against the identified proteins to find additional PTMs. At this step, you can specify as many PTMs as you want. This multiple-round search approach can help maximize the identification and sensitivity of your PTM analysis.
To enable this function, you can check this “find unspecified PTMs and common mutations with PEAKS PTM” box in the database search settings window before you start data analysis. Or, if you already have a DB search result, you can select the DB node and then click this PEAKS PTM button on the top panel. A PEAKS PTM window will pop up, where you can select either all built-in modifications, which are the naturally occurring modifications found in the Unimod database, or choose all the PTMs you are interested in. As I said before, at this stage the number of PTMs is not limited. You can select or input as many as you want.
If you remember, at the beginning of my introduction to PEAKS database search, I used an example of methionine oxidation analysis of a monoclonal antibody. Here we turned on PTM searching to re-analyze the data using over 400 built-in modifications. We found that another modification, dethiomethyl, occurred on the methionine at position 252 in addition to oxidation.
Furthermore, if you zoom out to look at the data, you will see many more modifications identified by the PTM search compared to the database search as listed in the legend and color boxed in the protein coverage view results. This is because PEAKS PTM function is specifically designed to discover unspecified or hidden modifications and maximize PTM identifications.
Knowing the type of modification is the first step. You may also want to measure the abundance of the modified site and compare the quantity between different samples and conditions. For this purpose, PEAKS PTM Profile can provide you with a direct visualization and summary of the quantitative information for the PTM sites identified in your studies. Two examples will be used.
You can find this PTM Profile button on top of the protein coverage window.
By clicking this button, a PTM Profile window will pop-up, showing the percentage of summed modified and unmodified peptide feature abundance, the sequence of which contain the specific modification sites as listed on the X axis. You can choose which type of PTM you want to see. Here, this figure shows methionine oxidation quantity in hydrogen peroxide treated mAb sample. Two methionine modification sites were identified and quantified: position 252 and 428, the example that I used before.
You can also choose to display two samples in the bar graph. Now we can see that methionine at position 252 is more sensitive to hydrogen peroxide oxidation compared to methionine at position 428.
By turning on PTM search, we found more PTMs in this dataset. For the methionine at position 252, we identified a dethiomethyl modification in addition to oxidation, which I showed you before. So if you click the PTM Profile function again, you will see that the quantitative result is a bit different.
Now these two bars do not reach 100 percent because of the identification of dethiomethyl modified forms.
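The arithmetic behind these bars can be sketched as a share of the summed feature area at one site. The function name is hypothetical and the numbers in the example are invented purely to show why two bars stop adding up to 100 percent once a third form is identified; they are not values from this dataset.

```python
def site_percentages(areas):
    """areas: dict mapping each form at a site to its summed feature area.

    Returns each form's share of the total area at that site, in percent.
    """
    total = sum(areas.values())
    if total <= 0:
        raise ValueError("total feature area must be positive")
    return {form: 100.0 * area / total for form, area in areas.items()}


# Invented areas for one methionine site with three identified forms.
shares = site_percentages(
    {"unmodified": 60.0, "oxidation": 30.0, "dethiomethyl": 10.0}
)
# The unmodified and oxidized bars now cover only part of the total,
# because the dethiomethyl form takes up the remainder.
oxidation_view_total = shares["unmodified"] + shares["oxidation"]
```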
You could always export PTM Profiling results into csv files by clicking the “Save to Text Format” button on PTM Profile window. You can also get the detailed information of a specific type of PTM for all proteins if you check “Export all proteins” box here. Furthermore, if you check “export all fractions” box, you will get quantification result in all samples, although 1 or two samples data are displayed in the bar graph. Below I pasted the csv table exported from the oxidation profiling of mAb with PEAKS database searching and dethiomethyl profiling with PEAKS PTM searching results. You have all the information you need for further analysis in these tables, for example, summed modified and unmodified feature area in each sample, protein accession, peptide sequence, and the PTM sites.
Lastly, I would like to briefly show you some new features related to PTM analysis that will be included in the new version of PEAKS. We hope that these changes will help to make your analysis more easier in the future.
The biggest change is the result presentation view in PTM Profile function. In the current version, you can only compare PTMs in two samples. However, the next version will provide you with more comprehensive and detailed information and allow you to examine the whole dataset. First, on top left corner, you need to select a specific type of modification, and there will a PTM Profile table showing modified sites, summed intensities of modified peptide features containing the specific modification in each sample, summed intensities of unmodified peptides containing the specific modification site, and the best peptide score and Ascore. You can select to use either all peptides, or fully digested peptides only from here. Secondly, in the bar graph below, you can directly visualize the results shown in the table. The Y axis will show modification sites, followed by sample names. The X axis shows modified and unmodified form abundance. It will not be limited to two-sample comparison only, although I do have only two samples in this case. Thirdly, on the right hand side, the spectra with the best PSM score from modified and unmodified peptides containing the PTM site are shown, allowing you to check the raw data easily.
Of course, you can export the result to a csv file, which contains all the information listed in the PTM profile table, such as protein information, PTM type and position, best PSM score and Ascore, and modified and unmodified area in each sample, as well as some additional information, for example, the protein sequence containing the PTM site and the best Ascore of modified peptides in each sample. Furthermore, you can export the PTM profile for all proteins by checking this box. This is what is new for PTM Profile in the PEAKS DB result.
Currently, the PEAKS Q result does not have this PTM profile function. You may have noticed that this button is grey. However, labeling-based methods such as TMT and SILAC have recently become more and more popular for PTM studies. In the next version of PEAKS, this function will be added to Precursor ion Quan and Reporter ion Quan for analyzing SILAC, TMT and iTRAQ types of data, as well as to label-free quan. Here I am using a TMT 10-plex dataset as the first example. In the PTM Profile result from PEAKS Q, the table will show you quantity info in each group. For this experiment, I had two repeats of MS runs, each containing 10 samples labeled by TMT 10-plex. In the experimental setting, I specified each labeling channel from the 2 repeat runs as a group. Therefore there are 10 groups in total, and the average value is calculated and shown in the table. Below in the heatmap, you can directly visualize how the modified form abundance varies between different groups. Each row is a PTM site. We use modified forms only to generate the heatmap because, in PTM studies, researchers often try to find out whether the modification abundance changes between different biological samples or conditions. They care less about the unmodified ones. On the right hand side, only the modified peptide with the best PSM score is displayed. In this case, the tyrosine at position 2 is phosphorylated. The alanine at the N-terminus is labeled by the TMT reagent. Below, all peptides containing the selected PTM sites are listed in the peptide table. For example, here, the first peptide is a fully digested peptide, whereas the second one contains a missed cleavage. For each selected peptide, all features matched to it are displayed in the All matches table. Here ions of different charges and from different MS runs are displayed separately. And you can see the MS1 peak of each feature in the extracted ion chromatogram figure next to it.
In the export csv file, more detailed information is provided in addition to what is shown in this table.
Here is another example showing SILAC type of data. In this experiment, I have 3 samples metabolically labeled with light, medium and heavy SILAC medium. I specified each labeling state as a group. The PTM Profile result view is similar to what I have shown you for TMT data. The only difference is that in this ion chromatogram figure, you can see XIC peaks for light, medium and heavy peptides since the quantification is performed at MS1 level.
To summarize, today I have shown you the common practice for identifying PTMs with PEAKS database search; how to use the PEAKS PTM function to discover unspecified and hidden modifications; and the quantitative analysis of PTMs across samples using the PEAKS PTM Profiling function. I have also shown some new features related to PTM analysis which will be implemented in the next version of PEAKS. We hope the information we provided today can help you better perform your PTM analysis with PEAKS software. Thank you for your attention.
Peptide Feature Intensities
PEAKS now incorporates quantitative information into your identification results. Our quantitative module has accurate and sensitive peptide feature detection that can be used to get the relative abundance of a peptide. By matching the peptide feature area to an identified MS/MS spectrum, you can:
- Determine the most abundant peptides in your sample
- Obtain quantitative information on endogenous peptides
It is often very important to integrate identification and quantitative information found in proteomic mass spectrometry data. This is why we have integrated a tool into PEAKS that provides peptide feature intensity information for identified peptides. By doing this, you can get an idea of the relative quantity of a peptide in your sample. Here you can see a graphical representation of a full proteomic LC-MS run. It is clear that there is a specific group of likely peptides represented by the high intensity peaks seen here. PEAKS then answers the question: what is the identity of those high intensity peptide signals?
It does this using a concept already used in label free quantification algorithms called peptide feature detection. A peptide found in an LC-MS experiment will appear in a predictable way. It will have a visible and predictable isotopic distribution resulting from different carbon isotopes, and its intensity will follow a gamma distribution across the retention time range in which it elutes. If the signal from the mass spec has these characteristics, we call it a peptide feature. PEAKS will automatically detect these peptide features and calculate the area under the retention time curve. It will include the area of all isotopes associated with the feature within 5% relative intensity of the most intense peak. These areas are then integrated into an XIC curve shown here. From this the area under the curve can be easily calculated.
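To make the area calculation concrete, here is a minimal Python sketch. It is not the PEAKS implementation: it assumes the isotope traces are already extracted and sampled at the same retention times, it reads the 5% rule as "keep isotopes whose apex is at least 5% of the most intense one", and it uses a plain trapezoid rule. The function names and the sample values are hypothetical.

```python
def feature_xic(isotope_traces, min_relative_intensity=0.05):
    """Sum isotope traces scan by scan, keeping only traces whose apex
    reaches min_relative_intensity of the most intense trace's apex."""
    top = max(max(trace) for trace in isotope_traces)
    kept = [t for t in isotope_traces if max(t) >= min_relative_intensity * top]
    return [sum(values) for values in zip(*kept)]


def xic_area(retention_times, intensities):
    """Trapezoidal area under an extracted ion chromatogram (XIC)."""
    if len(retention_times) != len(intensities):
        raise ValueError("retention_times and intensities must have equal length")
    area = 0.0
    for i in range(1, len(retention_times)):
        dt = retention_times[i] - retention_times[i - 1]
        area += 0.5 * (intensities[i] + intensities[i - 1]) * dt
    return area


# Hypothetical feature: three isotope traces over five scans (RT in minutes).
rt = [10.0, 10.1, 10.2, 10.3, 10.4]
traces = [
    [0.0, 50.0, 100.0, 50.0, 0.0],  # most intense isotope
    [0.0, 25.0, 50.0, 25.0, 0.0],   # second isotope, kept
    [0.0, 1.0, 2.0, 1.0, 0.0],      # apex below 5% of the top trace, dropped
]
summed = feature_xic(traces)
area = xic_area(rt, summed)
```

The units of the area depend on the retention time units, so areas are only comparable between features measured the same way.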
We then have a group of peptide features. If a feature is selected as a precursor ion for MS/MS, and the MS/MS spectrum is then identified, we can link the two together. This is how we’re able to match peptide feature intensity with an identified peptide.
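The linking step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a feature is linked to an identification when the precursor m/z falls within a ppm tolerance of the feature m/z and the MS/MS retention time falls inside the feature's elution range. The class names, the 10 ppm tolerance, and the data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    mz: float
    rt_start: float
    rt_end: float
    area: float


@dataclass
class Identification:
    peptide: str
    precursor_mz: float
    rt: float


def link_area(ident, features, ppm_tol=10.0):
    """Return the area of the first feature containing the precursor, else None."""
    for f in features:
        ppm = abs(ident.precursor_mz - f.mz) / f.mz * 1e6
        if ppm <= ppm_tol and f.rt_start <= ident.rt <= f.rt_end:
            return f.area
    return None


features = [Feature(500.2500, 20.0, 20.8, 1.2e6), Feature(650.3300, 35.0, 35.5, 4.0e5)]
ids = [
    Identification("PEPTIDEK", 500.2510, 20.3),
    Identification("SAMPLER", 650.3302, 35.2),
    Identification("NOMATCHK", 700.0000, 50.0),  # no feature at this m/z
]

# Build a small peptide table and sort it by area, highest first.
table = [(i.peptide, link_area(i, features)) for i in ids]
ranked = sorted((row for row in table if row[1] is not None),
                key=lambda row: row[1], reverse=True)
```

Identifications without a matching feature simply have no area, which is why they are left out of the ranked list.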
Viewing this information in PEAKS is very intuitive. Once you click on the peptide tab, the associated peptide feature intensity is found in the area column. This can be sorted to see the peptides with the highest intensity signal.
This information has been proven to be very informative. For example, in the publication shown here they reported the normalized area under the curve of peptide features associated with endogenous peptides. This gave the research group proof of the most abundant peptides eluted from their sample. We ran a subset of the data through PEAKS. What’s great is that PEAKS was able to generate similar results with one click of a button! Sorting the peptide table by feature area gives you a clear idea of the most abundant peptides in the sample.
If you would like to validate the link between identified peptides and peptide features, it’s quite easy to do. Right click on a peptide in the peptide table and select ‘show spectrum in LC/MS’. It will bring you to the location in the LC/MS heatmap where the MS/MS event occurred. The identified MS/MS will be highlighted in red. This map gives a top down view of the signal coming out of the mass spec in terms of m/z, retention time, and intensity. Peptide features that are detected will be marked with a red circle. Scroll over the circle and a box will appear showing the detected range in which the peptide feature occurred. The area under the curve of the peptide feature will be displayed in the popup. This is the area we display in the peptide table.
You can even get a more intuitive, 3D view of the peptide feature by clicking on the 3D button in the top right hand corner of the pane. From this view, the peptide feature can be seen very clearly.
I hope this has helped you become familiar with peptide feature intensities in PEAKS. Thanks for listening. Subscribe to our channel to learn more about PEAKS, complete software for proteomics.
PEAKS Multi-Round Search
Often there are many spectra remaining after a database search that have promising de novo sequences, leaving you wondering what the source of these spectra is. We call these ‘de novo only spectra’. Multi-Round search gives you the ability to search only these spectra. This helps you:
- Remove spectra produced from contaminants
- Find endogenous peptides in an enzyme-digested sample
- Refine search parameters to identify difficult targets
Multi-Round search is a very helpful feature found in PEAKS. It is designed to help you identify spectra that were left unmatched by an initial database search but that matched promising de novo sequences. We call these results de novo only spectra. These can be found in any identification result by clicking on the ‘de novo only’ tab. This tab contains spectra that could not be matched by PEAKS DB, PEAKS PTM, or SPIDER. However, using de novo sequencing these spectra can still have excellent peptide spectrum matches. This makes us question why these good peptides are missed by database search.
Multi-Round search gives us the opportunity to answer that question. It takes the de novo only spectra and separates them from the scans already matched using a database search algorithm. It allows you to search just those spectra with different parameters. So, you can choose a new database, different PTMs, different enzyme cleavage rules; the possibilities are endless.
For example, take this spectrum from a human antibody sample. Searching it against Swissprot we are able to identify a peptide sequence but not very confidently. The peptide does not come from an antibody protein, it is a poor peptide spectrum match, and it has a low score below our 1% false discovery threshold. The de novo peptide spectrum match is much better. It explains the majority of the high intensity peaks with major fragment ions. Using Multi-Round search, this spectrum will be carried forward to the next round. We searched the de novo only spectra identified in the original Swissprot search against NCBI. With our example spectrum, an exact match to the peptide sequenced by de novo was found in an antibody protein. Here is a summary of the hits we were able to find with this spectrum. Searching Swissprot we identified an unexpected protein with a low score, likely a false positive. Now with the new Multi-Round search we achieve ideal results, where we identified a peptide from an antibody protein with a high score and no mutations.
One of the best applications of this new search type is filtering out contaminant proteins. For example, you can download a contaminant database like cRAP shown here. First do a search with the contaminant database. Here you can see that we were able to identify a few contaminants in this dataset. These scans will be removed from the Multi-Round search. Then when you search your target database, this will give you a list of protein identifications without contaminants. Another good benefit of Multi-Round search is limiting your search space. By filtering out identified spectra, you can limit your search space in order to identify difficult targets. For example you can identify endogenous peptides in a digested sample. First run a search using the enzyme you used to digest the sample. Then run a Multi-Round search with no enzyme to find endogenous peptides.
To actually set up a Multi-Round search, all you have to do is select an existing database search result and click the Multi-Round search button. This will bring up a new search parameters pane where you can select any new parameters you wish to search the de novo only results with. Keep in mind that the de novo only results are controlled by the filters you set in the summary view of the initial search. Scans with identification results below the database search filters and above the de novo ALC filters will be included in the de novo only results. So, check to make sure you are satisfied with those settings before starting your Multi-Round search.
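The selection rule for de novo only scans can be written down as a small Python sketch. It is an illustration only, assuming each scan carries an optional database score and a de novo ALC percentage; the field names, the thresholds, and the data are hypothetical, and whether the real filters are inclusive at the boundary is not stated here.

```python
def de_novo_only(scans, db_score_threshold, alc_threshold):
    """Return IDs of scans that fail the database filter but pass the ALC filter."""
    selected = []
    for scan in scans:
        db_score = scan.get("db_score")  # None means no database match at all
        fails_db = db_score is None or db_score < db_score_threshold
        if fails_db and scan["alc"] >= alc_threshold:
            selected.append(scan["scan_id"])
    return selected


scans = [
    {"scan_id": 1, "db_score": 45.0, "alc": 90},  # confident database hit
    {"scan_id": 2, "db_score": 12.0, "alc": 88},  # weak database hit, good de novo
    {"scan_id": 3, "db_score": None, "alc": 95},  # no database hit, good de novo
    {"scan_id": 4, "db_score": 8.0, "alc": 40},   # poor on both counts
]
carried_forward = de_novo_only(scans, db_score_threshold=20.0, alc_threshold=80)
```

Loosening either threshold changes which scans are carried into the next round, which is why the filters are worth checking first.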
That’s all you need to know to get started with Multi-Round search in PEAKS. Thanks for listening! Subscribe to our channel to learn more about PEAKS, complete software for proteomics.
PEAKS PTM Profiling
Detect and quantify modifications with LC-MS/MS data and compare PTM profiling on proteins between samples.
PTM Profiling is a great new feature found in PEAKS 7.5. It is built to deal with a specific issue regarding post-translational modifications. In many cases a protein is identified with both modified and unmodified peptides at some positions. PEAKS PTM profiling works with all PTMs; in this example we will look at phosphorylation. In this protein there are several phosphorylation sites. However, unmodified peptides were found at those positions as well. This leads to two questions. At what positions do the phosphorylations occur? And, if it is phosphorylated, how much of the protein is phosphorylated?
To answer the question of whether or not the PTM is true, PEAKS provides an Ascore for each proposed PTM. The Ascore is based on the evidence from the fragment ions, and is the probability that the modification is present at that position compared to other possibilities. PEAKS also has the ability to assign PTM confidence based on direct fragment ion proof. So, if the MS/MS spectrum shows fragmentation before and after the proposed modification site above 5% relative intensity, the site is considered confidently modified. In either case, if the PTM is considered confident it will appear above the protein sequence.
To answer the question of how much of the protein is phosphorylated, PEAKS uses a concept implemented in label free quantification experiments. It uses the concept of peptide features, meaning the LC-MS signal of a peptide. It has been proven that the area under the curve of the peptide feature is proportional to the relative abundance of that peptide. So, for a peptide with a confident modification site shown in this LC-MS map, PEAKS will find the area under the curve of its associated LC-MS feature. It will repeat this as well for all of the modified and unmodified peptides found at this position in the protein. This table shows all the modified and unmodified peptides which were found at this position. The Ascores are reported for each modified peptide. And, the peptide feature area is given as well.
Using this information, PEAKS creates a bar chart that gives a ratio of the relative quantity of phosphorylated peptide versus unphosphorylated peptide at each identified phosphorylation position. Only fully digested peptides are used in this chart to give added accuracy. Scrolling over the graph will give the percentage of the total amount. You can also use the drop down menus at the top of the window to compare the phosphorylation ratios across multiple runs. Here you can see at some positions there is consistency across runs. At other positions the modification didn’t appear. So, you can see the similarities and differences in phosphorylation across multiple runs. You can also export the raw PTM profiling data to get more details.
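The percentage shown for each site can be reproduced from the summed feature areas with a short Python sketch. This assumes the percentage is simply the modified area divided by the total of modified plus unmodified area at that site; the site labels and area values are hypothetical.

```python
def modification_percent(modified_area, unmodified_area):
    """Percentage of the total feature area at a site that is modified."""
    total = modified_area + unmodified_area
    if total == 0:
        return None  # nothing quantified at this site
    return 100.0 * modified_area / total


# Hypothetical summed areas per site: (modified, unmodified).
sites = {
    "S15": (3.0e6, 1.0e6),
    "T42": (0.0, 2.0e6),
}
profile = {site: modification_percent(mod, unmod) for site, (mod, unmod) in sites.items()}
```

Running the same calculation per run gives the values you would compare across runs.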
Running PTM profiling is easy: at the top right hand corner of the coverage view, select the PTM profiling button. This will compile the data and present the PTM profiling data for all the identified modifications in that protein. Only confident PTMs are used, so be sure to select either Ascore or minimal ion intensity and the desired cut off using the legend to the right of the coverage view.
That is all you need to know to get started with PTM profiling with PEAKS. Thanks for listening. Subscribe to our channel to learn more about the features of PEAKS complete software for proteomics.
Peptide De Novo Sequencing
Check out this video for a more in-depth analysis on PEAKS’ peptide de novo sequencing. This video will give a brief overview of how peptide de novo sequencing can be useful, as well as give a demonstration on how to perform a de novo analysis using PEAKS.
Welcome to the PEAKS Peptide De Novo Sequencing tutorial. In this video I will outline the benefits of de novo sequencing and how it is a part of PEAKS. I will then show you how to perform a de novo sequencing analysis using PEAKS, which is the most widely accepted tool for peptide de novo sequencing in mass spec labs.
For peptide identification with tandem mass spec, de novo sequencing derives the peptide sequence without using a protein database. This need can arise when researching unsequenced organisms, antibodies, endogenous peptides, and peptides with unexpected PTMs. Even when a sequence database is available, a database search engine can fail to assign database peptides to many high quality tandem mass spectra.
Researchers choose to use PEAKS because it is fast and, most importantly, accurate. For example, in this third-party comparison of de novo sequencing algorithms, PEAKS outperformed all other algorithms compared in the paper.
Customer Testimonial (Outstanding User Interface)
PEAKS makes understanding de novo results easy, with an incredibly user-friendly interface. One publication in particular states:
“An important factor when performing large-scale de novo sequencing experiments is the ease of use and flexibility of the software. In this respect, PEAKS, being a commercial quality program, was far superior and offered the most adaptive interface, with the ability to import various formats of data from a vast array of mass spectrometers. The sequencing result displayed by PEAKS was also considerably more useful.”
Saves Time and Easy to Use
Using PEAKS is as simple as 1-2-3, as I will demonstrate for you now.
- Select your data
- Click the de novo sequencing icon
- Specify your parameters, such as the error tolerance, enzyme, and PTMs
PEAKS will then de novo sequence all of the tandem mass spectra in the data set, at a speed of up to 15 spectra per second on a regular PC, and even faster on a server.
Viewing De Novo Sequencing Results
Once the de novo sequencing process is finished, a de novo result node will appear below the selected data. It looks just like a small snow-capped mountain with “dn” in letters. Double click the node to open the result. The de novo sequences are listed in a table, along with the associated score and retention time for each sequence.
Selecting a peptide will display the matching peptide-spectrum details, such as the spectrum annotation and the ion match table. You can easily zoom and navigate in the spectrum annotation using your mouse. This is an important feature if you want to examine individual peptides, and there are several ways to navigate the spectrum. Here are some examples:
- Drag to zoom
- Use your scroll wheel to zoom into the y-axis
- Use your scroll wheel to zoom into the x-axis
Local Confidence Scores
To make things easy, individual amino acids are colour-coded based on their respective confidence level. The confidence value can be examined by hovering your mouse over a particular peptide: red, above 90%, is considered a great score; purple, between 80% and 90%, is good; and blue, between 60% and 80%, is acceptable. Anything below 60% is coloured black. This local confidence on each amino acid is a unique feature of PEAKS. It allows you to adjust the minimum local confidence threshold to convert the de novo sequence into a sequence tag that only contains the highly confident amino acids.
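Here is a small Python sketch of the colour bands and of turning a sequence into a tag. It is an illustration, not how PEAKS renders results: how the exact boundary values (60, 80, 90) are assigned is an assumption, and low-confidence residues are masked with a hypothetical "?" placeholder. The peptide and its confidences are made up.

```python
def confidence_colour(score):
    """Map a local confidence (percent) to the colour bands described above."""
    if score > 90:
        return "red"
    if score >= 80:
        return "purple"
    if score >= 60:
        return "blue"
    return "black"


def sequence_tag(sequence, local_confidences, min_confidence=80, placeholder="?"):
    """Keep residues at or above min_confidence and mask the rest."""
    if len(sequence) != len(local_confidences):
        raise ValueError("one confidence value is needed per amino acid")
    return "".join(
        aa if conf >= min_confidence else placeholder
        for aa, conf in zip(sequence, local_confidences)
    )


peptide = "LGEYGFQNALK"
confidences = [55, 62, 85, 93, 97, 99, 95, 91, 88, 79, 60]
colours = [confidence_colour(c) for c in confidences]
tag = sequence_tag(peptide, confidences, min_confidence=80)
```

Raising the threshold shortens the confident stretch, and lowering it brings more of the original sequence back.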
But what we really want to look at is the total local confidence (TLC) and average local confidence (ALC). The TLC score indicates the expected number of correct amino acids, while the ALC score indicates the expected percentage of correct amino acids. For example, here we have a TLC of 10.9 and an ALC of 84%, which indicates a confidently identified sequence.
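Taking those definitions at face value, TLC is the sum of the per-residue confidences (as fractions) and ALC is their average. The Python sketch below assumes exactly that. The thirteen confidence values are hypothetical, chosen so that the result matches the TLC of 10.9 and ALC of 84% quoted above.

```python
def total_local_confidence(local_confidences):
    """Expected number of correct amino acids: sum of confidences as fractions."""
    return sum(c / 100.0 for c in local_confidences)


def average_local_confidence(local_confidences):
    """Expected percentage of correct amino acids: mean confidence."""
    return sum(local_confidences) / len(local_confidences)


# Hypothetical local confidences (percent) for a 13-residue de novo sequence.
confidences = [60, 70, 75, 80, 85, 85, 88, 90, 90, 92, 92, 92, 93]
tlc = total_local_confidence(confidences)
alc = average_local_confidence(confidences)
```

Note that TLC grows with peptide length while ALC does not, so the two scores answer slightly different questions.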
While the default is to sort the table by TLC, you can sort by other columns by clicking the column title, or you can use the search function to quickly locate a peptide.
Summary View – Exporting Results
This summary view shows the result statistics. If you like you can filter the results by setting a score threshold. We recommend starting with the default TLC and ALC values, and adjusting those values manually by examining the de novo sequences around this threshold.
The de novo sequencing results can be exported to text formats if you want to use PEAKS as a subroutine in your lab’s own workflow. To export the filtered results:
- Click “Export” at the top of the summary view
- Choose the format you wish to export the results in; available formats include html, csv, and xml
- Choose the location and directory name where you want to put the exported files
- Click OK
De Novo Sequencing is just the Beginning – PEAKS Workflow
For the analysis of mass spectrometry data, PEAKS does not stop at de novo sequencing. Instead, the de novo sequencing capability enables PEAKS to provide a number of unique benefits through several integrated tools. De novo sequencing is just the beginning.
First, the de novo sequencing results are used to confirm and improve PEAKS’ integrated database search. As a result, the performance of PEAKS is well above other database search software. When sequences are not found by the database search engine, the peptides found exclusively by de novo sequencing can be analyzed by the integrated PTM finder and SPIDER tools, in order to locate unexpected PTMs and peptide mutations.
With PEAKS, you are not limited to using a sequence database, which may be unavailable, incomplete, erroneous, or ineffective due to unexpected PTMs and mutations. PEAKS has all of the necessary tools to overcome these challenges, thanks to its unique de novo sequencing algorithm.
PEAKS DB: Peptide Identification
More than just de novo sequencing, PEAKS provides a sensitive and accurate tool for identifying known peptides and proteins. In this video, users will learn why the hybrid approach (de novo + database searching) is the optimal method for identification.
Welcome to the PEAKS Database Search Tutorial on peptide identification. In this video, I will be going over the benefits and features of the newly improved PEAKS DB search engine, and demonstrating how to perform an analysis.
PEAKS DB: De Novo Assisted Workflow
As this diagram illustrates, when you run PEAKS DB on raw MS/MS Spectra, the spectra are automatically de novo sequenced and the results are automatically combined with the database results. This gives you improved database results and can differentiate sequences exclusively identified by the de novo analysis. These exclusively de novo sequenced identifications can be potential novel peptides, or peptides with mutations and PTMs.
Configure a Database
Now let’s take a closer look at PEAKS DB by performing a live demo of the workflow. Before you run PEAKS DB you need to make sure you have a database configured. You only need to configure it once. Step-by-step instructions on how to configure a database can be found in the PEAKS “Help” menu under “Help Contents”. The instructions are in Section I-Part 6 under “Configuring Sequence Databases”.
How to Run PEAKS DB
With the new interface it is easy to run a PEAKS DB workflow:
- Select your data
- Select the PEAKS DB icon
- Set the study parameters such as error tolerance, enzyme, fixed and variable PTMs, as well as the desired database.
- Once finished click OK.
One of the best features of PEAKS is the improved summary view. In this view you can easily filter and validate your results, as well as get an overall understanding of the identifications.
To filter your results:
- Select FDR from the toolbar
- Navigate along the curve to find the desired percentage; we usually recommend using a 1% FDR
- Click “Apply Filters”
In the Results Statistics section, users are able to visualize a graphical analysis of all peptides. The first figure summarizes the number of peptide-spectrum matches, or PSMs, that are identified at the set FDR. Below, the figure on the left summarizes the number of identified PSMs and displays both the target and decoy identifications at each -10lgP score. The figure on the right shows the precursor error distribution. The error is small for high scoring peptides and scatters for those identified below the score threshold.
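To show how target and decoy counts turn into an FDR and a score threshold, here is a minimal Python sketch of the plain target-decoy estimate (decoys divided by targets above a score). It is a generic textbook version and an assumption on my part, not the enhanced method PEAKS uses. The function names and the score list are hypothetical.

```python
def fdr_at_threshold(psms, threshold):
    """Estimate FDR as decoys / targets among PSMs scoring at or above threshold."""
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0
    return decoys / targets


def lowest_threshold_for_fdr(psms, max_fdr=0.01):
    """Lowest observed score whose estimated FDR stays at or below max_fdr."""
    candidates = [
        score for score in {s for s, _ in psms}
        if fdr_at_threshold(psms, score) <= max_fdr
    ]
    return min(candidates) if candidates else None


# Hypothetical PSMs as (score, is_decoy) pairs.
psms = [
    (50, False), (45, False), (40, False), (35, False), (30, False),
    (25, False), (22, True), (20, False), (15, False), (14, True),
]
threshold_1pct = lowest_threshold_for_fdr(psms, max_fdr=0.01)
```

With so few PSMs the 1% threshold lands just above the first decoy; on a real dataset the curve is much smoother.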
In the Experiment Control section, the two figures can help check whether the instrument is well-calibrated. The left graph illustrates the distribution of the precursor mass error, where a distribution around 0 indicates a very well calibrated instrument. The right graph further plots the precursor error distribution against the precursor m/z.
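The precursor mass error in these plots is normally expressed in parts per million. The Python sketch below uses the standard ppm definition and checks whether the errors are centred near zero by taking the median; the observed and theoretical m/z values are hypothetical.

```python
from statistics import median


def ppm_error(observed_mz, theoretical_mz):
    """Signed precursor mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


# Hypothetical (observed, theoretical) precursor m/z pairs.
pairs = [(500.0025, 500.0), (600.0, 600.003), (700.0007, 700.0)]
errors = [ppm_error(obs, theo) for obs, theo in pairs]
median_error = median(errors)
```

A median far from zero would suggest a systematic calibration offset rather than random scatter.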
Once you have finished validating your results, it’s easy to export your results in a variety of formats. For example you can export to html, so that it may be integrated into a website. To do this:
- Select Export
- Choose html; alternatively, you can choose to export in csv, fasta, or xml format
- Choose the location
- Click Export
The generated files can be viewed with a web browser, which makes it exceedingly easy to share the results with a colleague or submit the results to a journal.
In the Protein view we have a great view of each of the identified proteins. For each Protein it readily displays the associated -10lgP score, coverage percentage, number of peptides, number of peptides unique to that protein, and a brief description of the protein in the top pane. To take a closer look at a protein, select the protein and the associated peptides are displayed in the lower pane. You are also able to take a closer look at the protein’s coverage map by selecting the coverage tab in the lower pane.
In the peptide view we can see all of the identified peptides along with important details such as the -10lgP score, mass to charge ratio, and the protein accession. Using the search box here, you can search for a particular peptide by the scan number, peptide sequence and so forth. Or, you can sort the peptides using the column header, such as by PTM. This allows you to bring all peptides with a particular PTM to the top of the table. By selecting a peptide you are able to see its associated annotated spectrum, ion match table, and error map.
There are a couple of options available to refine the spectrum to display the information you desire. First there is an option to filter specific ions to display, found by selecting the wrench tool in the middle pane here. By default, all b and y ions are selected; to change this, click the respective ions to add them to or remove them from the spectrum. Once you have the desired ions displayed in the spectrum you can then zoom into an interesting area. To focus on a specific area click and drag your mouse, or you can use your mouse wheel to zoom into the x or y-axis. Double click to return to the original ratio.
De Novo Only View
The De Novo Only view displays all peptides that were not found in the database. To learn more about de novo sequencing results check out our PEAKS Peptide De Novo Sequencing Tutorial.
PEAKS Q: Label Free Quantification
In addition to protein and peptide identification, PEAKS excels at accurate label free quantification. This video predominantly uses slides to illustrate the fundamentals of the method.
Welcome to PEAKS label free quantification tutorial. In this session, we will go over the features and benefits of the software tool for label-free quantification and demonstrate how to perform data analysis.
Workflow in PEAKS LFQ
The PEAKS label-free quantification algorithm is intensity-based. As this diagram shows, peptide features in the survey scans are used for quantification, and MS2 scans are used for peptide/protein identification. Combining them produces peptide and protein quantification results. Let’s get into more detail for each step.
LC-MS Heat Map
In shotgun proteomics, proteins are digested into a complex mixture of peptides, which are separated by on-line HPLC. At a given retention time, the fractions of the mixture eluted from the column are sent to a mass spec instrument and their precursor masses and intensities are recorded in a survey scan (the MS1 spectrum).
This figure shows a heat map of the mass spec signals generated by peptides eluting from the column. The map depicts all peptide features detected by the instrument, with the complexity of elution and isotope patterns.
The intensity of a peptide feature is proportional to the abundance and concentration of the peptide in the sample. The abundance ratio of a peptide between two samples can be estimated by the intensity ratio of the peptide feature in two heat maps.
There are several steps to determine the relative abundance of a peptide and protein by label-free quantification. The first step is “feature detection”.
A peptide feature is defined as a group of peaks in a heat map, characterized by its eluting pattern in terms of retention time and its isotope pattern in terms of mass-to-charge ratio.
The deconvolution of overlapped peptide features and retention time alignment between runs are the key factors in the data analysis, because overlapped peptide feature clusters cannot be avoided even with today’s high resolution instruments and LC separation techniques. PEAKS Label Free quantification successfully deconvolutes overlapped peptide features by using an expectation-maximization algorithm.
RT Alignment and Feature Matching
The second step is retention time alignment and feature matching.
The retention time of a peptide feature in two LC-MS runs may change depending on the LC column conditions, and so forth.
To match the same peptide features in different runs, retention time alignment is required. Here are two LC-MS runs; you can see where the retention time changed.
After alignment, the peptide features are matched.
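As a rough illustration of alignment followed by matching, here is a Python sketch. It is deliberately simplified and is not the PEAKS algorithm: the retention time shift is modelled as a single constant estimated from anchor pairs (features assumed to be the same peptide in both runs), whereas real alignment is usually non-linear. The function names, tolerances, and data are hypothetical.

```python
from statistics import median


def estimate_rt_shift(anchors):
    """Median RT difference (run 2 minus run 1) over known anchor pairs."""
    return median(rt2 - rt1 for rt1, rt2 in anchors)


def match_features(run1, run2, rt_shift, ppm_tol=20.0, rt_tol=0.5):
    """Pair each run 1 feature with the closest run 2 feature within tolerance.

    Features are (mz, rt) tuples; rt_shift is subtracted from run 2 RTs first.
    """
    matches = []
    for mz1, rt1 in run1:
        best = None
        for mz2, rt2 in run2:
            ppm = abs(mz2 - mz1) / mz1 * 1e6
            rt_diff = abs((rt2 - rt_shift) - rt1)
            if ppm <= ppm_tol and rt_diff <= rt_tol:
                if best is None or rt_diff < best[0]:
                    best = (rt_diff, (mz2, rt2))
        if best is not None:
            matches.append(((mz1, rt1), best[1]))
    return matches


run1 = [(500.25, 20.0), (650.33, 35.0), (720.40, 50.0)]
run2 = [(500.251, 21.1), (650.331, 36.0), (800.00, 51.0)]
shift = estimate_rt_shift([(20.0, 21.1), (35.0, 36.0)])
matches = match_features(run1, run2, shift)
```

The third feature in each run has no partner within tolerance, so it stays unmatched.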
The next step is ratio calculation. The relative abundance ratio is calculated from the areas of the extracted ion chromatograms (XICs) in the two runs.
In each scan, intensities of isotopic peaks are summed when the XIC is generated.
Here are two XICs of a peptide feature; the red one is from run 1, and the blue one is from run 2. The abundance ratio can be estimated by the ratio of the areas of the two XICs.
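The ratio calculation can be sketched in Python as follows. This is a minimal illustration assuming the isotopic peaks of each scan are simply summed and the XIC area is taken with a trapezoid rule; the function names and the scan data are hypothetical.

```python
def xic(scans):
    """Build an XIC by summing the isotopic peak intensities in each scan.

    scans is a list of (retention_time, [isotope intensities]).
    """
    return [(rt, sum(peaks)) for rt, peaks in scans]


def area(points):
    """Trapezoidal area under a list of (retention_time, intensity) points."""
    return sum(
        0.5 * (y0 + y1) * (t1 - t0)
        for (t0, y0), (t1, y1) in zip(points, points[1:])
    )


def abundance_ratio(scans_run1, scans_run2):
    """Ratio of XIC areas, run 2 over run 1."""
    area1 = area(xic(scans_run1))
    area2 = area(xic(scans_run2))
    if area1 == 0:
        raise ValueError("run 1 has zero XIC area")
    return area2 / area1


run1 = [(0.0, [0.0, 0.0]), (1.0, [60.0, 40.0]), (2.0, [0.0, 0.0])]
run2 = [(0.0, [0.0, 0.0]), (1.0, [120.0, 80.0]), (2.0, [0.0, 0.0])]
ratio = abundance_ratio(run1, run2)
```

Here the feature is twice as intense in run 2, so the estimated abundance ratio is 2.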
Next we make a significance assessment. Technical replicates are used to evaluate the variation of a feature between runs. A quality value is associated with a feature in terms of its intensity, isotope and eluting patterns. The feature quality is defined as 1 log (sigma), where sigma is the average variation.
Given the observed variation of a feature between two biological states, a significance value is calculated, which is defined as -10logP, where P is the P-value for observing such variation in the replicate runs.
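Converting between a P-value and this significance score is a one-liner. The Python sketch below assumes the logarithm is base 10, which is consistent with the -10lgP scores used elsewhere in PEAKS; the function names are hypothetical.

```python
import math


def significance(p_value):
    """Significance score defined as -10 * log10(P)."""
    if not 0 < p_value <= 1:
        raise ValueError("p_value must be in (0, 1]")
    return -10.0 * math.log10(p_value)


def p_from_significance(score):
    """Inverse conversion: recover the P-value from a significance score."""
    return 10 ** (-score / 10.0)


score = significance(0.01)
p_value = p_from_significance(13.0)
```

So a P-value of 0.01 corresponds to a significance of 20, and each additional 10 points means a tenfold smaller P-value.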
The last step is peptide feature identification. This is done using the MS/MS spectra associated with the feature. PEAKS label free quantification is seamlessly integrated with PEAKS database search for peptide identification; thus, data analysis of label-free quantification is much easier than switching software or exporting from one format to another.
PEAKS Q: Label Free Quantification – Live Walkthrough
This video skips ahead to the question of “How do I analyze a set of data using PEAKS?”. The basics of the software are addressed quickly so as to focus attention on explaining the results and their value. We even explain how to use PEAKS label-free quantification to identify peptide features without database peptide identification.
I will now demonstrate how to use the PEAKS label-free quantification tool with a real dataset. The first step is to build a PEAKS project for the experiments.
Give a name to the project and specify a location to save the project.
Click the “Add data” button to add the data.
Navigate to the LC-MS/MS data location, and select the data to be analyzed.
Select a file or files: use the add file button, and give a name for the file to show in the result.
Set enzyme, instrument, fragmentation and continue until all files are added.
After this, the data selection part is finished.
“Correct precursor” is checked with the “mass only” option for Orbitrap instruments.
Precursor mass error: 20 ppm, Fragment mass error: 0.5 Da
Modification: Fixed: Carbamidomethylation
Var: Oxidation on M & Deamidation
Database: Swiss-Prot; Taxa: mouse
Complete identification (DB + PTM + SPIDER)
Label-free quantification parameters:
Estimated peptide precursor m/z shift between samples: 20 ppm
Estimated retention shift of a peptide precursor between samples: 5 minutes
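The two label-free parameters above are tolerances for matching the same peptide precursor across samples. The sketch below shows how a ppm difference is computed and how the two tolerances could be combined. The source does not describe how PEAKS applies them internally, so the matching rule and the function names are assumptions for illustration only.

```python
def ppm_difference(mz_observed: float, mz_reference: float) -> float:
    """Relative m/z difference in parts per million."""
    return (mz_observed - mz_reference) / mz_reference * 1e6


def could_be_same_precursor(
    mz_a: float,
    rt_a_min: float,
    mz_b: float,
    rt_b_min: float,
    max_mz_shift_ppm: float = 20.0,   # "m/z shift between samples: 20 ppm"
    max_rt_shift_min: float = 5.0,    # "retention shift ...: 5 minutes"
) -> bool:
    """Hypothetical rule: both shifts must fall within their tolerances."""
    within_mz = abs(ppm_difference(mz_a, mz_b)) <= max_mz_shift_ppm
    within_rt = abs(rt_a_min - rt_b_min) <= max_rt_shift_min
    return within_mz and within_rt
```

For example, at m/z 500 a tolerance of 20 ppm corresponds to an absolute window of 0.01.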
The “Sample Groups”: each sample was run in triplicate. Those replicates are grouped.
We have now finished setting the parameters and are ready to import the data and perform the analysis. Click “Finish”.
Analysis & Results
After the analysis completes, the identification and quantification results are presented.
Click node 11 and the details of the quantification result will be shown.
The protein heat map shows protein abundance profiles among all samples.
These proteins are up-regulated in diseased samples, and their abundance is consistent among replicates.
These proteins are down-regulated in diseased samples, and their abundance is also consistent among replicates.
This is the volcano plot of quantified proteins, which plots significance versus fold change for proteins.
The proteins fall into three groups: 1. background proteins, which make up the majority; 2. up-regulated proteins, which have a sufficient upward fold change and large significance; and 3. down-regulated proteins, which have a sufficient downward fold change and large significance.
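The three groups in the volcano plot can be expressed as a simple rule on fold change and significance. The sketch below is illustrative only: the thresholds (a two-fold change and a significance of 20) and the function name are assumptions, not values taken from this walkthrough.

```python
import math


def volcano_group(
    fold_change: float,
    significance: float,
    min_fold: float = 2.0,            # assumed threshold
    min_significance: float = 20.0,   # assumed threshold (-10logP scale)
) -> str:
    """Classify a protein as 'up', 'down' or 'background'."""
    if fold_change <= 0:
        raise ValueError("fold_change must be positive")
    log2_fc = math.log2(fold_change)
    cutoff = math.log2(min_fold)
    if significance >= min_significance and log2_fc >= cutoff:
        return "up"
    if significance >= min_significance and log2_fc <= -cutoff:
        return "down"
    return "background"
```

Working on the log2 scale treats a two-fold increase and a two-fold decrease symmetrically, which is why volcano plots usually show log fold change on the horizontal axis.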
Here are the histograms of feature retention time shifts and precursor m/z shifts, respectively. The red one shows the shift before retention time alignment and the blue one shows the shift after alignment.
On the top, there are 2 sets of filters: peptide feature level and protein level.
The significance, fold change, quality, and intensity of a peptide feature, among other criteria, are used to filter the peptide features. Protein significance, protein fold change, and the number of unique peptides in a protein are used to filter the proteins.
In the protein tab, a list of quantified proteins is displayed, sorted by significance. For each protein, its abundance profile in all samples and groups is displayed, along with other details. The coverage map shows the quantified peptides supporting this protein.
The details of each peptide feature are listed in the “Feature” tab. The top three peptides with the highest intensities are used for protein ratio calculation. PEAKS also provides peptide feature quantification. The Feature tab shows the list of quantified features.
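As a rough illustration of a top-three calculation, the sketch below ranks a protein's peptides by intensity, keeps the three most intense, and takes the ratio of their summed intensities between two samples. The source only states that the top three peptides are used; the ranking by combined intensity, the summation, and the function name are assumptions.

```python
from typing import Dict, Tuple


def top3_protein_ratio(peptides: Dict[str, Tuple[float, float]]) -> float:
    """Ratio (sample B / sample A) from the three most intense peptides.

    `peptides` maps a peptide label to (intensity_in_A, intensity_in_B).
    """
    # Rank by combined intensity across both samples (an assumption).
    ranked = sorted(peptides.values(), key=lambda ab: ab[0] + ab[1], reverse=True)
    top = ranked[:3]
    sum_a = sum(a for a, _ in top)
    sum_b = sum(b for _, b in top)
    if sum_a == 0:
        raise ValueError("no intensity in sample A for the top peptides")
    return sum_b / sum_a
```

Restricting the calculation to the most intense peptides limits the influence of weak, noisy features on the protein ratio.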
For each peptide feature, the XICs in all samples are shown, and the integration areas of its XICs are listed in the table.
The characterization of the feature in all samples is shown in the “Sample Feature” tab. The red line shows the boundary of the feature, and the blue square is the MS/MS spectrum providing the peptide identification.
It also provides a 3-D view. Hold the “Ctrl” key and use the mouse wheel to adjust it.
No ID? No Problem!
PEAKS can quantify peptide features without the pre-requisite of peptide identification, by un-checking “With peptide ID”.
Here is one example: there is no database peptide hit for this feature. PEAKS DB provides more answers by integrating with de novo sequencing. Let’s look at the de novo sequencing result from PEAKS database search.
Open SPIDER node 10. Select the “LC-MS” tab and go to the file of SMA 1.
LC-MS heat map: PEAKS associates all identification and quantification results with the peptide features in the heat map.
Blue squares indicate the MS/MS spectra.
Solid blue squares are the MS/MS spectra with a confident database peptide hit.
Solid orange squares show the MS/MS spectra without a confident database peptide hit, but with a confident de novo peptide sequence.
The red circles are the detected features. Solid ones are the confident features that satisfy the filters. Our purpose is to see whether the feature (514.7 17.03) has a de novo sequence. Note that the heat map can be zoomed. In this case, we use the search function, and PEAKS automatically zooms to the corresponding window in the heat map.
In this example, there is no database peptide hit for the feature. That said, there is a de novo sequence: ALNAAGASEPK.
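A quick sanity check on such a de novo sequence is to compute its theoretical m/z and compare it with the feature being inspected. The sketch below uses standard monoisotopic residue masses. It assumes that the value 514.7 quoted for the feature is its m/z and that the precursor is doubly charged, neither of which is stated explicitly here; the function name is hypothetical.

```python
# Standard monoisotopic residue masses (Da).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056    # H2O added for the peptide termini
PROTON = 1.007276   # mass of a proton


def peptide_mz(sequence: str, charge: int) -> float:
    """Theoretical m/z of an unmodified peptide at the given charge."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    neutral_mass = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    return (neutral_mass + charge * PROTON) / charge


mz = peptide_mz("ALNAAGASEPK", 2)
```

For ALNAAGASEPK at charge 2 this gives an m/z of about 514.77, which is consistent with a feature observed near 514.7.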
When we BLAST it, we find that it is a peptide from Myosin-binding protein C, which is a myosin-associated protein found in the cross-bridge-bearing zone of A bands in striated muscle. Myosins comprise a family of ATP-dependent motor proteins and are best known for their role in muscle contraction. With this finding, we get a new biomarker candidate by using de novo only peptides.
You will also note that PEAKS supports the HTML format for easy web viewing and text format for sharing and downstream analysis.
PEAKS Q: Labelling Quantification – SILAC
PEAKS Studio 8 provides our improved precursor ion quantification module. In this video, the improvements we have made to this labelling quantification option, specifically, SILAC, will be discussed using data from a recent publication.
Welcome to today’s tutorial. Today, I will be describing the features of our improved precursor ion quantification module in PEAKS 8. In this video, I will discuss the improvements we’ve made to this labelling quantification option, which applies to such experiments as SILAC, using data from this recent publication.
How Can PEAKS 8 Help?
PEAKS 8 offers enhanced computational features, which ensure more accurate and reliable SILAC quantification results. Firstly, the algorithm has been improved such that PEAKS 8 detects, highlights, and quantifies peptides at the feature level. Additionally, it provides more manual control for users, enabling the removal of peptides that are deemed to be less confident quantification candidates, an assessment that is based on characteristics of the feature vector labels. Finally, PEAKS 8 employs a computational solution to correct for the problematic conversion of heavy free arginine to proline. Overall, this will provide improved reliability and confidence in the results from your analysis.
How to Set Up An Analysis
SILAC quantification results can be generated by employing the easy-to-use workflow in PEAKS, which can be initiated just by starting a new project. Select the “New Project” folder icon and add your raw data files from the Project Wizard screen. As PEAKS 8 allows you to analyse individual, as well as grouped, raw data files, we recommend adding each data file as a separate sample using the multiple beaker button. After you’ve added your data, simply choose your enzyme, instrument, and fragmentation option, and then move on through the workflow. While the workflow will take you through data refinement and identification, the focus of this tutorial will be on setting up a quantification project. For SILAC experiments, select “Precursor Ion Quantification”, choose your experimental method, and apply your chosen filters. You can then choose to group replicate data before you initiate your analysis. Finally, apply the R-to-P correction to ensure that peptides with divided heavy feature intensities are added together before PEAKS calculates the peptide ratios.
The PEAKS 8 Peptide view makes it very convenient for users to identify confident peptides. For each peptide, the view is divided into four quadrants: an MS2 scan, a survey scan, the LC-MS view of the peptide’s labelled features, and a pane that combines an extracted ion chromatogram with an isotopic distribution diagram. These four panes apply specifically to a selected peptide feature vector, which you’ll see when you click the “All vectors” button. Clicking “All vectors” will generate a table, which shows all identified features. By default, the feature vector with the highest quality is initially chosen.
Filtration Options: Summary View
Before examining your data, it’s best to select an appropriate set of filters. To change the feature vector filters, select the “Edit…” button in the second row of the filtration options to open the Filtration window. Set a -10lgP threshold through the drop-down menu or by clicking the false discovery rate button. Choose a quality threshold, the score for which is based on a comparison between feature vector characteristics, which include m/z and RT differences, XIC shape similarity, and feature intensities. Set average feature areas and a charge range. Finally, choose whether or not you only want to display peptide features that include a reference label, and set the minimum number of labels that must be present. This reference label can be set within “Experiment Settings” at the top of the filtration options.
You can also choose which proteins you want to filter from your display by selecting the “Edit…” button in the third row of the filtration options to open the Protein Filtration window. Set protein significance scores by choosing either an appropriate significance threshold or by setting a Benjamini-Hochberg FDR. Because significance scores are -10lgP scores, if you choose to set a threshold, we recommend setting it to a value of 20, as this corresponds to a p-value of 0.01. Select a protein fold change cut-off and choose the minimum number of peptides that you want to include. Additionally, choose whether to exclude variably modified peptides from quantification and decide if you want to calculate significance scores using the ANOVA or the PEAKS Q method.
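For readers unfamiliar with the Benjamini-Hochberg option, the sketch below implements the standard Benjamini-Hochberg step-up adjustment on a list of p-values. It illustrates the general procedure only; how PEAKS applies it internally is not described here, and the function name is hypothetical.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

Proteins whose adjusted p-value falls at or below the chosen FDR level (for example 0.05) would be kept. Unlike a fixed significance threshold, this cut-off adapts to the number of proteins tested.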
As mentioned, PEAKS 8 SILAC quantification is heavily feature-based. The LC-MS and XIC panes show you the characteristics of the feature vectors used in peptide quantification. The LC-MS view highlights and zooms in to where the feature vectors are located. The information displayed in the pop-up windows that appear when you hover over the feature markers in this view is also displayed in the feature vector window. As such, PEAKS 8 enables easier viewing of the labeled feature vector characteristics. Ultimately, the ratio of the quantified peptide is calculated using the average light and heavy labels of every identified feature vector.
The ratio of the protein to which a peptide identifies is calculated from the ratio of the light and heavy labels, where the labelled areas represent the sum of all peptide areas that were used in the analysis.
Managing Quality Control: Removing Less Confident Peptides
PEAKS 8 allows users greater control to decide which peptides are used in protein quantification. If you identify a peptide that appears to be a poor choice in your quantification calculation, simply uncheck the peptide from the analysis. Select “Apply”, which is found in the filtration settings within the Summary view, to incorporate this change. When you return to the quantification pane, the area values of the protein will now only reflect the sums of the remaining check-marked peptides.
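The effect of unchecking a peptide can be sketched as follows: the protein's light and heavy areas are the sums over the peptides that remain checked, and the protein ratio is computed from those sums. The data structure, the heavy-over-light direction of the ratio, and the example numbers are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PeptideAreas:
    label: str          # placeholder peptide name
    light_area: float
    heavy_area: float
    used: bool = True   # mirrors the check mark next to the peptide


def protein_ratio(peptides: List[PeptideAreas]) -> float:
    """Heavy/light ratio from the summed areas of the checked peptides."""
    light = sum(p.light_area for p in peptides if p.used)
    heavy = sum(p.heavy_area for p in peptides if p.used)
    if light == 0:
        raise ValueError("no light area among the checked peptides")
    return heavy / light


peptides = [
    PeptideAreas("pep1", 100.0, 200.0),
    PeptideAreas("pep2", 50.0, 100.0),
    PeptideAreas("pep3", 10.0, 500.0),   # looks like an outlier
]
ratio_all = protein_ratio(peptides)      # all three peptides included
peptides[2].used = False                 # uncheck the questionable peptide
ratio_checked = protein_ratio(peptides)  # only pep1 and pep2 remain
```

With these made-up numbers the ratio moves from 5.0 to 2.0 once the outlier is unchecked, which shows how a single poor peptide can dominate a sum-based protein ratio.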
Isotopic Distributions and Peak Height
Apart from the sample profile, the XIC also provides insight into how good a candidate a certain peptide may be for quantification. An isotopic distribution diagram and an XIC, for both labels, are displayed in the bottom-right pane of the Peptide view. Feature vectors with well-aligned peaks and well-matching distributions are better candidates for quantification, while feature vectors with poorly aligned peaks and poorly matched distributions are less confident candidates.
Computational Solution for Arginine to Proline Conversion
PEAKS 8 provides a computational correction for the problematic arginine to proline conversion in heavy labelled peptides. When the R-to-P correction is selected back in the Quantification set-up, the total intensity of the heavy feature label is recognized, as seen in the LC-MS view where two heavy regions are highlighted. PEAKS 8 will then sum the areas of the two heavy labels and display the combined value in the feature vector table. This ensures that the final ratio of the peptide’s associated protein is not affected by any conversion of arginine.
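The arithmetic behind the correction is simple: the heavy signal that was split by arginine-to-proline conversion is added back before the ratio is taken. The sketch below illustrates this with made-up areas; the function name and the heavy-over-light direction of the ratio are assumptions.

```python
def heavy_light_ratio(
    light_area: float,
    heavy_area: float,
    converted_heavy_area: float = 0.0,
) -> float:
    """Heavy/light ratio, optionally adding back the converted heavy area."""
    if light_area <= 0:
        raise ValueError("light_area must be positive")
    return (heavy_area + converted_heavy_area) / light_area


# Made-up example: 30% of the heavy signal appears in the converted region.
uncorrected = heavy_light_ratio(100.0, 70.0)
corrected = heavy_light_ratio(100.0, 70.0, converted_heavy_area=30.0)
```

Without the correction the heavy label is underestimated (0.7 in this example), while summing both heavy regions gives the full heavy signal (1.0).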
Overall, PEAKS 8 offers many improvements over the previous PEAKS 7.5. These include an easy-to-use quantification workflow, multiple options for filtering your results, ratio calculations based on the areas of the labeled feature vectors, a better display that includes options for more control over which peptides you want to examine and include in the ratio calculations, and an accurate arginine to proline conversion correction.
Thank you for listening. If you’d like to try PEAKS Q with your own data, you can request a demo at biosoft.ca. Also, subscribe to our channel to learn more about PEAKS, complete software for proteomics.
PEAKS Q: Labelling Quantification – TMT/iTRAQ
PEAKS Studio 8 now contains an excellent tool for quantification by isobaric labelling methods such as TMT and iTRAQ. This video will highlight some of the benefits of this tool and how to use them for your research.
Accuracy and sensitivity are a main focus of this tool. There are three important points to keep in mind to ensure accurate and sensitive isobaric labelling results: supporting the most accurate methods such as multi-notch MS3, using computational methods to select only high quality spectra for protein quantification, and reliable protein significance prediction. The next important point when it comes to TMT and iTRAQ is scalability. Currently, the largest number of samples you can use in a single experiment is 10. If more samples are required in the study, multiple experiments must be compared. We will discuss how this can be done with PEAKS.
Accuracy and Sensitivity for TMT/iTRAQ
One of the main problems with isobaric labelling is interference. MS2 spectra can contain signal not only from the target precursor ion, but also from interfering contaminants in the sample. Since the whole sample is labelled with the quantification tags, there is no way to separate the reporter ion signal of the target from that of the contaminants. This is simulated by the experiment described here. A yeast digest was labeled using TMT 6plex labels in a relative dilution curve forming the ratio 10 to 4 to 1 with the three lighter labels and back to 10 using the heavier labels. A human cell line was labelled as well, using only the 3 heaviest labels in the 6plex set to simulate interference. These figures show that, using the typical MS2 quantification method, the heavier label ratios of the yeast cell lysate would not follow the expected ratio due to human cell line interference. With multi-notch MS3 quantification, the resulting MS3 spectrum was observed to produce ratios that closely matched the expected ratios of the yeast digest, so interference was greatly reduced.
PEAKS allows you to analyze multi-notch MS3 data easily. In this example, an 8plex experiment was set up where an E. coli digest was set up to follow a 10 to 5 to 2 to 1 ratio using the lighter labels and the reverse for the heavier labels. The four heavier labels were used to simulate contamination with the human cell culture. This is how the results appear in the PEAKS heatmap. Red indicates up-regulation and green indicates down-regulation. With this type of display it is easy to see that the E. coli proteins follow the expected dilution curve and the human proteins show intense signal in the heavier channels and, as expected, almost no signal in the lighter channels. We’ll now talk about how to set up this kind of data in PEAKS.
Setting up the project is easy. Click the create project button indicated here and add the data. Enter the enzyme, instrument, and fragmentation type of the MS2 scans. You can then click the data refinement button to proceed through the workflow. When setting up identification parameters it is not necessary to add the labelling tag as a fixed modification if using the workflow. Once you select your method in the quantification step it will automatically be added to the identification search. However, if you are not using the workflow you must remember to add it as a fixed modification. During quantification, first set up your labelling method. This will make all of the labels appear in the experiment groups. Select all samples and add them to the right with this button. Select the mass error tolerance. In the case of MS3, high resolution mass spectrometry is typically used, so a tight error tolerance can be given. Select the mass spectrometry level where the reporter ions should be found. Then select the identification cut-off method. This is important for ensuring that only high quality peptides will be used for protein quantification. If you chose to use a decoy database, a 1% FDR cut-off is recommended. You can then click finish and let PEAKS work! It will create quantification results without any more input.
Selection of High Quality Spectra
Another important step in ensuring accurate and sensitive quantification results is selecting high quality spectra. An easy to use display is essential for this so that manual inspection can be used to ensure that the quantification results are reliable. PEAKS does this by putting all the important information in one display. This is the peptide view where all the peptide quantification info can be seen in the top pane. The identification result can be seen in the middle. Then a view of the quantification labels can be seen at the bottom. If MS3 was used, this will be the MS3 scan. If MS2 was used, a zoomed in view of the labels in the MS2 scan will be shown. Also, filters can be used from the summary view to select high quality spectra using this edit button. Identification quality plays a major role in quantification, so set an identification -10lgP threshold. PEAKS also calculates a quality score, which considers identification -10lgP, noise around reporter ions, and mass error of the reporter ions. Higher intensity reporter ions are also more reliable, so an intensity threshold can be set. You can set a minimum number of channels as well to prevent missing values from affecting your protein quantification results.
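The filters described above can be thought of as a simple pass/fail test on each spectrum. The sketch below is a hypothetical illustration: the field names, the default thresholds, and the choice to count only channels whose reporter intensity meets the intensity threshold are assumptions, not the PEAKS implementation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ReporterSpectrum:
    identification_score: float                   # identification -10lgP
    quality: float                                # quality score
    reporter_intensities: List[Optional[float]]   # None marks a missing channel


def passes_filters(
    spectrum: ReporterSpectrum,
    min_score: float = 20.0,        # assumed threshold
    min_quality: float = 0.0,       # assumed threshold
    min_intensity: float = 1e4,     # assumed threshold
    min_channels: int = 4,          # assumed threshold
) -> bool:
    """Return True if the spectrum would be kept for protein quantification."""
    usable_channels = sum(
        1
        for intensity in spectrum.reporter_intensities
        if intensity is not None and intensity >= min_intensity
    )
    return (
        spectrum.identification_score >= min_score
        and spectrum.quality >= min_quality
        and usable_channels >= min_channels
    )
```

Requiring a minimum number of usable channels is what prevents spectra with many missing reporter ions from distorting the protein ratios.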
Protein Ratio Estimation
Once these filters are set, the protein display will only show supporting peptides that pass these filters. These are the high quality peptides that are used for protein ratio calculation. You also have manual control: click the checkbox in the used column to remove a peptide. This will remove it from the protein ratio and significance calculation.
Quantification significance is calculated at the protein level. Select either the ANOVA or PEAKS Q significance options. Either one is a calculation of the likelihood that the observed change between conditions is significant. In either case, the -10lg of the p-value is used, so a cut-off of 20 is suggested. You can also select a Benjamini-Hochberg cut-off. Select the modified exclusion checkbox to exclude peptides with variable modifications from protein ratio calculation. Modified peptides have a different ionization efficiency than unmodified ones, so we give you the option to exclude them to prevent this from affecting your quantification results.
The end result is a confident list of protein quantification results that can then be exported and shared with your colleagues from the summary view.
Combine Multiple Experiments
Now let’s talk about another major problem with isobaric labelling: multiplexing. Since the largest experiment you can currently run is a 10plex experiment, you are limited in the number of samples you can use. So, if more samples are required, more experiments are required. This is when a global reference standard should be used. For example, here 131 is used as a reference to link experiments 1 and 2. The 131 channels in samples 1 and 2 are replicates, so the abundance of the peptides in these two channels should be similar. This means they can be used for inter-experiment normalization. Intra-experiment normalization methods are provided as well. In this case, two 6plex experiments were combined together. This allowed us to clearly find several proteins that were consistently differentially expressed between the two experiments.
Inter Experimentation Normalization
If you are using this type of experiment, it can be configured from this experiment settings button once quantification is complete. From this page, select ‘all experiments’ from the select experiment drop-down menu. Select the ‘perform inter experiment normalization’ checkbox, and add all but the reference channels to the right in the experiment groups section. This will ensure that only the experimental samples will be shown in the heatmap. In this case, select 131 as the spiked channel for both samples. Then, click the ‘exclude spike channel for significance’ button. The reference channels are not expected to change between experiments, and significance is by definition a measure of change, so including them will negatively impact the significance score. We therefore give you the opportunity to remove them.
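The idea behind using a shared reference channel can be sketched as follows: within each experiment, every channel is expressed relative to the reference channel, which puts the experiments on a common scale. This is a minimal illustration with made-up intensities; the function name is hypothetical and the exact normalization PEAKS performs is not described here.

```python
from typing import Dict


def normalize_to_reference(
    channels: Dict[str, float], reference: str = "131"
) -> Dict[str, float]:
    """Express each non-reference channel relative to the reference channel."""
    ref_intensity = channels[reference]
    if ref_intensity <= 0:
        raise ValueError("reference channel has no signal")
    return {
        label: intensity / ref_intensity
        for label, intensity in channels.items()
        if label != reference
    }


# Made-up reporter intensities for one protein in two experiments.
experiment_1 = {"126": 200.0, "127": 400.0, "131": 100.0}
experiment_2 = {"126": 50.0, "127": 150.0, "131": 50.0}
normalized_1 = normalize_to_reference(experiment_1)
normalized_2 = normalize_to_reference(experiment_2)
```

After this step the values from both experiments are ratios to the same reference material, so they can be compared directly even though the raw intensities differ between runs.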
Intra Experimentation Normalization
The next step is to perform intra experiment normalization. Click the normalization button. From here, select auto normalization. Auto normalization sums the intensity of all reporter ion channels of all quantifiable peptides. This is then used as a global ratio within the experiment.
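A minimal sketch of this kind of sum-based normalization is shown below: the reporter intensities of all quantifiable peptides are summed per channel, and each channel is then scaled so that its total matches that of a chosen channel. The choice of the first channel as the base, the function names, and the example numbers are assumptions.

```python
from typing import List


def normalization_factors(
    peptide_intensities: List[List[float]], base_channel: int = 0
) -> List[float]:
    """Per-channel scaling factors derived from the summed intensities.

    Each inner list holds one peptide's reporter intensities, one per channel.
    """
    totals = [sum(column) for column in zip(*peptide_intensities)]
    base_total = totals[base_channel]
    return [base_total / total for total in totals]


def apply_factors(
    peptide_intensities: List[List[float]], factors: List[float]
) -> List[List[float]]:
    """Scale every peptide's channel intensities by the channel factors."""
    return [
        [value * factor for value, factor in zip(row, factors)]
        for row in peptide_intensities
    ]


# Made-up data: the second channel was loaded at twice the amount.
raw = [[100.0, 200.0], [300.0, 600.0]]
factors = normalization_factors(raw)
normalized = apply_factors(raw, factors)
```

This approach assumes that most peptides do not change between channels, so a difference in channel totals reflects unequal sample loading rather than biology.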
With these options set it is now possible to compare multiple experiments with TMT or iTRAQ labeling.
Thank you for listening; if you’d like to try PEAKS Q with your own data you can request a demo at biosoft.ca. Also, subscribe to our channel to learn more about PEAKS, complete software for proteomics.