Analysis of Protein Phosphorylation by Mass Spectrometry
Prior to the advent of “soft” ionization techniques for MS, the analysis of protein phosphorylation was commonly carried out by performing extensive, iterative enzymatic digestion of 32P-labeled proteins, followed by a variety of electrophoretic and chromatographic separations. Phosphoamino acids were identified from partial acid hydrolysates by comigration in a 2D separation of 32P-labeled amino acids with authentic phosphoamino acid standards whose location was marked by ninhydrin staining. The cellulose thin-layer plates used in these separations were often exposed for days to detect in vivo phosphorylated amino acids. Determining the sequence-specific location of the phosphoamino acid was accomplished by measuring radioactivity released from 32P-containing peptides after iterative cycles of manual Edman degradation. With the introduction of fast atom bombardment (FAB) in 1984, MS would eventually become the default method for the analysis of protein phosphorylation, eliminating many of the sequential enzymatic digestion and separation steps. However, the use of 32P in combination with MS would continue into the early 2000s.
FAB brought to MS the ability to ionize peptides without fragmenting them. The adduction of a proton to the intact peptide produced an [M+H]+ ion, which made it a trivial task to determine the molecular weight of the peptide. Because FAB worked well with simple mixtures, it was now also possible to analyze enzymatic digests of proteins and map the derived peptide masses onto the established or predicted amino acid sequence of the protein. For larger proteins, the enzyme digests could be fractionated into simpler mixtures using reversed-phase (RP) HPLC. In this way the translated amino acid sequence of even very large genes could be confirmed [68, 69].
PTM of a protein results in a change in mass of the modified amino acid. This mass shift is readily observed on peptides containing the modified amino acid. Thus it was quickly realized that phosphorylated peptides might be readily observed in the FAB spectrum of a protein digest by looking for peptide masses 80 Da (HPO3) higher than predicted. The first reported analysis of a phosphoprotein by FAB was the identification of a 23-amino acid peptide from chicken egg yolk riboflavin-binding protein that contained eight phosphorylation sites. In this case the authors used 25 µg of peptide to record the spectrum and were fortunate in that the protein was homogeneously and stoichiometrically phosphorylated. The phosphorylated peptide was identified by comparison with peptides from a dephosphorylated protein. In 1986 it was demonstrated that peptides could be sequenced by tandem mass spectrometry (MS/MS), and a few years later the same group sequenced three phosphopeptides from spinach chloroplast Photosystem II proteins. While MS/MS was clearly an emerging technique, throughout most of the 1980s the location of specific sites of phosphorylation was still most conveniently determined by measuring the release of 32P during Edman sequencing [64, 73].
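The 80 Da mass-shift search described above can be sketched in a few lines. This is only an illustration: the peptide names, masses, and the 0.02 Da tolerance are hypothetical, and the function name `phospho_candidates` is not taken from any published tool.

```python
HPO3 = 79.96633  # monoisotopic mass of HPO3 in Da (nominally 80 Da)

def phospho_candidates(predicted, observed, max_sites=3, tol=0.02):
    """Match observed masses to predicted peptide masses plus n x HPO3.

    predicted: dict of peptide name -> predicted neutral mass (Da)
    observed:  list of observed neutral masses (Da)
    Returns a list of (peptide name, number of phosphates, observed mass).
    """
    hits = []
    for name, mass in predicted.items():
        for n in range(1, max_sites + 1):
            target = mass + n * HPO3
            for obs in observed:
                if abs(obs - target) <= tol:
                    hits.append((name, n, obs))
    return hits

# Hypothetical digest: two predicted peptides and three observed masses.
predicted = {"pepA": 1000.500, "pepB": 1500.750}
observed = [1000.501, 1080.467, 1660.683]
print(phospho_candidates(predicted, observed))
```

In this made-up example the second observed mass matches pepA plus one phosphate, and the third matches pepB plus two.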
In 1988 two new soft ionization techniques were introduced, which would revolutionize the analysis of biomolecules. Franz Hillenkamp and Michael Karas showed that matrix-assisted laser desorption/ionization (MALDI) could ionize large proteins and produce singly charged intact protein ions that could be mass measured in a time-of-flight (TOF) mass spectrometer. As early as 1988 proteins as large as an IgG (Mr = 149,190) and the tetrameric glucose isomerase (Mr = 172,460) were analyzed as intact biopolymers. While MALDI shared many analytical attributes with FAB, it was more sensitive for peptides by as much as 3 orders of magnitude, was less susceptible to contaminants and excipients in the sample, and, as mentioned, had a very large mass range. It was soon shown that peptides and phosphopeptides could be sequenced at the subpicomole level using MALDI on a reflectron TOF mass spectrometer by recording the metastable fragment ions produced in the field-free region of the drift tube. Metastable ions produced by the loss of various phosphate-containing groups from the side chain of phosphoamino acids made it possible to distinguish tyrosine phosphorylation from serine and threonine phosphorylation on peptides. An abundant [MH-H3PO4]+ ion, accompanied by a smaller [MH-HPO3]+ ion, indicates that a peptide is most likely phosphorylated on serine or threonine. In contrast, phosphotyrosine-containing peptides generally exhibit [MH-HPO3]+ fragment ions and little, if any, [MH-H3PO4]+.
While MALDI continues to be used as an important tool in the analysis of proteins and peptides, electrospray (ES) ionization, introduced by John Fenn in 1989, has emerged as the dominant MS method for the analysis of most types of biomolecules. From an analytical perspective, ES ionization differs from MALDI largely in the fact that it is a solution-based ionization method and that it produces primarily multiply charged ions from proteins and peptides. While a description of the mechanistic features of ES ionization is outside of the scope of this chapter, it can be generalized that with the mass spectrometer operating in the positive ion mode, electrosprayed proteins and peptides will produce intact molecular ions containing approximately as many charges as there are basic groups on the molecule or, for large proteins, one charge for every 1000 Da of molecular mass. For proteins this usually results in a series of molecular ions, each containing a different number of charges. Most peptides are typically represented by only a few charge forms.
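The relationship between molecular mass, charge, and observed m/z for multiply protonated ions can be sketched as follows. The protein mass used and the m/z 500-3000 detection window are illustrative assumptions, and the helper names are hypothetical.

```python
PROTON = 1.007276  # mass of a proton in Da

def mz(mass, z):
    """m/z of an [M + zH]z+ ion for a neutral mass (Da) and charge z."""
    return (mass + z * PROTON) / z

def charge_envelope(mass, mz_min=500.0, mz_max=3000.0, max_z=200):
    """List (z, m/z) for every charge state falling inside the m/z window."""
    return [(z, round(mz(mass, z), 2))
            for z in range(1, max_z + 1)
            if mz_min <= mz(mass, z) <= mz_max]

# Hypothetical protein of about 17 kDa: many charge states share one window.
for z, value in charge_envelope(16951.5):
    print(z, value)
```

Each charge state gives an independent estimate of the same molecular mass, which is why a series of multiply charged ions can be used to mass-measure a protein on an instrument with a limited m/z range.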
Perhaps the biggest advantage that ES ionization has over MALDI is that being a solution-based ionization method, it is easily interfaced to liquid chromatography, in particular RP-HPLC. Furthermore, because of voltage considerations in the ion source, ES ionization was not easily implemented on magnetic sector mass spectrometers and thus was interfaced primarily on low-voltage, less expensive quadrupole mass spectrometers, which also conveniently handled the flow rate of conventional HPLCs. Because of the multiple charging phenomenon, typical of ES, the charge envelope of a protein or peptide always falls more or less in the same mass range (m/z 500-3000), regardless of the size of the protein or peptide, and this mass range is also conveniently within the operating mass range of most quadrupole mass spectrometers. Thus for a variety of practical and theoretical reasons, ES was primarily implemented on quadrupole mass spectrometers, which made the technique much more accessible to a large research community and galvanized investigations into how to use this important new tool for biological research. The development of the triple quadrupole-tandem mass spectrometer interfaced via an ES ion source made possible the direct sequencing of peptides as they flowed into the mass spectrometer.
Because the signal intensity produced in an ES mass spectrometer depends on concentration rather than absolute amount of the analyte, it is desirable to use the lowest flow rate possible to achieve the best sensitivity. Already by 1992 microcapillary HPLC flowing at 1-2 µL/min had been interfaced to an ES ion source. In 1994 Wilm and Mann introduced the concept of nanoelectrospray, where they flowed peptide and protein samples at a rate of 20-30 nL/min from pulled glass capillaries. This method was widely adopted and became known as static nanospray since no HPLC pumps or separations were employed. In the same year Emmett and Caprioli introduced a microelectrospray source that accommodated 300-800 nL/min, spraying directly from a capillary needle packed with C18 RP material. The realization that ES was stable at these low flow rates quickly led to the development of nanoliter flow rate LC (NanoLC) coupled with ES ionization MS [85, 86]. NanoLC-MS technology, having matured over the last twenty years, when combined with modern biochemical and cell biology techniques, provides a tool that has the sensitivity and selectivity necessary for the analysis of in vivo biology regulated by protein phosphorylation.
To understand how phosphorylation regulates biology, the following questions need to be addressed: “Is a protein phosphorylated?” “Where is it phosphorylated?” “Is the phosphorylation regulated?” “What is the stoichiometry?”
Each of these questions poses a unique analytical challenge, and a variety of strategies utilizing MS can be applied to meet the challenges.
Unlike the riboflavin protein mentioned earlier, most proteins are phosphorylated at substoichiometric levels, often as low as a few percent. Therefore, in practice identifying phosphopeptides in anything other than the simplest mixtures can be quite challenging. In addition to the problem of low stoichiometry, the ion yield for peptides in general can be very different, with the result being that some peptides are difficult to detect regardless of the complexity of the sample. The ion yield for peptides is the product of both the ionization efficiency of the molecule (which in turn is related to proton affinity and gas-phase basicity) and the rate of vaporization of the molecule in the ES process. These factors are strongly dependent on the physicochemical properties of the side chains of the individual amino acids that make up the peptide. In general peptides with hydrophobic amino acids desorb more efficiently from the charged droplets created by the ES process, and basic residues enhance the ionization efficiency of the molecule. However, since both Arg and Lys are rather hydrophilic, more than a single Arg or Lys will decrease the desorption efficiency of the peptide [87, 88]. The influence of the addition of one or more phosphate groups to a peptide, and how this affects the ionization efficiency, is a matter of some debate. A commonly held belief is that phosphorylated peptides have lower ionization efficiencies than their nonphosphorylated counterparts. Conceptually this would seem to make sense. The phosphate group is negatively charged and therefore should affect the ionization efficiency under the acidic conditions employed in the positive mode. In addition, phosphoamino acids are very hydrophilic, thereby affecting the vaporization of the peptide from the charged droplets. In practice, however, it is evidently more complicated.
In one of the largest studies done on this subject, a diverse set of twenty synthetic phosphopeptides (P) were mixed in two different defined ratios with their nonphosphorylated (NP) counterparts and measured by positive ion ES using low-pH RP-HPLC to introduce the samples. Surprisingly more phosphopeptides demonstrated better ionization efficiency than their nonphosphorylated counterparts, and in almost 70% of the cases the ratio of P/NP was within a factor of two . Most of these peptides were nontryptic and several contained multiple basic residues, particularly arginine. In the case of very basic peptides, the ionization efficiency was strikingly charge state dependent. In contrast, a separate study examining 66 model peptides and their singly phosphorylated counterparts found that in nearly all cases the phosphopeptides had lower ionization efficiency than their nonphosphorylated counterparts . But here again, the average difference in relative ionization efficiency was only a factor of two. In this study three sets of peptides were created based on a single sequence and all contained a C-terminal Lys residue. The charge and hydrophobicity in each set were varied by making fixed substitutions at specific sites within the sequence. However, there was no apparent correlation between the relative response of the peptides and their molecular weight, charge, or hydrophobicity. Interestingly, those peptides containing an Arg residue showed relative ionization efficiencies close to 1, consistent with the previous study and work in our own lab . Since the most common approach to protein analysis by MS involves digestion of the protein with trypsin, it is fair to ask whether tryptic phosphopeptides have similar relative response ratios as those described previously. Nearly all of the peptides in the two studies described earlier were decidedly nontryptic. 
Unfortunately, no large studies using matched synthetic tryptic peptides have been conducted to assess the effect of phosphorylation on ionization efficiency. From the literature we were able to extract measured relative intensity ratios for 29 fully tryptic phosphopeptides and their counterparts [89, 92-96], and in all but one case the ratio was very nearly 1.
A less well-controlled but still informative alternative to using synthetic peptides to measure relative responses is to treat half of a phosphopeptide sample with phosphatase and then measure the response of the phosphorylated and dephosphorylated sample. Using a TiO2-enriched phosphopeptide sample, Marcantonio matched 452 tryptic phosphopeptides to their dephosphorylated counterparts and determined that on average the phosphorylated peptides had a 1.5-fold lower response. Whereas only 33% of singly phosphorylated peptides had a lower response than their dephosphorylated counterpart, multiply phosphorylated peptides were twice as likely to show a greater than twofold lower response.
From the aforementioned cases we can conclude that the ionization efficiency of phosphopeptides relative to their nonphosphorylated counterparts is difficult to determine. For singly phosphorylated peptides, they would appear, in general, to have a response similar to that of the nonphosphorylated counterpart, in most cases within a factor of two. However, there are certainly cases where this is not true and where the difference is dramatic. Unfortunately, these differences are not predictable. Even the position of the phosphorylated residue in an otherwise unaltered sequence can have a dramatic effect on the response . The uncertainty in the relative response of the phosphorylated and nonphosphorylated form of a given peptide makes determining phosphorylation stoichiometry problematic. The extent of phosphorylation is determined by dividing the abundance of a phosphopeptide by the sum of the abundances of the phosphorylated and nonphosphorylated forms. For measurements made by MS, the ion intensities for all charge states of each form are used. However, this determination only holds true if the responses of the two forms of the peptide are equal. While the data presented earlier would suggest that at least for singly phosphorylated peptides any error resulting from making this assumption is likely to be small, we prefer, in the absence of any empirical evidence, to refer to determinations made in this way as “apparent” stoichiometry. The uncertainty in these apparent stoichiometries is nevertheless an uncomfortable situation, and so alternative ways to determine phosphorylation occupancy are important to consider.
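The apparent stoichiometry calculation described above, summing ion intensities over all charge states of each form, can be sketched as follows. The intensities are made up, and the optional `response_ratio` argument (response of the phosphorylated form relative to the nonphosphorylated form) is an added assumption showing how an empirically measured ratio could correct the apparent value; with the default of 1.0 the two responses are taken as equal.

```python
def apparent_stoichiometry(phospho, nonphospho, response_ratio=1.0):
    """Fraction phosphorylated from per-charge-state ion intensities.

    phospho, nonphospho: dict of charge state -> ion intensity
    response_ratio: assumed P/NP response (1.0 means equal responses)
    """
    p = sum(phospho.values()) / response_ratio
    n = sum(nonphospho.values())
    return p / (p + n)

# Hypothetical intensities for the 2+ and 3+ ions of each form.
phospho = {2: 300.0, 3: 100.0}
nonphospho = {2: 500.0, 3: 100.0}
print(apparent_stoichiometry(phospho, nonphospho))        # equal responses
print(apparent_stoichiometry(phospho, nonphospho, 0.5))   # P responds half as well
```

The second call illustrates how strongly the result depends on the response assumption: the same spectrum yields a noticeably higher occupancy if the phosphopeptide is assumed to ionize half as efficiently.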
The most straightforward and by far most widely applied approach to deriving absolute stoichiometry is to synthesize the phosphorylated and nonphosphorylated forms of the peptide and measure their relative responses in a series of mixtures. If the measured responses are linear over a range of mixtures, then the response ratio (if not equal to 1) can be used to correct the apparent stoichiometry [94, 99]. If the peptides are synthesized using stable isotope-encoded amino acids, then the peptides can be added directly to the sample, providing both a measure of absolute abundance and a way to calculate stoichiometry. This latter approach has the advantage that if the target peptide is produced with ragged ends due to adjacent tryptic cleavage sites, peptides that extend across the cleavage sites can be synthesized and added to the sample prior to digestion. In this way internal standards are produced regardless of the preference of the enzyme for the cleavage sites. Peptide synthesis has become relatively inexpensive, and it is now possible to buy peptides in smaller quantities. It is still a requirement that the peptides be pure, or the final determination of the absolute stoichiometry will be no more certain than the apparent stoichiometry.
Clearly this approach does not scale well as the number of sites to be determined increases. Nor is it always possible to synthesize peptides for certain long or multiply phosphorylated species that are unstable over time. As an alternative, our laboratory described an approach that uses phosphatase digestion and differential chemical labeling to determine absolute stoichiometry. As shown in Figure 2.3a, an enzymatic digest is split into two equal parts, and one part is treated with phosphatase and the other undergoes a mock treatment. Both samples are then derivatized with a chemical label, one light and the other containing heavy stable isotopes. After mixing, the sample is analyzed by MS. All nonphosphorylated peptides will appear as doublets with equal intensity ratios. The presence of a doublet with a different intensity ratio followed by a singlet peak 80 Da higher in mass is an indication of a partially phosphorylated peptide. In the example shown in Figure 2.3, a sample containing the three peptides shown (Figure 2.3b) is taken through the protocol and reanalyzed (Figure 2.3c). The nonphosphorylated peptide (?) generates two peaks of equal intensity at m/z 723.81 and 726.31. The phosphorylated (O) and nonphosphorylated (•) peptide pair give rise to three peaks. The peak at m/z 706.31 arises from the half of the sample that was treated with alkaline phosphatase. It represents the total amount of the sequence present in that half of the sample (sum of phosphopeptide and nonphosphopeptide). The peak at m/z 703.81 comes from the half of the sample that was not treated with phosphatase, and it thus represents the original amount of nonphosphorylated peptide. The peak at m/z 743.81 also comes from the half of the sample that was not treated with phosphatase, and it thus represents the original amount of the phosphorylated peptide. The stoichiometry (S) is determined from the intensities of the nonphosphorylated pair using the equation S = 1 - (IL/IH), where IL is the intensity of the untreated sample tagged with the light label (L) and IH is the intensity of the phosphatase-treated sample tagged with the heavy label (H). In this case there is no question of different ionization efficiencies, and the increase in intensity of the nonphosphorylated peptide originating from the phosphatase-treated sample represents that portion of the sequence that was originally phosphorylated. In this example the absolute stoichiometry is found to be 45%. Peptides that are stoichiometrically phosphorylated undergoing this protocol would be represented by a nonphosphorylated heavy-labeled singlet and a corresponding light-labeled phosphorylated singlet.
Figure 2.3 Determination of absolute stoichiometry. (a) Schematic showing the generalized protocol. Any suitable label that provides an adequate mass shift may be used. (b) Partial ES mass spectrum of a sample containing a phosphorylated (O) peptide and its nonphosphorylated counterpart (•). The apparent stoichiometry determined from this spectrum is 31%. (c) After being taken through the protocol, each peptide adds one propionyl group, either d0 or d5. The absolute stoichiometry determined from the nonphosphorylated d0/d5 cluster is 45%. See text for details.
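The stoichiometry equation for the phosphatase and differential labeling protocol is simple enough to sketch directly. The intensities below are hypothetical values chosen to reproduce a 45% occupancy; they are not read from Figure 2.3, and the input checks are an added assumption about how one might guard against unusable data.

```python
def absolute_stoichiometry(i_light, i_heavy):
    """S = 1 - (IL/IH) for the nonphosphorylated light/heavy doublet.

    i_light: intensity from the untreated half (light label)
    i_heavy: intensity from the phosphatase-treated half (heavy label)
    """
    if i_heavy <= 0:
        raise ValueError("heavy-labeled intensity must be positive")
    if i_light < 0:
        raise ValueError("light-labeled intensity cannot be negative")
    return 1.0 - (i_light / i_heavy)

# Hypothetical doublet: the untreated peak is 55% as intense as the treated peak.
print(absolute_stoichiometry(55.0, 100.0))
```

A doublet of equal intensity gives S = 0 (no phosphorylation), while a missing light-labeled peak gives S = 1, corresponding to the stoichiometrically phosphorylated case.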
This basic strategy has been adapted to a variety of chemical labels [93, 102, 103] and was recently shown to be applicable on a large scale. Gygi and coworkers determined the stoichiometries for more than 5000 phosphorylation sites in asynchronous cultures of Saccharomyces cerevisiae. This study showed that only about 10% of the sites identified showed full or nearly full occupancy, confirming the overall low stoichiometry and suggesting that phosphorylation regulates function by influencing only a small fraction of the available protein molecules.
As simple and elegant as the previous approach is, there are a number of important caveats that need to be kept in mind. It is important to perform the phosphatase reaction on the peptide level, as the activity of phosphatases on proteins may be unpredictable. Differentially phosphorylated peptides will collapse down to a single nonphosphorylated species. Thus the determined stoichiometry will be for the total amount of phosphate on that peptide. It will be impossible to discriminate the stoichiometry for multiple sites of monophosphorylation on a single peptide or the contribution of mono- and diphosphorylated peptides sharing the same sequence. Finally there are two circumstances that are problematic for any method used to determine site-specific stoichiometry. The first is the ability to distinguish the relative contribution of two different sites on the same peptide, and the second is when a site occurs adjacent to an enzyme cleavage site. In the case of the latter, the nonphosphorylated peptide will have a completely different (shorter) sequence. For a single protein or a simple mixture, these situations will likely be recognized and might be resolved by choosing alternative enzyme digestions. For a complex sample such as a cellular proteome, these circumstances will be more difficult to resolve and will undoubtedly complicate the interpretation of the data.
Once a peptide has been identified as being phosphorylated, the next major analytical challenge is to localize the site of phosphorylation. Peptides are most commonly sequenced by MS/MS using collision-induced dissociation (CID). Collisions with an inert gas cause the protonated peptides to fragment at the amide bonds along the backbone. Because the proton, which directs the fragmentation, can reside at multiple locations along the backbone, a distribution of fragment ions will result. Charged fragments containing the N-terminus form a bn ion series, while charged fragments containing the C-terminus constitute a yn ion series (a subscripted number denotes the nth amino acid from the terminus where the fragmentation occurred). Fragmentation along the backbone is not completely random, since certain amino acids influence the localization of the proton. This influence is manifest in the different intensities of the various b and y ion fragments. In addition to the amide bond cleavages, both peptide molecular ions and fragment ions can undergo neutral loss from the amino acid side chains. With a few exceptions, phosphopeptides fragment much like nonphosphorylated peptides. A detailed description of peptide fragmentation via CID can be found elsewhere .
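The b and y ion series described above can be computed from residue masses. The following is a minimal sketch for singly charged fragments using standard monoisotopic residue masses; the peptide SASK and the phosphosite position are hypothetical, and the function name is illustrative rather than taken from any search engine.

```python
# Monoisotopic residue masses (Da) of the 20 common amino acids.
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276
HPO3 = 79.96633  # mass added by one phosphate group

def fragment_ions(sequence, phosphosites=()):
    """Return (b, y) lists of singly charged fragment m/z values.

    phosphosites holds 1-based residue positions carrying a phosphate.
    b[k-1] is the b_k ion and y[k-1] is the y_k ion, for k = 1 .. len-1.
    """
    masses = [RESIDUE[aa] + (HPO3 if i in phosphosites else 0.0)
              for i, aa in enumerate(sequence, start=1)]
    n = len(masses)
    b = [sum(masses[:k]) + PROTON for k in range(1, n)]
    y = [sum(masses[n - k:]) + WATER + PROTON for k in range(1, n)]
    return b, y

# Hypothetical peptide SASK phosphorylated on the third residue.
b_ions, y_ions = fragment_ions("SASK", phosphosites={3})
for k, (b, y) in enumerate(zip(b_ions, y_ions), start=1):
    print(f"b{k} = {b:.4f}   y{k} = {y:.4f}")
```

Moving the phosphate from one serine to another changes only those fragments whose cleavage lies between the two candidate sites, which is why a series running across the modified residue is needed to localize it.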
Fragmentation readily occurs at those locations where the least amount of energy is required to break a chemical bond. Fragmentation of a phosphopeptide on the side chain of a phosphorylated amino acid is an energetically favorable process that competes very effectively with cleavage at the amide bonds. The facile cleavage of H3PO4 (98 Da) and HPO3 (80 Da) from the phosphopeptide precursor ion often gives rise to very abundant neutral loss ions (M-H3PO4 and M-HPO3, respectively). In the case of phosphoserine- or phosphothreonine-containing peptides, the neutral loss of H3PO4 can dominate the spectrum, suppressing the formation of sequence-specific b and y ions. Phosphotyrosine-containing peptides on the other hand are not able to lose H3PO4. The aromatic ring on tyrosine stabilizes the carbon-oxygen bond. Since the oxygen-phosphorus bond is much weaker, cleavage here results in the loss of HPO3. Because the loss of the phosphate group from tyrosine is much less facile than from either serine or threonine, the resulting neutral loss ion is much less abundant in the spectrum of phosphotyrosine-containing peptides. The charge state of the precursor and the number of basic residues (related characteristics) also strongly influence the neutral loss of phosphate. In general, higher charge state precursors tend to show a lower degree of neutral loss. A detailed examination of the neutral loss pathway and the sequence-related factors that influence it are described elsewhere.
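Because a neutral loss leaves the charge on the ion, the observed m/z shift is the neutral mass divided by the charge state. A small sketch, assuming monoisotopic masses and a hypothetical doubly charged precursor at m/z 700.0:

```python
H3PO4 = 97.976897  # monoisotopic mass of H3PO4 (Da)
HPO3 = 79.966332   # monoisotopic mass of HPO3 (Da)

def neutral_loss_mz(precursor_mz, z, neutral_mass):
    """m/z of the ion remaining after a neutral loss from a z+ precursor."""
    return precursor_mz - neutral_mass / z

# Hypothetical doubly charged phosphopeptide precursor.
precursor = 700.0
print(neutral_loss_mz(precursor, 2, H3PO4))  # shift of about 49 m/z units
print(neutral_loss_mz(precursor, 2, HPO3))   # shift of about 40 m/z units
```

For a triply charged precursor the same losses appear as shifts of roughly 32.7 and 26.7 m/z units, so the charge state must be known before a peak is assigned as a phosphate neutral loss.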
The prevalence of fragment ions resulting from the neutral loss of phosphate is also dependent on two nonsequence-related factors. The collision energy imparted during the CID process and the time frame in which it takes place can have a profound effect on the production of neutral loss ions. Ion trapping type instruments use resonant excitation of cool ions to induce hundreds of collisions over a relatively long period of time. Only a few vibrational quanta of energy are transferred with each collision. This slow heating process favors low-energy fragmentation pathways such as the neutral loss of phosphate. Thus ion trap MS/MS spectra of serine- and threonine-phosphorylated peptides tend to be dominated by very abundant ions resulting from the neutral loss of phosphate from the precursor (taking into account the other factors described earlier). On the other hand, instruments that are tandem in space accelerate ions into a gas-filled collision cell, where only a handful of collisions occur and much greater energy is transferred at each collision (typically 15-40 eV). CID under these conditions tends to favor backbone cleavages. Technically referred to as low-energy CID (relative to magnetic sector instruments where collision energies are in the 5-10 keV range), this process still imparts much more energy than the very-low-energy resonance excitation process. Furthermore, the fragment ion spectrum can be readily tuned by adjusting the collision energy through increasing or decreasing the accelerating potential on the precursor ion. Recently, separate multipole collision cells have been introduced into orbitrap and quadrupole orbitrap instruments, combining the best features of true collision cell CID and the sensitivity and high performance of the orbitrap. Unfortunately termed higher-energy collision-induced dissociation (HCD), the collision energies in this regime are nevertheless similar to those of conventional low-energy CID. As expected, the phosphate neutral loss is less prevalent on these instruments, and the fragmentation pathways can be controlled to some extent by adjusting the collision energies.
The human proteome contains nearly 15% serine, threonine, and tyrosine by amino acid composition. Thus most phosphopeptides will have more than one possible location for the phosphorylation to reside. To be certain of the phosphorylation site location, an ion series of primary backbone fragments should run across the phosphorylated residue. In practice this is not so easily achieved. Phosphoserine- and phosphothreonine-containing bn and yn ions readily lose phosphate to produce bn-H3PO4 and yn-H3PO4 ions (designated bnΔ and ynΔ, respectively). Interestingly, bn ions are much more likely to exhibit this loss. It is not clear whether this is because the neutral loss from a bn ion forms a more stable product or that bn ions are more likely to undergo a neutral loss. Regardless, this suggests that since yn ions are more likely to contain an intact phosphoamino acid, this series may be more valuable in localizing the site of phosphorylation. In addition to the loss of phosphate, peptides can also readily lose H2O (-18) and NH3 (-17) as neutrals from both the precursor and backbone fragment ions. For peptides phosphorylated on serine or threonine, the subsequent loss of phosphate and water from bn and yn ions dramatically complicates the assignment of the phosphorylation site.
The spectrum shown in Figure 2.4a has been assigned to the sequence SASQSSLDKLDQELK plus two moles of phosphate. An unmodified b2 ion (SA-) indicates no phosphorylation on S1. The next two b ions resulting from cleavage after S3 (SAS-) and Q4 (SASQ-) are represented as b3Δ and b4Δ, suggesting S3 is phosphorylated. The b5Δ ion representing the fragment SASQS- could result from the loss of phosphate from either S3 or S5; however, since there is no b5ΔΔ, the former is more likely. Cleavage after the next amino acid (SASQSS-) results in a b ion that loses two phosphate groups to produce b6ΔΔ. This suggests that the second phosphate group is on S6, not S5. However, the lack of an intact b5 or b6 ion leaves some doubt. There is an intact y ion series from y1 to y9 confirming the C-terminal sequence. Unfortunately the next y ion (-SLDKLDQELK) could represent either y10Δ or y10*, and the possibility of phosphorylation on either S5 or S6 converges after y10. The appearance of y13ΔΔ (two losses of phosphate) reinforces the assignment of phosphate on S3. So from this spectrum the phosphorylation is assigned to S3 and S6, but the evidence is not unambiguous.
The second spectrum shown in Figure 2.4b is from a singly phosphorylated peptide with the sequence TSSIADEGTYTLDSILR. An extensive primary y ion series from y1 to y14 indicates that there is no phosphorylation on the four possibilities in the C-terminus. A single N-terminal fragment, represented by an unmodified b2 ion, suggests that the phosphate is on S3. With no other supporting evidence, point mutants were made for each of the first three amino acids, and phosphorylation exclusively on S3 was confirmed using an in vivo activity assay.
The ability of MS/MS to localize sites of phosphorylation may be further compromised by the recent evidence that in the gas phase, phosphate groups can migrate to other hydroxyl-containing amino acids [110, 111]. In a study of thirty-three fully tryptic synthetic phosphopeptides, 45% showed evidence for rearrangement of the phosphate group. These rearrangements were most prevalent in peptides that showed a significant neutral loss of phosphate from the precursor. Interestingly, these rearrangements were not observed in mass spectrometers that are tandem in space, the much shorter time frame of the CID process being inconsistent with the rearrangement reaction.
Figure 2.4 Phosphorylation site localization by tandem MS. (a) CID spectrum of a doubly phosphorylated peptide assigned by MASCOT to the sequence SASQSSLDKLDQELK. The doubly charged precursor was fragmented using HCD on a quadrupole orbitrap hybrid. Although the spectrum is not dominated by the neutral loss of phosphate from the precursor, the abundant loss of H3PO4 from fragment ions makes localization of the sites difficult; however there is evidence for the localization as shown. See the text for details.
(b) CID spectrum of a singly phosphorylated peptide assigned to the sequence TSSIADEGTYTLDSILR. The doubly charged precursor was fragmented using low-energy CID on a quadrupole time-of-flight hybrid. Notice the residual unfragmented precursor (marked •). Neutral loss of phosphate from the precursor is a minor ion (not marked). With only a single N-terminal fragment, it was not possible to unambiguously assign the phosphorylation on any of the first three residues. See the text for details. Peptide fragment ion nomenclature is that of Biemann, except that bn or yn ions marked with either Δ or ΔΔ refer to bn or yn-H3PO4 or bn or yn-2(H3PO4), respectively. Peaks labeled with only Δ or ΔΔ refer to [M+2H]2+ minus one and two H3PO4 groups. Loss of water from the preceding bn or yn ion is marked *.
The inability to localize the site of phosphorylation due to a lack of backbone fragmentation has led to the development of methods, primarily in trapping instruments, to activate and sequence the prominent M-H3PO4 ions. After a typical MS/MS (MS2) data acquisition sequence is performed and the neutral loss ion identified in a data-dependent manner, the trap is refilled and the sequence repeated. This time however, following the initial MS2 fragmentation, the neutral loss ion is isolated, activated, and fragmented. Since the precursor for this round of fragmentation does not contain a phosphate group, these MS3 spectra contain primarily backbone cleavages. Phosphoserine and phosphothreonine residues will have the in-chain masses of dehydroalanine (69 Da) and dehydrobutyric acid (83 Da), respectively. A complicating issue in this approach is the recent evidence that what appear to be M-H3PO4 ions might actually derive from the loss of HPO3 (-80) accompanied by the loss of H2O (-18) from a nearby serine or threonine. In this case, the MS3 spectrum of an M-98 ion would be incorrectly interpreted. The actual phosphoserine residue would have the in-chain mass of an unmodified serine, and the actual unmodified serine would have the in-chain mass of dehydroalanine. Even more complicating would be the situation where the M-98 ion was a mixture of M-H3PO4 and M-HPO3 followed by M-H2O. The combined loss of HPO3 and H2O was much less prominent in HCD spectra, and we would expect this also to be true for conventional low-energy CID tandem-in-space instruments.
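The in-chain masses quoted above follow from simple arithmetic on monoisotopic masses: the residue gains HPO3 on phosphorylation and then loses H3PO4, a net loss of water. A short sketch, with an illustrative function name:

```python
SER = 87.03203    # serine residue mass (Da)
THR = 101.04768   # threonine residue mass (Da)
HPO3 = 79.96633   # added on phosphorylation
H3PO4 = 97.97690  # lost as a neutral during activation

def residue_after_h3po4_loss(residue_mass):
    """In-chain mass of a phosphorylated residue after loss of H3PO4."""
    return residue_mass + HPO3 - H3PO4

print(residue_after_h3po4_loss(SER))  # dehydroalanine, nominally 69 Da
print(residue_after_h3po4_loss(THR))  # dehydrobutyric acid, nominally 83 Da
```

The same 69 Da residue results when an unmodified serine loses water (87 - 18), which is the arithmetic behind the ambiguity between loss of H3PO4 and the combined loss of HPO3 and H2O.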
An alternative to performing MS3 acquisitions is a technique called multistage activation (MSA). In this approach the phosphopeptide precursor ion is activated and fragmented as usual, but the fragment ions are left in the trap, and the m/z at which the neutral loss ion would appear is further activated and undergoes additional fragmentation. All the ions are collected in one spectrum, which is a composite of the first and second activation processes. The main advantages of an MSA acquisition over an MS3 acquisition are that it takes much less time, that it results in only a single spectrum, and that the signal-to-noise ratio of MSA spectra is much better.
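The m/z at which the neutral loss ion appears depends on the precursor charge state, because the 98 Da loss is divided by the charge. A minimal sketch of that calculation, assuming the monoisotopic mass of H3PO4 and a hypothetical function name:

```python
H3PO4 = 97.97690  # monoisotopic mass of H3PO4 in Da


def neutral_loss_mz(precursor_mz, charge, n_losses=1):
    """m/z of the precursor after losing n_losses molecules of H3PO4.

    The neutral carries no charge, so the charge state is unchanged
    and the m/z shift is the neutral mass divided by the charge.
    """
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return precursor_mz - n_losses * H3PO4 / charge


# Example: the same precursor m/z at two charge states.
for z in (2, 3):
    print(z, round(neutral_loss_mz(800.0, z), 4))
```

For a 2+ precursor the shift is about 49 m/z units, and for a 3+ precursor about 32.7, which is where the second activation would be applied.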
The value of performing either MS3 or MSA is debatable. For proteome-scale studies it is not clear that the impact on the duty cycle of the experiment is offset by an increase in phosphopeptide identifications. The evidence in the literature is quite conflicting [113, 116-119]. The assumption is that while fewer spectra are acquired, the quality of these spectra will result in more identifications. MSA would seem to have an advantage here in that it does not impose as large a hit on the sequencing duty cycle. In practice MS2 CID spectra, while not very “good looking,” frequently contain sufficient backbone fragmentation to identify the sequence and in many cases localize the site of phosphorylation. The evidence that the additional stages of activation add anything substantial to the sequence data already present in an MS2 spectrum is not compelling. In our own laboratory we find that more class 1 phosphopeptide identifications are made with MS2 than with MSA, even if the MASCOT scores of the latter are better. In practice these approaches are not widely used in phosphoproteomics studies, but may prove more useful in the analysis of isolated phosphoproteins or very simple mixtures where the duty cycle imposed by the experiment is not an issue.
In addition to the sequence and structural features unique to phosphopeptides, certain sequence features common to all peptides complicate the localization of phosphorylation sites. Very long peptides often suffer from the fact that ion series tend to die out as the collision energy is dissipated over the molecule, leaving large gaps in the sequence information. Tryptic peptides with missed cleavages have internal basic amino acids, which hold a charge firmly and inhibit cleavages based on the mobile proton model of fragmentation. Peptides can undergo internal cleavages that lead to an out-of-context sequence series. A problem unique to ion trap instruments is the lack of information at the low mass end of the spectrum due to the 1/3 cutoff rule: an unwelcome feature of resonance excitation is the loss of all fragment ions below roughly 1/3 of the precursor m/z. This problem is not an issue in hybrid trapping instruments that employ an HCD collision cell or in other mass spectrometers that are tandem in space.
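The effect of the 1/3 cutoff rule on a fragment list can be sketched as follows. The 1/3 fraction comes from the rule as stated here; the function names and the example fragment m/z values are assumptions for illustration, and the actual cutoff on a given instrument depends on its activation settings.

```python
def low_mass_cutoff(precursor_mz, fraction=1 / 3):
    """Approximate lowest fragment m/z retained during resonance excitation."""
    return precursor_mz * fraction


def observable_fragments(precursor_mz, fragment_mzs, fraction=1 / 3):
    """Return the fragment m/z values at or above the low mass cutoff."""
    cutoff = low_mass_cutoff(precursor_mz, fraction)
    return [mz for mz in fragment_mzs if mz >= cutoff]


# Hypothetical fragment list for a precursor at m/z 900.
fragments = [147.11, 175.12, 262.14, 450.20, 780.40]
print(round(low_mass_cutoff(900.0), 1))
print(observable_fragments(900.0, fragments))
```

In this example the three low-mass fragments fall below the cutoff of about m/z 300 and would be absent from an ion trap CID spectrum, whereas an HCD or tandem-in-space spectrum could retain them.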
To mitigate the labile nature of many PTMs, including phosphorylation, and the impact this has on localizing the site of modification, as well as to circumvent some of the other less desirable sequence-specific features of CID, low-energy electron-based dissociation techniques have been developed. In electron capture dissociation (ECD) [120, 121], protonated precursor ions capture low-energy electrons (0.2 eV), leading to rapid charge neutralization and fragmentation that is very specific to the amino acid N-Cα bond. ECD spectra are dominated by backbone fragments of the c and z• (radical) type and show very little neutral loss or side chain fragmentation. ECD spectra of phosphopeptides are characterized by long runs of contiguous sequence ions, where the phosphate group stays intact. Unfortunately ECD suffers from low fragment ion yields, poor sensitivity, and the need to perform the experiment in a Fourier transform ion cyclotron resonance (FTICR) mass spectrometer.
Electron transfer dissociation (ETD) was developed by Syka and Coon a few years later as an alternative electron-based dissociation technique that could operate in an ion trap. During ETD, radical anions are formed in a separate chemical ionization source and then mixed with protonated peptides in the ion trap. Electrons are then transferred to the peptides by ion-ion reactions. Although the ionization process is different, the fragmentation mechanism in ETD is thought to be mechanistically similar to ECD, producing the same type of fragment ions and little or no neutral loss fragmentation.
Common to both types of electron-based dissociation methods is the fact that they work more efficiently on smaller peptides with a charge of 3+ or higher, that is, peptides with a greater charge density. Trypsin, the most commonly used enzyme for digestion in proteome studies, produces mostly peptides with a 2+ charge. In a comparison of CID and ETD for tryptic peptides, less than 1% of the identified ETD peptides were 2+. Interestingly, Lys-C peptides, fifty percent of which would be expected to contain an internal arginine and therefore be 3+ or higher, performed no better than tryptic peptides. This is likely due to the fact that Lys-C peptides containing an internal arginine would be longer and therefore not have an increase in charge density. To account for the fact that CID outperforms ETD for 2+ peptides, a decision tree strategy has been developed, which allows the mass spectrometer to opt for CID or ETD depending on the charge state and m/z of the peptide. Recently, it has been shown that HCD outperforms ETD in a phosphoproteome experiment for the identification of phosphopeptides and the localization of the site [32, 126], although ETD does perform better on multiply phosphorylated peptides. HCD is now in widespread use on commercial Orbitrap-based mass spectrometers, providing true collision cell fragmentation, with no low mass cutoff and high-resolution mass measurement of fragment ions. The most current generation of quadrupole TOF mass spectrometers offers similar performance characteristics. While ETD remains an attractive alternative dissociation technique, its use is still not widespread, and currently it does not show signs of replacing low-energy CID for the routine sequencing of phosphopeptides.
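A decision tree of this kind reduces to a small lookup on charge state and precursor m/z. The sketch below shows only the shape of such logic: the m/z thresholds and the function name are illustrative assumptions, not values given in this text, and would need to be set from the instrument method or the original decision tree publication.

```python
# Illustrative thresholds only (assumed, not taken from this text):
# for each charge state, ETD is chosen when the precursor m/z is below
# the listed value, i.e. when the charge density is high enough.
ETD_MAX_MZ = {3: 650.0, 4: 900.0, 5: 950.0}


def choose_dissociation(charge, precursor_mz):
    """Pick 'CID' or 'ETD' for a precursor from its charge state and m/z."""
    if charge <= 2:
        return "CID"   # CID outperforms ETD for 1+ and 2+ precursors
    if charge >= 6:
        return "ETD"   # very highly charged precursors go to ETD
    return "ETD" if precursor_mz < ETD_MAX_MZ[charge] else "CID"


for z, mz in [(2, 600.0), (3, 600.0), (3, 700.0), (4, 850.0), (6, 1200.0)]:
    print(z, mz, choose_dissociation(z, mz))
```

The design reflects the charge density argument made above: a 3+ precursor at low m/z is a short, highly charged peptide suited to ETD, while a 3+ precursor at high m/z is a longer peptide that is routed to CID.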
Isomeric phosphopeptides, which are identical in sequence but carry the phosphate group on different residues, can also cause difficulties in analysis. Isomeric phosphopeptides that cannot be separated by standard LC procedures will not be resolved in the MS1 precursor scan due to their identical masses. This limitation will result in a mixed fragment ion population in MS2 and subsequently confound phosphosite localization efforts. Gas-phase ion mobility separation is one way to resolve such mixtures. Usually these devices operate between the LC system and the first mass analyzer and separate different ionic species based on their differential mobility through an electric field that oscillates in strength. Field asymmetric waveform ion mobility spectrometry (FAIMS) has been particularly useful in separating isomeric phosphopeptides, as has been described from a study of a Drosophila cell phosphoproteome.
Continuous significant improvements in MS performance over the last ten years have brought about increasingly faster data acquisition rates and dramatically improved sensitivity that has been coupled with higher mass accuracy and resolution. Despite the overall low stoichiometry, the overwhelming number of nonphosphorylated peptides, and the potential for lower ionization efficiency, high-performance tandem mass spectrometers in use today are very effective at identifying phosphorylation sites in relatively simple mixtures such as the tryptic digest of an immunopurified protein. Rather than attempting to identify phosphorylated peptides based on differences in an MS1 spectrum and targeting those peptides for sequencing, MS/MS is used to shotgun sequence as many peptides as possible and phosphorylated peptides are distinguished from nonphosphorylated peptides by sophisticated database search algorithms that use accurate mass and fragment ion spectra. In this approach phosphopeptides are identified and phosphorylation sites localized in the same event (taking into account all the caveats described earlier). The sequencing speed and high sensitivity of state-of-the-art mass spectrometers ensure very high sequence coverage (80-95%) for even very faintly stained Coomassie Blue bands on SDS-PAGE gels. While high sequence coverage is not a guarantee that all phosphorylation sites will be identified (it only guarantees you have found all the nonphosphorylated peptides), it does increase your chances. If the overall sequence coverage for the protein is low, then digestion with an alternative enzyme is a useful strategy. Alternate enzymes (e.g., lysyl endopeptidase, endoproteinase Asp-N, endoproteinase Glu-C (V8), chymotrypsin, elastase), used alone or in combination with trypsin, can improve sequence coverage and phosphorylation site identification [130-132].
Increasing the amount of starting material can make a difference, as can switching from an in-gel digest to a solution digest. If the analysis of a phosphoprotein is to be done by MALDI, then some form of enrichment (described later) will be useful.
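Sequence coverage, as used above, is the fraction of residues in the protein that fall within at least one identified peptide. A minimal sketch of the calculation, assuming peptides are matched by exact substring search on the unmodified sequence; the function name and the example sequences are hypothetical.

```python
def sequence_coverage(protein, peptides):
    """Fraction of protein residues covered by at least one peptide."""
    if not protein:
        return 0.0
    covered = [False] * len(protein)
    for peptide in peptides:
        if not peptide:
            continue
        # Mark every occurrence of the peptide, including overlapping ones.
        start = protein.find(peptide)
        while start != -1:
            for i in range(start, start + len(peptide)):
                covered[i] = True
            start = protein.find(peptide, start + 1)
    return sum(covered) / len(protein)


# Hypothetical 16-residue protein and two identified peptides.
protein = "MKTAYIAKQRQISFVK"
peptides = ["TAYIAK", "QISFVK"]
print(f"{sequence_coverage(protein, peptides):.0%}")
```

Comparing the coverage obtained with trypsin alone against the coverage from combined digests gives a quick indication of whether an alternative enzyme has exposed regions of the sequence that were previously unobserved.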
For the identification of phosphopeptides from more complex mixtures such as multiprotein complexes or intact proteomes, the same basic shotgun sequencing approach is coupled with a phosphopeptide enrichment strategy. In most cases the global analysis of protein phosphorylation will also require a fractionation step at the peptide or protein level prior to enrichment.