Analysis of SUMO-Isopeptides with Atypical Tryptic Iso-chains and Shorter Iso-chains Derived from Alternative Digestion Strategies
SUMO-Isopeptides with Atypical Iso-chains Generated from Tryptic Digestion
The challenging physicochemical nature of the full-length tryptic SUMO iso-chains renders these highly charged SUMO-isopeptides refractory to general proteomic analysis. A number of research groups have therefore used proteomic strategies aimed at improving the amenability of these SUMO-isopeptides to MS-based proteomics, with a view to enabling the high-throughput analysis of putative SUMO-isopeptides from more complex biological samples. One set of such approaches (not discussed at length in this chapter) uses mutagenesis to deplete the SUMO protein of lysine residues and to substitute arginine into the iso-chain in order to generate SUMO-isopeptides with shorter iso-chains [41, 42]. While these mutagenic approaches have improved the analysis of SUMO-isopeptides and increased the number of SUMO-isopeptides identified, the adverse biological impact that lysine depletion and multiple amino acid substitutions have on the SUMOylation of target proteins cannot be overlooked. To avoid this impact, a strategy has been devised that involves the MS analysis of SUMO(2/3)-isopeptides derived from atypical tryptic cleavage of wild-type SUMO(2/3)ylated proteins, and of SUMO(1)-isopeptides derived from the independent use of alternative proteolytic digestion of SUMO(1)ylated proteins.
Mass spectrometry sequence-grade trypsin and Lys-C can also independently generate less frequent atypical cleavages along the SUMO iso-chain, yielding iso-chains that are much shorter than those generated from typical full-length tryptic and Lys-C cleavages. For example, atypical tryptic cleavage C-terminal to glutamine residues at positions 88/89 in the SUMO(2) iso-chain and 89/90 in the SUMO(3) iso-chain generates iso-chains of 4 and 3 amino acids, QTGG and TGG, whose amino acid composition is identical in SUMO(2) and SUMO(3). The resulting isopeptides proved more amenable to analysis than the full-length SUMO(2/3)-isopeptides. In addition, independent digestion of a SUMO(1)ylated protein with an alternative enzyme such as elastase generated SUMO(1)-isopeptides with a short GG iso-chain, which were likewise amenable to MS-based proteomic analysis.
QTOF and LTQ Orbitrap-based LC-MS/MS utilizing low-energy collision-cell CID and low-energy linear ion-trap CID, combined with an unbiased database searching strategy, has been used to analyze these analytically amenable isopeptides generated from independent proteolytic digestion of di-SUMO(2/3)ylated proteins in simple and more complex systems. First, QTOF-based LC-MS/MS was used to develop an approach to analyze the di-SUMO(2)-isopeptides and di-SUMO(3)-isopeptides from simple digestion samples. The conventional Mascot search algorithm, commonly used for MS-based proteomic analyses, was used to analyze the data by treating consecutive additions of the first 10 amino acid residues from the C-terminus of the SUMO(2/3) iso-chains as consecutive variable modifications to all available lysine residues. This bioinformatic process was termed “consecutive residue additions to lysine” (CRA(K)). Mascot generated a list of putative SUMO(2)/(3)-isopeptides, and all isopeptide ion hits were considered. The MS/MS spectra generated from low-energy collision-cell CID analysis of the putative isopeptides were manually interpreted using theoretical backbone fragment ions generated by Mascot, along with theoretically calculated iso-chain ions, to confirm the identity of the putative SUMO(2)/(3)-isopeptides. The CID MS/MS spectra were of advantageously limited complexity, a consequence of the short TGG and QTGG iso-chains. A comprehensive series of predominantly singly charged y-type product ions from the isopeptide backbone was observed in the CID MS/MS spectra, along with evidence of additional singly charged b'-type product ions generated from the TGG and QTGG iso-chains, enabling comprehensive structural elucidation of the SUMO(2)/(3)-isopeptides (Figure 6.2).
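The CRA(K) idea can be illustrated with a minimal Python sketch that enumerates the consecutive C-terminal additions of an iso-chain and the mass each would add to a lysine residue. This is an illustration under stated assumptions, not the configuration used in the work described: the function names are hypothetical, only the QTGG iso-chain named in the text is used as input (a longer C-terminal sequence would be needed to reach 10 residues), and the mass added to lysine is taken to be the sum of the monoisotopic residue masses, since forming the isopeptide bond releases one water molecule. The iso-chain b'-type values are simple N-terminal prefix sums plus a proton.

```python
# Standard monoisotopic residue masses (Da) for the residues in QTGG.
RESIDUE_MASS = {"G": 57.02146, "T": 101.04768, "Q": 128.05858}
PROTON = 1.007276


def cra_k_modifications(iso_chain_c_term, max_residues=10):
    """Hypothetical helper: list (fragment, delta_mass) for 1..n C-terminal residues.

    Each delta is the mass a variable modification on lysine would carry if the
    last n residues of the iso-chain were attached through an isopeptide bond.
    """
    mods = []
    limit = min(max_residues, len(iso_chain_c_term))
    for n in range(1, limit + 1):
        fragment = iso_chain_c_term[-n:]
        delta = sum(RESIDUE_MASS[aa] for aa in fragment)
        mods.append((fragment, round(delta, 4)))
    return mods


def iso_chain_b_ions(iso_chain):
    """Hypothetical helper: theoretical singly charged b'-type m/z values.

    Computed as prefix sums of residue masses from the iso-chain N-terminus
    plus one proton; not every listed ion need appear in a real spectrum.
    """
    ions = []
    running = 0.0
    for i, aa in enumerate(iso_chain, start=1):
        running += RESIDUE_MASS[aa]
        ions.append((f"b'{i}", round(running + PROTON, 4)))
    return ions


if __name__ == "__main__":
    for fragment, delta in cra_k_modifications("QTGG"):
        print(f"{fragment:>5}  +{delta:.4f} Da on K")
    for label, mz in iso_chain_b_ions("QTGG"):
        print(f"{label}  m/z {mz:.4f}")
```

In a real search, each delta would be entered as a separate variable modification on lysine; how that is done depends on the search engine and is not shown here.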
The CRA(K) method developed to analyze these SUMO(2)/(3)-isopeptides from simple digestion samples was validated on a more complex sample: an anti-Ub pulldown of an HEK293-T cell lysate. The much faster LTQ Orbitrap-based LC-MS/MS was used to analyze a tryptic digest of the more complex Ub/Ubl-enriched protein sample, and the data were submitted for bioinformatic analysis using the Mascot-enabled CRA(K) approach. Mascot generated a list of putative SUMO(2)/(3)-isopeptides, and all isopeptide hits were considered. The MS/MS spectra generated from low-energy linear ion-trap CID analysis of the putative isopeptides were manually interpreted to confirm the identity of the putative SUMO(2)/(3)-isopeptides. The QTOF-based LC-MS/MS analysis of a SUMO(1)-isopeptide derived from the elastase digestion of a SUMO(1)ylated RanGap protein fragment was also performed. The Mascot-enabled CRA(K) approach was applied in the same way, although relaxed enzyme specificity was selected because elastase was used. Mascot generated the putative SUMO(1)-isopeptide containing a GG iso-chain. Notably, GG is also the iso-chain generated from tryptic digestion of ubiquitinated proteins and is therefore not fully diagnostic of SUMO; however, tryptic digestion of the SUMO(1)ylated RanGap protein fragment did not generate this or any similar SUMO(1)-isopeptide. The MS/MS spectrum generated from low-energy collision-cell CID analysis of the putative isopeptide was manually interpreted to confirm the identity of this putative SUMO(1)-isopeptide. The CID MS/MS spectrum displayed a comprehensive series of predominantly singly charged b- and y-type product ions from the backbone of the SUMO(1)-isopeptide (Figure 6.3), enabling its comprehensive structural elucidation and confirmation of the presence of the GG iso-chain.
Interestingly, the series of higher-order b-type product ions observed is uncharacteristic of collision-cell CID data and is likely due to the isopeptide backbone containing, at its N-terminus, a histidine residue whose side chain is highly basic in the gas phase. This allows b-type product ions to be stabilized in the collision cell in addition to y-type product ions, similar to the product-ion spectra observed for peptides with an N-terminal lysine generated using Lys-N.
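The theoretical backbone b- and y-type product ions used in this kind of manual interpretation can be tabulated with a short Python sketch. It is a minimal illustration, not the calculation performed by Mascot: the function name and the peptide sequence `HAKLE` are hypothetical (the actual isopeptide sequence is not given here), only singly charged ions are computed, and the GG iso-chain is modeled as a fixed mass added to the modified lysine.

```python
# Standard monoisotopic residue masses (Da).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276
GG_DELTA = 2 * RESIDUE_MASS["G"]  # mass added to lysine by a GG iso-chain


def fragment_ions(peptide, mod_index=None, mod_delta=0.0):
    """Hypothetical helper: singly charged b and y m/z lists for a backbone.

    mod_index is the 0-based position of the lysine carrying the iso-chain;
    mod_delta is the mass the iso-chain adds to that residue.
    """
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    if mod_index is not None:
        if peptide[mod_index] != "K":
            raise ValueError("the iso-chain must be placed on a lysine")
        masses[mod_index] += mod_delta
    b_ions, y_ions = [], []
    for i in range(1, len(masses)):
        # b_i: first i residues plus a proton
        b_ions.append(round(sum(masses[:i]) + PROTON, 4))
        # y_i: last i residues plus water and a proton
        y_ions.append(round(sum(masses[-i:]) + WATER + PROTON, 4))
    return b_ions, y_ions


if __name__ == "__main__":
    # Hypothetical backbone with an N-terminal histidine and a GG-modified lysine.
    b, y = fragment_ions("HAKLE", mod_index=2, mod_delta=GG_DELTA)
    for i, (b_mz, y_mz) in enumerate(zip(b, y), start=1):
        print(f"b{i} {b_mz:10.4f}    y{i} {y_mz:10.4f}")
```

Fragments that contain the modified lysine are shifted by the iso-chain mass, which is how a b- or y-ion series localizes the GG iso-chain to a particular residue. The same mass shift would arise from a tryptic Ub-derived GG remnant, so the fragment masses alone cannot distinguish the two.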
The CRA(K) approach does not require specialist software algorithms and instead uses the widely available Mascot software, thereby facilitating the LC-MS/MS analysis and subsequent data analysis of analytically amenable SUMO(1)/(2)/(3)-isopeptides generated from independent proteolytic digestion of SUMO(1)/(2)/(3)ylated proteins.
Figure 6.3 A collision-cell CID MS/MS spectrum of a SUMO(1)-isopeptide derived from the elastase digestion of SUMO(1)ylated RanGapI 418-587 protein fragment. Source: Chicooree, 2013. Reproduced with permission of Wiley.