In this application note, we demonstrate an application of MSE, using the Waters SYNAPT MS System coupled with UltraPerformance LC (UPLC), for characterization of a yeast enolase tryptic digest.
Impurities such as modifications and sequence variants generated from transcriptional/translational errors are common in recombinant protein products, and may affect their safety and activity. Effective control of these variants requires sensitive and reproducible methods for protein production monitoring.
Liquid chromatography (LC)-based peptide mapping is a key method for protein structure characterization and purity analysis. However, although the ultraviolet (UV) or mass spectrometry (MS) detectors used in traditional LC/UV or LC-MS peptide mapping methods are sensitive enough to detect low-level impurities in recombinant proteins, they cannot characterize unexpected contaminants. Additional time-consuming tandem mass spectrometry (MS/MS) measurements are required to elucidate unknown sequences. Furthermore, peptides resulting from unexpected proteolytic cleavages often make the LC separation and the assignment of LC peaks more difficult.
Recently, LC combined with data-independent acquisition mass spectrometry (MSE) has been employed to analyze peptide maps with a high sequence coverage (>90%).1 Excellent analytical reproducibility was obtained from replicate analyses of the protein digest. In MSE 2-3, the parallel and unbiased data acquisition mode not only overcomes the repeatability limitations of data-dependent acquisition (DDA) LC-MS/MS experiments, but also ensures the sampling of low-abundance peptides from low-level impurities. The obtained MS and MSE spectra of such peptides allow for identification of unknown impurities in the sample.
In this application note, we demonstrate an application of MSE, using the Waters SYNAPT MS System coupled with UltraPerformance LC (UPLC), for characterization of a yeast enolase tryptic digest. Multiple protein contaminants as well as unexpected peptides resulting from non-specific digestion were identified. The results demonstrate that UPLC-MSE methodology is capable of identifying and quantifying low-level impurities in protein products. LC peaks from unexpected partially tryptic and non-tryptic cleavages are assigned and distinguished from peptides originating from impurity proteins. This methodology may also be used to accelerate the development of protein purification strategies.
Yeast enolase and ammonium bicarbonate (NH4HCO3) were purchased from Sigma Chemical Co. (St. Louis, MO, U.S.); sequencing-grade trypsin from Promega Corp. (Madison, WI, U.S.); formic acid (FA) from EM Sciences (Gibbstown, NJ, U.S.); and Optima-grade acetonitrile (ACN) from Fisher Scientific (Pittsburgh, PA, U.S.). The water used in all procedures was supplied by a Millipore Milli-Q purification system (Bedford, MA, U.S.).
Yeast enolase was dissolved in 100 mM NH4HCO3 to prepare a 5 μg/μL protein solution. Fifty microliters of this solution were utilized for tryptic digestion, accomplished by adding 5 μg of sequencing-grade trypsin and incubating at 37 °C overnight. The digest was diluted to 1.5 pmol/μL with 5:95 ACN/water containing 0.1% FA prior to UPLC/MSE analysis.
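The digestion amounts above imply an enzyme-to-substrate ratio and a dilution factor that are easy to check. The following is a minimal sketch of that arithmetic. The enolase molecular weight used here is an assumed approximate value that is not stated in this note, and the small volume contributed by the trypsin addition is ignored.

```python
# Values taken from the sample preparation description.
PROTEIN_CONC_UG_PER_UL = 5.0   # enolase stock concentration, ug/uL
DIGEST_VOLUME_UL = 50.0        # volume of stock taken for digestion, uL
TRYPSIN_UG = 5.0               # trypsin added, ug
TARGET_PMOL_PER_UL = 1.5       # digest concentration before injection

# ASSUMPTION: approximate average mass of yeast enolase 1 (not given in the note).
ASSUMED_ENOLASE_MW_DA = 46_700.0


def substrate_per_enzyme(substrate_ug, enzyme_ug):
    """Mass of substrate per unit mass of enzyme (w/w)."""
    return substrate_ug / enzyme_ug


def molar_conc_pmol_per_ul(conc_ug_per_ul, mw_da):
    """Convert ug/uL to pmol/uL: (g/L) / (g/mol) = mol/L, and 1 mol/L = 1e6 pmol/uL."""
    return conc_ug_per_ul / mw_da * 1e6


protein_ug = PROTEIN_CONC_UG_PER_UL * DIGEST_VOLUME_UL           # protein digested, ug
ratio = substrate_per_enzyme(protein_ug, TRYPSIN_UG)             # substrate:enzyme (w/w)
stock_pmol_per_ul = molar_conc_pmol_per_ul(
    PROTEIN_CONC_UG_PER_UL, ASSUMED_ENOLASE_MW_DA
)
dilution_factor = stock_pmol_per_ul / TARGET_PMOL_PER_UL

print(f"Protein digested: {protein_ug:.0f} ug, trypsin:protein = 1:{ratio:.0f} (w/w)")
print(f"Stock is about {stock_pmol_per_ul:.0f} pmol/uL, dilute about {dilution_factor:.0f}-fold")
```

Under the assumed molecular weight, the stock corresponds to roughly 107 pmol/μL, so reaching 1.5 pmol/μL requires a dilution of about 70-fold; a different molecular weight would shift this figure proportionally.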
All analyses were performed using a SYNAPT MS System controlled by MassLynx 4.1 Software. An ACQUITY UPLC System equipped with a 2.1 x 100 mm BEH 300Å 1.7 μm Peptide Separation Technology Column was used for the separation, which was performed at 40 °C. Peptides were eluted with a 60 min gradient (0 to 50% B). Mobile phase A was 0.1% FA in water; mobile phase B was 0.1% FA in ACN. The flow rate was 0.2 mL/min. An auxiliary pump delivered a lockmass solution (100 femtomole [Glu1]-fibrinopeptide B (GFP) in 50:50 ACN/water containing 0.1% FA) for mass accuracy reference.
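As an illustration of the gradient conditions, the sketch below computes the mobile phase composition at a given time and the solvent volume delivered over the run. It assumes the gradient is linear from 0 to 50% B over 60 min, which the note does not state explicitly; the function and variable names are illustrative only.

```python
GRADIENT_MIN = 60.0        # gradient duration, min
START_B, END_B = 0.0, 50.0 # mobile phase B at start and end, %
FLOW_ML_PER_MIN = 0.2      # flow rate, mL/min


def percent_b(t_min):
    """Percent B at time t, ASSUMING a linear ramp; times outside the gradient are clamped."""
    t = min(max(t_min, 0.0), GRADIENT_MIN)
    return START_B + (END_B - START_B) * t / GRADIENT_MIN


def gradient_slope():
    """Rate of change of mobile phase B in % per minute."""
    return (END_B - START_B) / GRADIENT_MIN


def total_volume_ml():
    """Total solvent volume delivered during the gradient."""
    return FLOW_ML_PER_MIN * GRADIENT_MIN


def volume_b_ml():
    """Volume of B delivered: for a linear ramp the mean %B is the midpoint of the ramp."""
    mean_fraction_b = (START_B + END_B) / 2.0 / 100.0
    return total_volume_ml() * mean_fraction_b


print(f"%B at 30 min: {percent_b(30.0):.1f}")
print(f"Slope: {gradient_slope():.3f} %B/min")
```

With these settings the composition changes by a little under 1% B per minute, a shallow ramp of the kind typically used for peptide maps.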
Data-independent MS acquisition in the positive ion V-mode was accomplished by alternating the collision cell energy between a low-energy setting (5 V, transfer cell energy 5 V) and an elevated-energy setting (energy ramped from 20 to 40 V, transfer cell energy 10 V). The scan time was 0.5 sec in both acquisition modes (1 sec total duty cycle). In this configuration, both peptide precursor ion (MS) and fragmentation (MSE) data can be obtained in a single LC analysis.
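The alternating acquisition can be pictured as a simple schedule of interleaved scans. The sketch below is only a conceptual model of the timing described above: it assumes the collision energy ramp is linear within a scan and ignores any inter-scan delay, neither of which is specified in this note. The peak width passed to `spectra_per_channel` is a hypothetical input.

```python
SCAN_TIME_S = 0.5                      # scan time per acquisition mode
LOW_ENERGY_V = 5.0                     # low-energy collision cell setting
RAMP_START_V, RAMP_END_V = 20.0, 40.0  # elevated-energy ramp limits


def ramp_energy(fraction_of_scan):
    """Collision energy partway through an elevated-energy scan (linear ramp ASSUMED)."""
    return RAMP_START_V + (RAMP_END_V - RAMP_START_V) * fraction_of_scan


def scan_schedule(n_cycles):
    """Return (start_time_s, channel) tuples for alternating low/elevated scans."""
    schedule = []
    for cycle in range(n_cycles):
        t0 = cycle * 2 * SCAN_TIME_S
        schedule.append((t0, "low"))
        schedule.append((t0 + SCAN_TIME_S, "elevated"))
    return schedule


def spectra_per_channel(peak_width_s):
    """Complete spectra acquired in each channel across a peak of the given width."""
    return int(peak_width_s // (2 * SCAN_TIME_S))


for start, channel in scan_schedule(2):
    print(f"t = {start:.1f} s: {channel}-energy scan")
```

Because each channel is sampled once per 1 sec cycle, a chromatographic peak that is, for example, 12 s wide would be described by about 12 low-energy and 12 elevated-energy spectra in this model.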
Capillary voltage of 3.0 kV, source temperature of 100 °C, cone voltage of 37 V, and cone gas flow of 10 L/h were maintained during the analyses. Sampling of the lock spray channel was performed every 1 min. The system was tuned for a minimum resolution of 10,000 and calibrated to mass error of less than 3 ppm using a 100 femtomole GFP infusion before the experiments.
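To put the resolution and mass accuracy figures in concrete terms, the sketch below computes a ppm error, the absolute tolerance that 3 ppm represents, and the peak width implied by a resolution of 10,000. The lock mass value is the commonly cited m/z of doubly protonated GFP and is an assumption here, since this note does not state it. The single-point proportional correction is a simplified illustration, not a description of how MassLynx applies lock mass correction.

```python
# ASSUMPTION: commonly cited monoisotopic m/z of [M+2H]2+ for Glu-fibrinopeptide B.
ASSUMED_GFP_MZ = 785.8426


def ppm_error(measured_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def tolerance_da(mz, ppm):
    """Absolute m/z window corresponding to a ppm tolerance."""
    return mz * ppm / 1e6


def fwhm_at_resolution(mz, resolution):
    """Peak width (FWHM) implied by resolution R = m / delta_m."""
    return mz / resolution


def lockmass_correct(mz, measured_lock_mz, theoretical_lock_mz=ASSUMED_GFP_MZ):
    """Simplified single-point correction: rescale by the lock mass ratio."""
    return mz * theoretical_lock_mz / measured_lock_mz


print(f"3 ppm at the lock mass: +/- {tolerance_da(ASSUMED_GFP_MZ, 3.0):.4f} m/z")
print(f"FWHM at R = 10,000: {fwhm_at_resolution(ASSUMED_GFP_MZ, 10_000):.4f} m/z")
```

At this m/z, a 3 ppm tolerance is a window of roughly ±0.0024, far narrower than the peak width at a resolution of 10,000, which is why a periodically sampled reference is needed to maintain mass accuracy.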
The acquired data were processed using IdentityE Software. The low-energy (MS) and elevated-energy (MSE) data were background-subtracted, de-isotoped and charge-state-reduced to the corresponding monoisotopic masses, lockmass-corrected, and aligned (fragment ions with their corresponding precursor ions) by the retention time profile of each ion. The processed data were first searched against a yeast database with trypsin specificity and one potential missed cleavage. The processed data were then searched again, with no enzyme specified, against the protein sequences returned from the first search in order to find partially tryptic and non-tryptic peptides. Asparagine (N) deamidation and methionine (M) oxidation were allowed as variable modifications in these searches.
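The first search step relies on candidate peptides generated by in silico digestion. The following is a minimal sketch of such a digestion with up to one missed cleavage, plus the mass shifts introduced by the two variable modifications. It is not the IdentityE implementation. It assumes the common convention that trypsin cleaves after K or R except before P, and the example sequence is a made-up toy, not enolase.

```python
from itertools import product

# Monoisotopic mass deltas of the variable modifications, in Da.
DEAMIDATION_N = 0.984016
OXIDATION_M = 15.994915


def cleavage_sites(sequence):
    """Indices after K or R, skipping sites followed by P (ASSUMED trypsin rule)."""
    return [
        i + 1
        for i, residue in enumerate(sequence[:-1])
        if residue in "KR" and sequence[i + 1] != "P"
    ]


def tryptic_peptides(sequence, missed_cleavages=1):
    """All peptides bounded by cleavage sites, allowing up to N missed cleavages."""
    bounds = [0] + cleavage_sites(sequence) + [len(sequence)]
    peptides = []
    for i in range(len(bounds) - 1):
        last = min(i + 2 + missed_cleavages, len(bounds))
        for j in range(i + 1, last):
            peptides.append(sequence[bounds[i]:bounds[j]])
    return peptides


def variable_mod_shifts(peptide):
    """Possible total mass shifts from N deamidation and M oxidation."""
    n_counts = range(peptide.count("N") + 1)
    m_counts = range(peptide.count("M") + 1)
    return sorted(
        {round(n * DEAMIDATION_N + m * OXIDATION_M, 6)
         for n, m in product(n_counts, m_counts)}
    )


TOY_SEQUENCE = "MKAGRPLKDE"  # hypothetical sequence for illustration only
print(tryptic_peptides(TOY_SEQUENCE))
print(variable_mod_shifts("MK"))
```

The second, no-enzyme search removes the cleavage rule entirely, which greatly enlarges the candidate list; restricting it to the proteins found in the first search keeps that search space manageable.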
In the UPLC-MSE experiment, two sets of MS data are collected: low-energy (MS) and elevated-energy (MSE) chromatograms. Low-energy LC-MS data comprise accurate mass data for peptide precursors, while elevated-energy LC-MS data contain the fragment ions of the corresponding peptide precursors. IdentityE Software is used to combine the data into MS/MS spectra and to search for peptide sequences.
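The idea of combining the two data sets can be sketched as grouping fragment ions with precursors that elute at the same time. The example below is a deliberately simplified model, not the IdentityE algorithm: it matches on apex retention time alone within a hypothetical tolerance, whereas the note describes alignment by the full retention time profile of each ion. All m/z and retention time values are invented.

```python
def align_fragments(precursors, fragments, rt_tol_min=0.05):
    """Group fragment m/z values under every precursor with a matching apex time.

    precursors and fragments are lists of (mz, apex_rt_min) tuples.
    rt_tol_min is a HYPOTHETICAL tolerance chosen for illustration.
    """
    spectra = {precursor: [] for precursor in precursors}
    for fragment_mz, fragment_rt in fragments:
        for precursor in precursors:
            if abs(fragment_rt - precursor[1]) <= rt_tol_min:
                spectra[precursor].append(fragment_mz)
    return spectra


# Invented example ions: two precursors eluting about 5 minutes apart.
precursors = [(650.30, 20.00), (712.40, 25.10)]
fragments = [(175.12, 20.02), (288.20, 19.98), (401.25, 25.11), (503.30, 30.00)]

for precursor, fragment_list in align_fragments(precursors, fragments).items():
    print(f"precursor m/z {precursor[0]} at {precursor[1]} min -> {fragment_list}")
```

A fragment that matches no precursor is simply left unassigned, and co-eluting precursors share fragments in this model; resolving such ambiguity is where chromatographic resolution and accurate mass become important.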
Figure 1 shows the LC-MS chromatogram of the enolase tryptic digest for a 30 pmol sample injected on-column. The chromatogram features more than 100 resolved and detected peaks. The peak assignments shown in Figure 1 were made after identification of the peptides via a database search using MSE data (see Data Processing section for details).
The five proteins listed in Table 1 were identified via a search against the yeast proteome database (using trypsin specificity). As expected, enolase 1 was the top hit. Forty-two tryptic peptides (84% sequence coverage) were identified from this target protein, including three miscleaved peptides and two N-deamidated peptides (two isoforms each). These tryptic peptides (see Table 2) were assigned to the major peaks in the chromatogram, as labeled in Figure 1 using the Txx convention (green labels).
In addition to enolase 1 tryptic peptides, some additional significant peaks are present in Figure 1. Most belong to tryptic peptides of the four additional proteins identified from the same database search, suggesting that yeast enolase 1 is contaminated with a significant amount of other yeast proteins. As shown in Table 1, nine, seven, six and three unique tryptic peptides were identified from enolase 2, Cu-Zn superoxide dismutase, glucose-6-phosphate isomerase and triosephosphate isomerase, respectively. Their sequences and retention times are shown in Table 3, with a given annotation. The assignments of these peptides are labeled in blue in Figure 1.
The peak intensities in Figure 1 are suggestive of the relative abundance of each protein in the sample. MSE data and IdentityE Software support more rigorous quantification using a method described by Silva and colleagues.3 The relative protein concentration was determined from the ratio of the summed intensities of the three most abundant peptides identified from each protein (with the exception of enolase 2, a homolog of enolase 1). For the quantification of enolase 2, the three most abundant peptides unique to enolase 2 were used and compared with three peptides from enolase 1 with similar sequences. Figure 2 shows the relative concentrations of the proteins normalized to enolase 1; two of the protein contaminants are present at levels above 10%.
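The ratio calculation described above can be sketched in a few lines. The example below covers only the general case (sum of the three most intense peptides per protein, normalized to a reference protein) and does not implement the special handling described for enolase 2. The protein names and intensity values are invented for illustration and are not data from this study.

```python
def top3_sum(intensities):
    """Sum of the three most intense peptide signals for one protein."""
    return sum(sorted(intensities, reverse=True)[:3])


def relative_abundance(peptide_intensities, reference):
    """Top-3 summed intensity of each protein relative to the reference protein."""
    reference_sum = top3_sum(peptide_intensities[reference])
    return {
        protein: top3_sum(values) / reference_sum
        for protein, values in peptide_intensities.items()
    }


# HYPOTHETICAL peptide intensities (arbitrary units), not measured values.
example = {
    "target protein": [100.0, 90.0, 80.0, 10.0],
    "contaminant A": [20.0, 15.0, 10.0, 5.0],
}

for protein, ratio in relative_abundance(example, "target protein").items():
    print(f"{protein}: {ratio:.1%} of reference")
```

Using only the three strongest peptides makes the estimate less sensitive to how many peptides happen to be identified for each protein, which matters when a contaminant is represented by only a few peptides.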
Even after the assignment of peptides originating from protein contaminants, some peaks in Figure 1 remained unknown. An additional search for peptides with no enzyme specificity was performed using a truncated database consisting of only the proteins identified previously. This allows for identification of unusual protein cleavages and unexpected sequences. Indeed, several major LC peaks were identified as partially tryptic or non-tryptic peptides. Although the longest enolase 1 tryptic peptide (T21, 3736.97 Da) was not found in the LC chromatogram, a series of partially tryptic peptides (E1P9, E1P7, E1P7*, E1P7**, E1P1, E1P2) and non-tryptic peptides (E1N1, E1N1*, E1N2*) related to the T21 sequence were present in the sample. Other partially tryptic and non-tryptic peptides, as listed in Table 4, were also identified with high confidence. All MSE spectra of the partially tryptic and non-tryptic peptides listed in Table 4 were validated by additional manual inspection.
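The distinction between tryptic, partially tryptic, and non-tryptic peptides depends on whether each end of a peptide coincides with a trypsin cleavage site. The sketch below illustrates one way to make that classification. It uses a simplified rule (a terminus counts as tryptic if it follows or ends with K or R, or lies at a protein terminus) and considers only the first occurrence of the peptide. The protein string is a made-up toy context, not the real enolase 1 sequence.

```python
def classify_cleavage(protein, peptide):
    """Label a peptide by how many of its termini match trypsin specificity."""
    start = protein.find(peptide)  # first occurrence only
    if start < 0:
        raise ValueError("peptide not found in protein")
    end = start + len(peptide)

    # N-terminus is tryptic if it is the protein start or follows K/R.
    n_term_ok = start == 0 or protein[start - 1] in "KR"
    # C-terminus is tryptic if it is the protein end or the peptide ends in K/R.
    c_term_ok = end == len(protein) or peptide[-1] in "KR"

    labels = {2: "tryptic", 1: "partially tryptic", 0: "non-tryptic"}
    return labels[n_term_ok + c_term_ok]


# HYPOTHETICAL flanking sequence built around a peptide named in this note.
TOY_PROTEIN = "GSAKTSPYVLPVPFLNAGKDE"

for candidate in ("TSPYVLPVPFLNAGK", "TSPYVLPVPF", "SPYVLPVPF"):
    print(candidate, "->", classify_cleavage(TOY_PROTEIN, candidate))
```

In this toy context the full K-to-K peptide is tryptic, truncating its C-terminus gives a partially tryptic peptide, and truncating both ends gives a non-tryptic one.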
When including all identified peptides, the enolase 1 sequence coverage increased from 84% to 96%. The partially tryptic and non-tryptic peptides are labeled in red in Figure 1. Figure 3 shows an example of MSE spectrum for TSPYVLPVPF, a partially tryptic peptide originating from T21 of enolase 1.
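Sequence coverage is the percentage of residues in the protein that fall within at least one identified peptide, so adding partially tryptic and non-tryptic peptides can only raise it. A minimal sketch of the calculation follows; the sequence and peptides are invented toy strings, not the enolase 1 data behind the 84% and 96% figures.

```python
def sequence_coverage(protein, peptides):
    """Percentage of protein residues covered by at least one peptide."""
    covered = [False] * len(protein)
    for peptide in peptides:
        start = protein.find(peptide)
        while start != -1:  # mark every occurrence, including overlapping ones
            for i in range(start, start + len(peptide)):
                covered[i] = True
            start = protein.find(peptide, start + 1)
    return 100.0 * sum(covered) / len(protein)


# HYPOTHETICAL toy sequence and peptide lists for illustration.
TOY_PROTEIN = "ACDEFGHIKL"
tryptic_only = ["ACD", "IKL"]
with_extra_peptides = tryptic_only + ["DEF"]

print(f"Tryptic only: {sequence_coverage(TOY_PROTEIN, tryptic_only):.0f}%")
print(f"All peptides: {sequence_coverage(TOY_PROTEIN, with_extra_peptides):.0f}%")
```

Overlapping peptides are counted once per residue, so coverage reflects the extent of the sequence that was observed rather than the number of peptides identified.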
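The results presented here demonstrate the advantages of a UPLC-MSE method for characterization of protein digests. Data-independent MS acquisition in conjunction with high-resolution UPLC and specialized informatics tools allows for identification and relative quantification of low-level protein contaminants, and for assignment of peptides arising from partially tryptic and non-tryptic cleavages.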
UPLC-MSE meets the requirements for a robust and flexible method needed to monitor the safety of biopharmaceutical proteins. The UPLC-MSE method has the potential to expedite recombinant protein drug development.
In conclusion, UPLC-MSE is a powerful tool that provides solutions for protein characterization challenges that are difficult to address with traditional peptide mapping and DDA-based LC-MS/MS methods.
720002809, April 2009