INTRODUCTION
Spike is a trimeric surface glycoprotein from the SARS-CoV-2 virus causing the COVID-19 pandemic (1, 2). Spike’s binding to human angiotensin-converting enzyme 2 (ACE2) is critical for SARS-CoV-2 penetration of a host cell and initiation of infection (3, 4, 5). Spike is the surface antigen and target of all currently available COVID-19 vaccines (6), and improved understanding of spike’s structures and functions is likely to result in more effective SARS-CoV-2 vaccines (7).
Spike trimers engage in several dynamic structural changes related to binding ACE2, cleavage by protease TMPRSS2, unfolding their S2 domains, and finally re-folding to pull viral and target cell membranes into juxtaposition (8). Spike’s structural re-arrangements are required steps during SARS-CoV-2 infection (9) and are also potential mechanistic targets for spike-neutralizing reagents such as the host’s antibodies (10, 11). A well-established methodology to measure changes in (glyco)protein higher-order structure is hydrogen/deuterium exchange mass spectrometry (HDX-MS) (12). Previous publications have described HDX-MS analyses of spike, including its interaction with ACE2 (4, 5, 13, 14) and changes in structural dynamics specific to different variants of concern, including Alpha, Beta, Delta, and Omicron (1, 13, 15, 16).
HDX-MS provides a framework of sample preparation, proteolytic digestion, peptide identification, and deuterium uptake measurement over time to identify changes in (glyco)protein conformation (12, 17). In an HDX-MS analysis comparing two states of the (glyco)protein of interest, changes in a peptide’s amide backbone hydrogen exchange are interpreted as indications of movement or stabilization of α-helices, β-sheets, and other hydrogen bonds contributing to secondary structure (18). Solvent accessibility also plays a role in the deuterium labeling of proteins (19).
An important goal of HDX-MS analysis is maximizing sequence coverage of the (glyco)protein of interest, since gaps could include regions with informative deuterium labeling. For example, localization of a monoclonal antibody’s epitope (20) on an antigenic (glyco)protein such as spike (16, 21) can be challenging if sequence coverage of that (glyco)protein is incomplete (22). A key step in modern HDX-MS analysis for obtaining sequence coverage is on-line proteolytic digestion (23) of the (glyco)protein of interest to generate peptides amenable to liquid chromatography (LC)/MS detection. HDX-MS sample preparation typically includes a low pH (~2.5) quenching step immediately before proteolytic digestion (12), limiting digestion to acid-tolerant proteases such as pepsin (24) or aspergillopepsin (23, 25). These proteases generate overlapping, mostly non-specific (26) peptides that are nonetheless reproducible for a given protein substrate. Previous HDX-MS analyses of spike have shown sequence coverage gaps when using on-line digestion with pepsin (1, 4, 5, 13).
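As an illustration of what sequence coverage and coverage gaps mean in practice, the following minimal sketch computes both from a list of identified peptides. The function name `sequence_coverage` and the representation of peptides as 1-based inclusive (start, end) residue ranges are assumptions made for this example; they are not taken from any HDX-MS software or from the analyses cited above.

```python
def sequence_coverage(protein_length, peptides):
    """Return (fraction of residues covered, list of uncovered gaps).

    peptides: iterable of (start, end) residue ranges, 1-based and inclusive
    (a hypothetical input format chosen for this sketch).
    """
    covered = [False] * protein_length
    for start, end in peptides:
        # Clamp each peptide to the protein boundaries before marking residues.
        for pos in range(max(start, 1), min(end, protein_length) + 1):
            covered[pos - 1] = True

    # Collect contiguous runs of uncovered residues as (start, end) gaps.
    gaps = []
    gap_start = None
    for pos, is_covered in enumerate(covered, start=1):
        if not is_covered and gap_start is None:
            gap_start = pos
        elif is_covered and gap_start is not None:
            gaps.append((gap_start, pos - 1))
            gap_start = None
    if gap_start is not None:
        gaps.append((gap_start, protein_length))

    return sum(covered) / protein_length, gaps


# Toy example: three overlapping peptides on a 20-residue protein.
fraction, gaps = sequence_coverage(20, [(1, 8), (5, 12), (16, 20)])
print(fraction, gaps)
```

In this toy example, residues 13 to 15 are not represented by any peptide, so any deuterium labeling information from that region would be missing from the analysis.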
Specifically considering glycoproteins, sequence coverage gaps in HDX-MS analyses are often associated with N-glycosylation “sequons” (the amino acid sequence Asn-Xaa-Ser/Thr, where Xaa is not Pro and a glycan portion composed of 2 to 11+ hexose subunits is covalently bound to the Asn residue) (27). The reasons for these coverage gaps at N-glycosylation sequons potentially include 1) steric inhibition of the on-line protease’s cleavage by the bulky glycan group covalently bound to an Asn residue (28), resulting in fewer short peptides containing the sequon, and 2) lack of detection of the resulting high-mass glycopeptides during subsequent LC/MS and LC/tandem mass spectrometry (LC/MS/MS) analyses. Although several previous HDX-MS analyses of spike (4, 15) or IgG (29) have detected peptides containing the amino acid sequence Asn-Xaa-Ser/Thr, these are not bona fide “glycopeptides” because the mass(es) and possible identity(ies) of any covalently bound glycan group(s) were not specified. Some recent HDX-MS publications (1, 14) report glycopeptide data from a separate (non-HDX) analysis, but this does not provide information about the deuteration of peptides with covalently bound glycans.
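The sequence motif Asn-Xaa-Ser/Thr (Xaa not Pro) described above can be located in a one-letter amino acid sequence with a short pattern search. The sketch below is illustrative only: the function name `find_sequons` and the toy sequence are assumptions for this example, and a motif match identifies a candidate site only; it says nothing about whether a glycan is actually bound or which glycan it is.

```python
import re

# Zero-width lookahead so that overlapping motifs (e.g. "NNST") are all found.
# N = Asn, [^P] = any residue except Pro, [ST] = Ser or Thr.
SEQUON_PATTERN = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of Asn residues that begin an N-X-S/T motif."""
    return [match.start() + 1 for match in SEQUON_PATTERN.finditer(sequence.upper())]


# Toy sequence (not a real spike fragment): "NPS" at position 5 is excluded
# because Xaa is Pro; "NNS" and "NST" overlap and are both reported.
print(find_sequons("ANGTNPSNNST"))
```

Peptides spanning any of the reported positions are the ones for which glycan mass and identity would need to be assigned before they could be treated as glycopeptides in the sense used here.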
Glycan identity is an important aspect of glycoprotein analysis because microheterogeneity (the cohort of all the different glycan structures bound to a particular N-glycosylation sequon) (30) influences glycoprotein structures and functions (31). The impact of microheterogeneity on HDX-MS analyses of glycoproteins is not presently known because detecting and identifying glycopeptides with their glycans still covalently bound requires advanced MS/MS methods and appropriate data processing for detection and assignment of glycans (32, 33, 34, 35).
SARS-CoV-2 spike glycoprotein has 22 N-glycosylation sequons per monomer, and several publications describe spike’s microheterogeneity (32, 36, 37, 38). Spike’s high level of glycosylation has caused significant gaps in sequence coverage and incomplete HDX-MS data in previous studies (1, 4, 5, 13). During our HDX-MS analyses of spike, we applied a previously described method for detecting glycopeptides (39, 40) to the deuterium-labeled D614G variant (41). We believe this is the first report directly measuring the deuteration of peptides with N-glycosylation sequons and covalently bound glycan groups to determine the impact of microheterogeneity on the HDX-MS dynamics of SARS-CoV-2 spike. Heat treatment of spike was used to significantly change protein structure and demonstrate the utility of deuterated glycopeptide data to improve HDX-MS conformational analysis of glycoproteins.