Protein glycosylation is increasingly recognized as a common protein modification across bacterial species. Within the Neisseria genus, O-linked protein glycosylation is conserved, yet closely related Neisseria species express O-oligosaccharyltransferases (PglOs) with distinct targeting activities. In this work, we explore the targeting capacity of different PglOs using Field Asymmetric Waveform Ion Mobility Spectrometry (FAIMS) fractionation and Data-Independent Acquisition (DIA) to characterize the impact of changes in glycosylation on the proteome of N. gonorrhoeae. We demonstrate that FAIMS expands the known glycoproteome of wild-type N. gonorrhoeae MS11 and enables differences in glycosylation to be assessed across strains expressing different pglO allelic chimeras with unique substrate targeting activities. Combining glycoproteomic insights with DIA proteomics, we demonstrate that alterations within pglO alleles have widespread impacts on the proteome of N. gonorrhoeae. Examination of peptides known to be targeted by glycosylation using DIA analysis supports that alterations in glycosylation occupancy occur independently of changes in protein levels and that the occupancy of glycosylation is generally low on most glycoproteins. This work thus expands our understanding of the N. gonorrhoeae glycoproteome and the roles that pglO allelic variation may play in governing genus-level protein glycosylation.
State-of-the-art mass spectrometers combined with modern bioinformatics algorithms for peptide-to-spectrum matching (PSM) with robust statistical scoring allow more variable features (i.e., post-translational modifications) to be reliably identified from (tandem) mass spectrometry data, often without the need for biochemical enrichment. Semi-specific proteome searches, which enforce a theoretical enzymatic digestion at only the N- or C-terminal end of a peptide, allow the identification of native protein termini or those arising from endogenous proteolytic activity (also referred to as ‘neo-N-termini’ analysis or ‘N-terminomics’). Nevertheless, deriving biological meaning from these search outputs can be challenging in terms of data mining and analysis. Thus, we introduce Fragterminomics, a data analysis approach for (1) annotation of peptides according to their enzymatic cleavage specificity, (2) differential abundance and enrichment analysis of N-terminal sequence patterns, (3) visualization of neo-N-termini location, and (4) mapping of neo-N-termini to known protein processing features. We illustrate the use of Fragterminomics by applying it to tandem mass tag (TMT)-based proteomics data from a mouse model of polycystic kidney disease and assess the value of semi-specific searches for biological interpretation of cleavage events and the variable contribution of proteolytic products to general protein abundance. The Fragterminomics approach and example data are available as an R package at https://github.com/MiguelCos/Fragterminomics.
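As an illustration of step (1), annotating peptides by enzymatic cleavage specificity, the following Python sketch classifies a peptide by checking whether each of its termini coincides with a trypsin cleavage site or a protein terminus. It is a minimal sketch under stated assumptions, not the Fragterminomics implementation (which is an R package whose internals are not described here): the function names, the label strings, and the simplified trypsin rule (cleavage after K or R, ignoring the proline exception and initiator methionine removal) are all hypothetical choices made for this example.

```python
def is_expected_terminus(protein: str, pos: int) -> bool:
    """Return True if position `pos` is a protein terminus or lies directly
    after K/R (simplified trypsin rule; an assumption of this sketch)."""
    if pos == 0 or pos == len(protein):
        return True
    return protein[pos - 1] in "KR"


def annotate_specificity(protein: str, peptide: str) -> str:
    """Label a peptide by the enzymatic specificity of its two termini.

    Labels (hypothetical names):
      "specific"    both termini match the enzyme rule
      "semi_Nterm"  N-terminus is non-enzymatic (candidate neo-N-terminus)
      "semi_Cterm"  C-terminus is non-enzymatic
      "unspecific"  neither terminus matches
    Only the first occurrence of the peptide in the protein is considered.
    """
    start = protein.find(peptide)
    if start == -1:
        raise ValueError("peptide not found in protein sequence")
    end = start + len(peptide)
    n_ok = is_expected_terminus(protein, start)
    c_ok = is_expected_terminus(protein, end)
    if n_ok and c_ok:
        return "specific"
    if c_ok:
        return "semi_Nterm"
    if n_ok:
        return "semi_Cterm"
    return "unspecific"


# Toy protein sequence invented for illustration only.
protein = "MAGKTPEPRLLSDKEND"
for pep in ("TPEPR", "PEPR", "TPEP", "PEP"):
    print(pep, annotate_specificity(protein, pep))
```

In this toy case, a peptide labelled "semi_Nterm" would be a candidate neo-N-terminus, since its N-terminal end cannot be explained by the simplified trypsin rule.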
The tooth is an ideal model for developmental studies, as its formation involves epithelial-mesenchymal transition and cell differentiation. The essential factors and pathways identified in tooth development will help to understand the natural developmental process and the malformations of mineralized tissues such as the skeleton. Time-dependent proteomic changes were investigated by proteomic analysis of healthy human embryonic molars from the cap stage to the early bell stage. A total of 713 differentially expressed proteins (DEPs) with five temporal expression patterns were identified. Weighted gene co-expression network analysis (WGCNA) screened 24 potential driver proteins of tooth development: CHID1, RAP1GDS1, HAPLN3, AKAP12, WLS, GSS, DDAH1, CLSTN1, AFM, RBP1, AGO1, SET, HMGB2, HMGB1, ANP32A, SPON1, FREM1, C8B, PRPS2, FCHO2, PPP1R12A, GPALPP1, U2AF2 and RCC2. The hub proteins in the different temporal expression patterns were extracted. The potential cellular sources and the temporal expression patterns at the transcriptomic level were explored using single-cell RNA sequencing (scRNA-seq). This study provides invaluable resources for mechanistic studies of human embryonic epithelial and mesenchymal cell differentiation and tooth development.
The mzIdentML file format, originally developed by the Proteomics Standards Initiative in 2011, is the open XML data standard for peptide and protein identification results from mass spectrometry. We present mzIdentML version 1.3.0, which introduces new functionality and support for additional use cases. First, it adds a new mechanism for encoding identifications based on multiple spectra. Furthermore, the main mzIdentML specification document can now be supplemented by extension documents, which provide further guidance for encoding specific use cases for different proteomics subfields. One extension document has been added, covering additional use cases for the encoding of crosslinked peptide identifications. The ability to add extension documents facilitates keeping the mzIdentML standard up to date with advances in the proteomics field, without having to change the main specification document. The crosslinking extension document provides further explanation of the crosslinking use cases already supported in mzIdentML version 1.2.0, and provides support for encoding additional scenarios that are critical to reflect developments in the crosslinking field and facilitate its integration in structural biology. These are: (i) support for cleavable crosslinkers, (ii) support for internally linked peptides, (iii) support for noncovalently associated peptides, and (iv) improved support for encoding scores and the corresponding thresholds.
Seeds are an important part of plants, ensuring the continuation of plants’ life and providing nutrient reserves for humans and animals. Seed development is controlled by the interplay of several physiological processes. We applied label-free proteomics to round and wrinkled peas using seeds sampled at five growth stages (4 days after anthesis (DAA), 7 DAA, 12 DAA, 15 DAA, and maturity). Phenotypic results indicated that wrinkled peas had a lower starch concentration than round peas (29.5% vs. 46.6–55.1%). A total of 4,126 high-confidence proteins were detected, with 22–26% shared across all sampling times within an entry. Early seed growth stages were characterized by more unique proteins than maturity. Two-way ANOVA revealed 1,685 proteins that differed significantly among samples, of which 722 proteins were assigned to 29 functional classes. The four major classes (each comprising over 50 proteins) were protein biosynthesis, protein homeostasis, enzymes, and carbohydrate metabolism. Of the two types of comparisons (time-point and entry-wise), time-point comparisons yielded more differentially abundant proteins (596 proteins in total). Different protein classes exhibited different patterns of change during seed development. For example, cell division-related proteins were abundant early in seed development, whereas storage proteins were abundant later in seed development (especially after 12 DAA). Compared to the round pea entries, the wrinkled entry had a significantly lower abundance of starch branching enzymes, which are involved in the biosynthesis of amylopectin in starch. In conclusion, the results of this study provide valuable information to improve our understanding of seed development and form the basis for further studies.
Immunotherapy harnesses neoantigens encoded within the human genome, but their therapeutic potential is hampered by low expression, which may be controlled by the Nonsense-Mediated Decay (NMD) pathway. This study investigates the impact of UPF1 knockdown on the expression of non-canonical/mutant proteins, employing proteogenomics to explore the role of UPF1 within the NMD pathway. Additionally, we conducted a comprehensive pan-cancer analysis of UPF1 expression and evaluated UPF1 expression in Triple-Negative Breast Cancer (TNBC) tissue in vivo. Our findings reveal that UPF1 knockdown leads to increased transcription of non-canonical/mutant proteins, particularly those originating from retained introns, pseudogenes, long non-coding RNAs, and unannotated transcript biotypes. Moreover, our analysis demonstrates elevated UPF1 expression in various cancer types, with notably heightened protein levels in patient-derived TNBC tumours compared to adjacent tissues. This study elucidates the role of UPF1 in mitigating transcriptional noise by degrading transcripts encoding non-canonical/mutant proteins. Intriguingly, we observe an upregulation of the NMD pathway in cancer, potentially acting as a “neoantigen-masking” mechanism that suppresses non-canonical/mutant protein expression. Targeting this mechanism may reveal a new spectrum of neoantigens accessible to the antigen presentation pathway. Our findings provide a strong foundation for the development of therapeutic strategies aimed at targeting UPF1 or modulating the NMD pathway.
Endometrial cancer is the most prevalent gynaecological cancer globally. Obesity and metabolic diseases are key aetiological factors, increasingly among younger women. Early diagnosis and improved treatment decisions are crucial for these women, whose outcomes could be improved by discovering new biomarkers. We took a new approach to extracellular vesicle (EV) biomarker discovery: profiling the proteome of enriched EVs isolated directly from frozen biobanked endometrial cancers. Nine tissue pools, each generating collagenase-digested tissue and matched small EVs, were analysed using label-free proteomics. Three clinical subgroups (Endometrioid low BMI (body mass index), Endometrioid high BMI, and Serous, irrespective of BMI) were compared to identify shared secreted proteins, proteins associated with histological subtype, and proteins related to BMI. EVs were enriched for common EV markers and large secreted proteins. Cell lysates were enriched in mitochondrial and blood proteins. EV protein profiles were most different between the high BMI subgroup and the others, highlighting a significant influence of comorbidities on the intra-tumoural EV secretome. Strikingly, proteins differentially abundant between subgroups in tissues were not also differential in the matched EVs. This work has identified secreted proteins implicated in the complex pathophysiology of endometrial cancer and pinpointed candidate biomarkers for diagnosis.
The attachment of trophectodermal cells (outer layer of the embryo) to endometrial cells and their subsequent invasion of the underlying matrix are critical stages of embryo implantation during successful pregnancy establishment. Extracellular vesicles (EVs) have been implicated in embryo-maternal crosstalk and are capable of reprogramming endometrial cells towards a pro-implantation signature and phenotype. However, challenges associated with EV yield and direct loading of biomolecules limit their therapeutic potential. We have previously established the generation of cell-derived nanovesicles (NVs) from human trophectodermal cells (hTSCs) and their capacity to reprogram endometrial cells to enhance adhesion and blastocyst outgrowth. Here, we employed a rapid NV loading strategy to encapsulate potent implantation molecules such as HB-EGF (NVHBEGF). We show that these loaded NVs elicit EGFR-mediated effects in recipient endometrial cells, activating kinase phosphorylation sites that modulate their activity (AKT S124/129, MAPK1 T185/Y187), and downstream signalling pathways and processes (AKT signal transduction, GTPase activity). Importantly, they enhanced target cell attachment and invasion. The phosphoproteomics and proteomics approach highlights NVHBEGF-mediated short-term signalling patterns and long-term reprogramming capabilities on endometrial cells, which functionally enhance trophectodermal-endometrial interactions. This proof-of-concept study demonstrates the feasibility of enhancing the potency of NVs in the context of embryo attachment and establishment.
For the ex-situ conservation of giant pandas, both collecting and preserving semen are important methods. Seminal plasma is rich in nutrients and bioactive substances, such as proteins, carbohydrates, lipids, amino acids, and hormones, which play an important role in the reproduction and reproductive health of the species. This is the first study to analyze the seminal plasma proteins of giant pandas through proteomics, and it identified 1125 proteins. These proteins are related to protein turnover, translation, and metabolism. The seminal plasma proteins of giant pandas were then compared to those of humans, pigs and sheep, with many unique proteins found in giant panda samples. Among these proteins, the WD40 repeat-containing proteins have been identified and implicated in sperm function and fertility. Understanding the composition and function of proteins in the giant panda seminal plasma proteome can provide valuable insights into the reproductive biology of giant pandas and help develop strategies to improve their reproductive success in captivity, which is essential for giant panda conservation.
Cancer remains one of the most complex and challenging diseases affecting humankind. To address the need for a personalized treatment approach for particularly complex tumor cases, molecular tumor boards (MTBs) have been initiated. MTBs are interdisciplinary teams that perform in-depth molecular diagnostics to jointly advise on the best therapeutic strategy. Current molecular diagnostics are routinely performed on the transcriptomic and genomic levels, aiming for the identification of tumor-driving mutations. However, these approaches can only partially capture the actual phenotype as well as the molecular key players of tumor growth and progression. Thus, direct investigation of the expressed proteins and activated signaling pathways provides complementary information on the tumor-driving molecular characteristics of the tissue. Technological advancements in mass-spectrometry-based proteomics enable the robust, rapid, and sensitive detection of thousands of proteins in minimal sample amounts, paving the way for clinical proteomics and the probing of oncogenic signaling activity. Therefore, proteomics is currently being integrated into molecular diagnostics within MTBs and holds promising potential in aiding tumor classification and identifying personalized treatment strategies. This review gives an introduction to MTBs, describes current state-of-the-art clinical proteomics and its potential in precision oncology, and highlights the benefits of multi-omic data integration.
High-throughput proteomics is an effective methodology for identifying a variety of virulence factors of pathogens. Proteomic data are commonly evaluated against annotated sequences present in publicly available database repositories. A proteogenomic approach can be used if annotated sequences are not available or to identify novel proteins/peptides. However, a single genome is commonly utilized in proteomic and proteogenomic analyses. We pose the question of whether utilizing a number of different genome assemblies of a bacterial pathogen would be beneficial. Here, we used previously obtained shotgun label-free nano-LC‒MS/MS data of the exoprotein fraction of four reference ERIC I–IV genotypes of Paenibacillus larvae and evaluated them against publicly available annotated sequences (from NCBI-protein, RefSeq, UniProt) together with an array of protein sequences generated using a six-frame direct translation of 15 genomic assemblies available in GenBank. The wide search through 18 database components reliably identified 453 protein hits. UpSet analysis categorized the hits into 50 groups based on which databases successfully identified each protein. The relatively high variability in successful identification among the genome assemblies facilitated the mining of markers based on uniqueness and contrasting results prior to considering proteome differences. Data evaluation provided novel and interesting markers that can be studied further.
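The six-frame direct translation mentioned above can be sketched in Python as follows. This is a minimal illustration, assuming the standard amino acid assignments (which the bacterial genetic code, NCBI translation table 11, shares with the standard code apart from alternative start codons); the abstract does not state which tool or settings were actually used, and the function names and frame labels here are hypothetical.

```python
from itertools import product

# Standard codon table, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons from position 0; stop codons become '*',
    codons with ambiguous bases become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna: str) -> dict:
    """Translate all three reading frames on both strands.

    Keys are hypothetical frame labels: '+1', '+2', '+3', '-1', '-2', '-3'.
    """
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


# Toy sequence invented for illustration only.
for frame, protein in six_frame_translation("ATGGCCAAATAG").items():
    print(frame, protein)
```

In a proteogenomic search database, the translated frames would typically be split at stop codons and filtered by minimum length before use; that step is omitted here to keep the sketch short.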
High-grade gliomas (HGGs) are the most malignant and difficult-to-treat brain tumors. Despite several studies on glioma pathobiology, there is no comparative proteomics study of high-grade and low-grade gliomas that uncovers the mechanism behind the aggressive mesenchymal behaviour of HGGs. In this study, tissue samples of high-grade and low-grade gliomas were processed for label-free quantification (LFQ) using HR-LC MS/MS. The analysis identified 140 differentially expressed proteins; GSEA and protein-protein interaction analysis showed over-expression of pathways such as ECM remodelling, focal adhesion, EMT and glycan biosynthesis in HGG. The key proteins were validated using a multiple reaction monitoring experiment. ECM glycoproteins, including fibronectin, fibrinogens, collagens and vitronectin, along with mesenchymal markers such as vimentin and TGF-β, were over-expressed in HGGs. The over-expression of oligosaccharyltransferase (OST) in HGG indicates its role in enhanced expression of glycoproteins. In-silico molecular docking with catalytic subunits of OST identified two small-molecule inhibitors, irinotecan and entrectinib, as potential candidates to target OST. We propose that OST plays a major role in tumor metastasis by promoting EMT and could be used as a potential target to suppress glioma metastasis. Finally, the proteins identified in this study need further clinical research to validate their prognostic value as protein markers.
Enzymatic catalysis is one of the fundamental processes that drive the dynamic landscape of post-translational modifications (PTMs), expanding the structural and functional diversity of proteins. Here, we assessed enzyme specificity using a top-down ion mobility spectrometry (IMS) and tandem mass spectrometry (MS/MS) workflow. We successfully applied trapped IMS (TIMS) to investigate site-specific N-ε-acetylation of lysine residues of full-length histone H4 catalyzed by histone lysine acetyltransferase KAT8. We demonstrate that KAT8 exhibits a preference for N-ε-acetylation of residue K16, while also installing N-ε-acetyl groups on residues K5 and K8 as the first degree of acetylation. Achieving TIMS resolving power values of up to 300, we fully separated mono-acetylated regioisomers (H4K5ac, H4K8ac, and H4K16ac). Each of these regioisomers produces unique MS/MS fragment ions, enabling estimation of their individual mobility distributions and the exact localization of the N-ε-acetylation sites. This study highlights the potential of top-down TIMS-MS/MS for conducting enzymatic assays at the intact protein level and, more generally, for separation and identification of isomeric proteoforms and precise PTM localization.
MALDI mass spectrometry imaging (MALDI imaging) is uniquely suited to advance cancer research by measuring the spatial distribution of endogenous and exogenous molecules directly from thin tissue sections. These molecular maps provide valuable insights into various aspects of basic and translational cancer research, including spatial tumor and tumor microenvironment biology, pharmacological interventions, and patient stratification. However, despite these advantages, the utilization of MALDI imaging in studying rare cancers, which comprise approximately 20% of all cancers, remains limited. Rare cancers pose unique challenges in medical research, resulting in understudied entities with suboptimal management and outcomes. In this review, we explore the value of MALDI imaging in sarcoma, as an example of a highly heterogeneous and challenging rare cancer. We summarize existing MALDI imaging studies in sarcoma and outline potential future applications. In addition, we address the specific challenges encountered when applying MALDI imaging to rare cancers and propose solutions, including the utilization of formalin-fixed paraffin-embedded tissues, multi-site studies, implementation of multiplexed experiments, and considerations for data sharing practices. Through this review, we aim to inspire collaboration between MALDI imaging researchers and clinical colleagues to deploy the unique capabilities of MALDI imaging in rare cancer research, particularly in the context of sarcoma.
Changes in the structure of biological macromolecules, such as RNA and protein, have an important impact on biological functions and can be important determinants of disease pathogenesis and treatment. Some genetic variations, including copy number variation and single nucleotide variation, can lead to changes in biological function and increased susceptibility to certain diseases by changing the structure of biological macromolecules. Here, we review the progress of research on the effects of genetic variation on the structure of macromolecules, including RNAs and proteins, several typical methods and common tools, and the effects on several diseases. An online resource (http://www.onethird-lab.com/gems/) was also built to support convenient retrieval of common tools. Finally, the challenges and future development of effect prediction are discussed.
The group 2 σ factor for RNA polymerase, SigE, plays an important role in regulating central carbon metabolism in cyanobacteria. However, the regulation of these pathways by SigE at the proteome level remains unknown. Using a sigE-deficient strain (ΔsigE) of Synechocystis sp. PCC 6803 and quantitative proteomics, we found that SigE depletion induces differential protein expression for sugar catabolic pathways including glycolysis, the oxidative pentose phosphate (OPP) pathway, and glycogen catabolism. Two glycogen debranching enzyme homologues, Slr1857 and Slr0237, were found to be differentially expressed in ΔsigE. Glycogen determination indicated that Δslr0237 accumulated glycogen under photomixotrophic conditions but was unable to utilize these reserves in the dark, whereas Δslr1857 accumulated and utilized glycogen in a similar way to the WT strain under the same conditions. These results suggest that Slr0237 plays the major role as the glycogen debranching enzyme in Synechocystis. To our knowledge, this is the first study to report the functional difference between two glycogen debranching enzymes in Synechocystis, and the research highlights the intricate regulation of glycogen breakdown.