PROTEINS: Structure, Function, and Bioinformatics - Authorea


A sequence-based foldability score combined with AlphaFold2 predictions to disentangl...

Apolline Bruley

and 3 more

August 02, 2022

Order and disorder govern protein functions, but there is a great diversity in disorder, from regions that are, and stay, fully disordered to conditional order. This diversity is still difficult to decipher even though it is encoded in the amino acid sequences. Here, we developed an analytic Python package, named pyHCA, to estimate the foldability of a protein segment from its amino acid sequence alone, based on a measure of its density in regular secondary structures associated with hydrophobic clusters, as defined by the Hydrophobic Cluster Analysis (HCA) approach. The tool was designed by optimizing the separation between foldable segments from databases of disorder (DisProt) and order (SCOPe (soluble domains) and OPM (transmembrane domains)). It allows the ratio between order, embodied by regular secondary structures (either participating in the hydrophobic core of well-folded 3D structures or conditionally formed in intrinsically disordered regions), and disorder to be specified. We illustrated the relevance of pyHCA with several examples and applied it to the sequences of the proteomes of 21 species ranging from prokaryotes and archaea to unicellular and multicellular eukaryotes, for which structure models are provided in the AlphaFold2 databases. Cases of low-confidence scores related to disorder were distinguished from those of sequences that we identified as foldable but are still excluded from accurate modeling by AlphaFold2 due to a lack of sequence homologs or to compositional biases. Overall, our approach is complementary to AlphaFold2, providing guides to map structural innovations through evolutionary processes, at proteome and gene scales.

Characterisation of a transitionally occupied state of domain 1.1 of σA factor of RNA...

Dávid Tužinčin

and 5 more

July 28, 2022

σ factors are essential parts of bacterial RNA polymerase (RNAP) as they allow it to recognize promoter sequences and initiate transcription. Domain 1.1 of vegetative σ factors occupies the primary channel of RNAP and also prevents binding of the σ factor to promoter DNA alone. Here, we show that domain 1.1 of Bacillus subtilis σA exists in two structurally distinct variants in dynamic equilibrium. The major conformation at room temperature is represented by a previously reported well-folded structure solved by nuclear magnetic resonance (NMR), but 4% of the protein molecules are present in a less thermodynamically favorable state. We show that this population increases with temperature and may represent as much as 20% at 43.5 °C. We characterized the minor state of domain 1.1 using specialized NMR methods. We found that, in contrast to the major state, the detected minor state is partially unfolded. Its propensity to form secondary structure elements is especially decreased for the first and third α helices, while the second α helix and the β strand close to the C-terminus are more stable. In summary, this study reveals conformational dynamics of domain 1.1 and provides a basis for studies of its interaction with RNAP and effects on transcription regulation.
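
As a rough illustration of what the reported populations imply, the sketch below converts a minor-state fraction into a free energy difference under a simple two-state equilibrium, ΔG = −RT ln(p_minor / p_major). This is not the authors' analysis: the function name is hypothetical, and taking "room temperature" as 25 °C is an assumption made here.

```python
import math

R_KJ = 8.314462618e-3  # molar gas constant in kJ/(mol*K)


def two_state_delta_g(minor_fraction, temperature_k):
    """Free energy of the minor state relative to the major state, in kJ/mol.

    Assumes only two states in equilibrium (hypothetical helper for illustration).
    """
    if not 0.0 < minor_fraction < 1.0:
        raise ValueError("minor_fraction must lie strictly between 0 and 1")
    k_eq = minor_fraction / (1.0 - minor_fraction)  # [minor] / [major]
    return -R_KJ * temperature_k * math.log(k_eq)


# 4% minor state; 25 degrees C is an ASSUMED value for "room temperature".
dg_room = two_state_delta_g(0.04, 298.15)
# 20% minor state at 43.5 degrees C, as stated in the abstract.
dg_warm = two_state_delta_g(0.20, 273.15 + 43.5)

print(f"dG at assumed room temperature: {dg_room:.2f} kJ/mol")
print(f"dG at 43.5 C: {dg_warm:.2f} kJ/mol")
```

Under these assumptions the minor state lies several kJ/mol above the major state at room temperature, and the gap narrows at the higher temperature, consistent with the population shift described above.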

Counter-intuitive enhancement of degradation of solid plastic through engineering of...

Arpita Mrigwani

and 2 more

July 19, 2022

Degradation of solid polyethylene terephthalate (PET) by leaf branch compost cutinase (LCC) produces various PET-derived degradation intermediates (DIs), in addition to terephthalic acid (TPA), which is the recyclable terminal product of all PET degradation. Although DIs can also be converted into TPA, in solution, by LCC, the TPA that is obtained through enzymatic degradation of PET, in practice, is always contaminated by DIs. Here, we demonstrate that the primary reason for non-degradation of DIs into TPA in solution is the efficient binding of LCC onto the surface of solid PET. Although such binding enhances the degradation of solid PET, it depletes the surrounding solution of enzyme that could otherwise have converted DIs into TPA. To retain a sub-population of enzyme in solution that would mainly degrade DIs, we introduced mutations to reduce the hydrophobicity of areas surrounding LCC’s active site, with the express intention of reducing LCC’s binding to solid PET. Despite the consequent reduction in invasion and degradation of solid PET, overall levels of production of TPA were ~3.6-fold higher, due to the partitioning of enzyme between solid PET and the surrounding solution, and the consequent heightened production of TPA from DIs. Further, synergy between such mutated LCC (F125L/F243I LCC) and wild-type LCC resulted in even higher yields, and TPA of nearly 100% purity.

Unveiling the biological interface of protein complexes by mass spectrometry-coupled...

Goeun Shin

and 1 more

July 18, 2022

Most biomolecules become functional and bioactive by forming protein complexes through interaction with ligands that are diverse in size, shape, and physicochemical properties. In the complex biological milieu, the interaction is ligand-specific, driven by molecular sensing and recognition of a binding interface localized within a protein structure. Mapping interfaces of protein complexes is a highly sought area of research as it delivers fundamental insights into proteomes and pathology and hence strategies for therapeutics. While X-ray crystallography and electron microscopy still serve as the gold standard for structural elucidation of protein complexes, their artificial and static analytic nature often results in a non-native interface that otherwise might be negligible or non-existent in a biological environment. In recent years, the mass spectrometry-coupled approaches of chemical crosslinking (CLMS) and hydrogen-deuterium exchange (HDMS) have become valuable analytic complements to traditional techniques. These methods explicitly identify hot residues and motifs embedded in binding interfaces, in particular those for which the interaction is predominantly dynamic, transient, and/or caused by an intrinsically disordered domain. Here we review the principal role of CLMS and HDMS in protein structural biology with a particular emphasis on the contribution of recent examples to exploring biological interfaces. In addition, we describe recent studies that utilized these methods to expand our understanding of protein complex formation and related biological processes and to increase the probability of structure-based drug design.

Understanding the helical stability of charged peptides

Nitin Kumar Singh

and 2 more

July 06, 2022

Cationic helical peptides play a crucial role in applications such as anti-microbial and anti-cancer activity. The activity of these peptides directly correlates with their helicity. In this study, we have performed extensive all-atom molecular dynamics simulations of 25 lysine-leucine co-polypeptide sequences of varying charge density (λ) and patterns. Our findings showed that an increase in the charge density on the peptide leads to a gradual decrease in the helicity up to a critical charge density λc. Beyond λc, a complete helix-to-coil transition was observed. The decrease in the helicity correlated with an increased number of water molecules in the first solvation shell, a larger solvent-exposed surface area, and a higher radius of gyration of the peptide.

Tau protein misfolding and aggregation induced by abnormal N-glycosylation: Insights...

Alen Mathew T

and 4 more

July 04, 2022

Various post-translational modifications such as hyperphosphorylation, O-GlcNAcylation, and acetylation have been reported to induce abnormal folding in tau protein. Recent in vitro studies revealed the possible involvement of N-glycosylation of tau protein in abnormal folding and tau aggregation. Hence, in this study, we performed microsecond-long all-atom molecular dynamics simulations to gain insights into the effects of N-glycosylation on the Asn-359 residue, which forms part of the microtubule binding region. Trajectory analysis of the simulations, coupled with essential dynamics and free energy landscape analysis, suggested that tau in its N-glycosylated form tends to exist in a largely folded conformation with high beta sheet propensity, as compared to unmodified tau, which exists in a largely extended form with very low beta sheet propensity. Residue interaction network analysis of the lowest energy conformations further revealed that Phe378 and Lys353 are the functionally important residues in the peptide that helped initiate the folding process, and that Phe378, Lys347 and Lys370 helped maintain the stability of the protein in the folded state.

Interplay between hydrogen and chalcogen bond in cysteine

Oliviero Carugo

June 29, 2022

Protein structures are stabilized by several types of chemical interactions between amino acids, which can compete with each other. This is the case for the chalcogen and hydrogen bonds formed by the thiol group of cysteine, which can form three hydrogen bonds with one hydrogen acceptor and two hydrogen donors, and a chalcogen bond with a nucleophile along the extension of the C-S bond. A survey of the Protein Data Bank shows that hydrogen bonds are about 40-50 times more common than chalcogen bonds, suggesting that they are stronger and, consequently, prevail, though not always. It is also observed that a thiol group that forms a chalcogen bond is frequently also involved, as a hydrogen donor, in a hydrogen bond.

Surveying non-visual arrestins reveals allosteric interactions between functional sit...

James M. Seckler

and 3 more

May 23, 2022

Arrestins are important scaffolding proteins that are expressed in all vertebrate animals. They regulate cell signaling events upon binding to active G-protein coupled receptors (GPCRs) and trigger endocytosis of active GPCRs. While many of the functional sites on arrestins have been characterized, the question of how these sites interact is unanswered. We used anisotropic network modelling (ANM) together with our covariance complement techniques to survey all of the available structures of the non-visual arrestins and map how structural changes and protein binding affect their structural dynamics. We found that activation and clathrin binding have a marked effect on arrestin dynamics, and that these changes in dynamics are localized to a small number of distant functional sites. These sites include α-helix 1, the lariat loop, the nuclear localization domain, and the C-domain β-sheets on the C-loop side. Our techniques suggest that clathrin binding and/or GPCR activation of arrestin perturb the dynamics of these sites independently of structural changes.

Insight into Substrate-assisted Catalytic Mechanism and Stereoselectivity of Bifuncti...

Ting Shi

and 3 more

May 17, 2022

The inversion from L- to D-stereochemistry endows peptides with improved bioactivity and enhanced resistance to many proteases and peptidases. To strengthen the biostability and bioavailability of peptide drugs, enzymatic epimerization has become an important way to incorporate D-amino acids into peptide backbones. Recently, the bifunctional thioesterase NocTE, which is responsible for the epimerization and hydrolysis of the C-terminal (p-hydroxyphenyl)glycine residue of the β-lactam antibiotic nocardicin A, was reported to generate D-diastereomers exclusively. Unlike other epimerases, NocTE exhibits unique stereochemical selectivity. Herein, we investigated the catalytic mechanism of NocTE via molecular dynamics (MD) simulations and quantum mechanical/molecular mechanics (QM/MM) calculations. Through structural analyses, two key water molecules around the reaction site were found to serve as proton mediators in epimerization. These structural characteristics inspired us to propose a substrate-assisted mechanism for the epimerization, in which multi-step proton transfers are mediated by water molecules and the β-lactam ring; the free energy barrier was calculated to be 20.3 kcal/mol. Subsequent hydrolysis of the D-configured substrate was energetically feasible, with an energy barrier of 14.3 kcal/mol. By comparison, the energy barrier for the direct hydrolysis of the L-configured substrate was calculated to be 24.0 kcal/mol. Our study provides mechanistic insights into the catalytic activities of the bifunctional thioesterase NocTE, uncovers more clues to the molecular basis of its stereochemical selectivity, and paves the way for the directed biosynthesis of novel peptide drugs with various stereostructural characteristics by rational enzyme design.
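
To give a feel for what the three reported barriers mean kinetically, the sketch below applies the Eyring equation from transition state theory, k = (k_B T / h) exp(−ΔG‡ / RT). This is an illustration only, not a calculation from the study: the temperature of 298.15 K, a transmission coefficient of 1, and the function name are all assumptions made here.

```python
import math

K_B = 1.380649e-23        # Boltzmann constant, J/K
H = 6.62607015e-34        # Planck constant, J*s
R_KCAL = 1.987204259e-3   # molar gas constant, kcal/(mol*K)


def eyring_rate(barrier_kcal, temperature_k=298.15):
    """Rate constant (1/s) from a free energy barrier via the Eyring equation.

    Assumes a transmission coefficient of 1 (hypothetical helper for illustration).
    """
    prefactor = K_B * temperature_k / H
    return prefactor * math.exp(-barrier_kcal / (R_KCAL * temperature_k))


# Barriers quoted in the abstract (kcal/mol); 298.15 K is an ASSUMED temperature.
k_epimerization = eyring_rate(20.3)
k_hydrolysis_d = eyring_rate(14.3)
k_hydrolysis_l = eyring_rate(24.0)

# How strongly epimerization is favored over direct hydrolysis of the L-substrate.
preference = k_epimerization / k_hydrolysis_l

print(f"epimerization:      {k_epimerization:.3e} 1/s")
print(f"hydrolysis (D):     {k_hydrolysis_d:.3e} 1/s")
print(f"hydrolysis (L):     {k_hydrolysis_l:.3e} 1/s")
print(f"epimerization / L-hydrolysis: {preference:.0f}")
```

Under these assumptions, the 3.7 kcal/mol gap between epimerization and direct L-substrate hydrolysis corresponds to a rate preference of a few hundred fold, which is one way to read the stereoselectivity argument above.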

Mutual information analysis of mutation, nonlinearity and triple interactions in prot...

Burak Erman

May 14, 2022

Mutations are the cause of several diseases as well as the underlying force of evolution. A thorough understanding of their biophysical consequences is essential. We present a computational framework for evaluating different levels of mutual information (MI) and its dependence on mutation. We used molecular dynamics trajectories of the third PDZ domain and its different mutations. MI calculated from these trajectories shows that: (i) the multivariate Gaussian distribution of joint probabilities characterizes the MI between residue pairs with sufficient accuracy. Nonlinearities in joint probabilities calculated by tensor Hermite polynomials up to the fifth order contribute insignificantly. (ii) Changes in MI between residue pairs show the characteristic patterns resulting from specific mutations. (iii) Triple correlations are characterized by evaluating MI between triplets of residues; certain triplets are strongly affected by mutation. (iv) The susceptibility of residues to perturbation is obtained from MI and discussed in terms of linear response theory.
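
Point (i) rests on the closed form for Gaussian mutual information. As a minimal sketch, and a simplification of the residue-pair case (which involves 3D fluctuation vectors rather than scalars), the MI between two jointly Gaussian scalar variables with correlation ρ is −½ ln(1 − ρ²). The helper names and the synthetic data below are hypothetical and are not taken from the paper.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def gaussian_mi(rho):
    """Mutual information (nats) of two jointly Gaussian scalars with correlation rho."""
    if abs(rho) >= 1.0:
        raise ValueError("MI diverges for perfectly correlated variables")
    return -0.5 * math.log(1.0 - rho * rho)


# Synthetic stand-in for two fluctuation time series (NOT simulation data).
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(5000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(5000)]
y = [0.8 * a + 0.6 * b for a, b in zip(x, noise)]  # built to correlate with x at about 0.8

mi_coupled = gaussian_mi(pearson(x, y))
mi_independent = gaussian_mi(pearson(x, noise))

print(f"MI (coupled pair):     {mi_coupled:.3f} nats")
print(f"MI (independent pair): {mi_independent:.5f} nats")
```

The scalar formula is the one-dimensional special case of the general Gaussian expression ½ ln(det Σ_x det Σ_y / det Σ_xy); extending the sketch to vector fluctuations would replace the correlation coefficient with covariance determinants.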

Comprehensive Folding Variations for Protein Folding

Jiaan Yang

and 10 more

February 16, 2022

Revealing protein folding is challenging in both discovery and description. Besides acquiring an accurate 3D structure of a protein's stable state, another major hurdle is discovering the structural flexibility that is innate to the protein. Even if a huge number of flexible conformations are known, the difficulty is how to describe them. A novel approach, the protein structure fingerprint, has been developed to expose the comprehensive local folding variations and then construct folding conformations for the entire protein. The backbone of 5 amino acid residues was identified as a universal folden, and a set of Protein Folding Shape Codes (PFSC) was then derived to completely cover the folding space in an alphabetic description. Subsequently, a database was created to collect all possible folding shapes of local folding variations for all permutations of 5 amino acids. Next, the Protein Folding Variation Matrix (PFVM) assembled all possible local folding variations along the sequence of a protein, and it possesses several prominent features. First, it showed fluctuation with certain folding patterns along the sequence, which revealed how protein folding is related to the order of amino acids in the sequence. Second, all folding variations for an entire protein can be apprehended simultaneously at a glance within the PFVM. Third, all conformations can be determined from the local folding variations in the PFVM, so the total number of conformations is no longer ambiguous for any protein. Finally, the most probable folding conformation and its 3D structure can be obtained from the PFVM for protein structure prediction. Therefore, the protein structure fingerprint approach provides a significant means of investigating the protein folding problem.

Mutation in MCL1 predicted loop to helix structural transition stabilizes MCL1-Bax bi...

Deepak Shyl ES

and 3 more

January 31, 2022

Myeloid cell leukemia-1 (MCL1), an anti-apoptotic BCL-2 family protein, plays a major role in the control of apoptosis as a regulator of mitochondrial permeability, which is deregulated in various solid and hematological malignancies. Interaction of the executioner proteins Bak/Bax with anti-apoptotic MCL1 and its cellular composition determines the apoptotic or survival pathway. This study highlights the deleterious MCL1-Bax stabilizing effect of the V220F mutation on MCL1 structure through computational protein-protein interaction predictions and molecular dynamics simulations. The single point mutation V220F was selected because it resides in the hydrophobic core region of the conserved BH3 domain, the site of Bax binding. The molecular dynamics simulations showed an increase in the stability of the mutated MCL1 before and after Bax binding compared with native MCL1. Clusters from the free energy landscape revealed structural variation in the folding pattern, with an additional helix near the BH3 domain in the mutated structure. This loop-to-helix structural change in the mutated complex favored stable interaction of the complex and also induced a conformational change in Bax. Moreover, molecular mechanics-based binding free energy calculations confirmed the increased affinity of Bax for mutated MCL1. Residue-wise interaction network analysis identified the individual residues in Bax binding that are responsible for the change in stability and interaction caused by the mutation. In conclusion, the overall findings of the study reveal that the V220F mutation in MCL1 is responsible for a structural conformational change leading to disruption of its biological functions, which might be responsible for tumorigenesis. The mutation could possibly be used as a future diagnostic marker in treating cancers.

Structural Evolution of the Ancient Enzyme, Dissimilatory Sulfite Reductase

Daniel R. Colman

and 6 more

January 29, 2022

Dissimilatory sulfite reductase is an ancient enzyme that has linked the global sulfur and carbon biogeochemical cycles since at least 3.47 Gya. While much has been learned about the phylogenetic distribution and diversity of DsrAB across environmental gradients, far less is known about the structural changes that occurred to maintain DsrAB function as the enzyme accompanied diversification of sulfate/sulfite reducing organisms (SRO) into new environments. Analyses of available crystal structures of DsrAB from Archaeoglobus fulgidus and Desulfovibrio vulgaris, representing early and late evolving lineages, respectively, show that certain features of DsrAB are structurally conserved, including active siro-heme binding motifs. Whether such structural features are conserved among DsrAB recovered from varied environments, including hot spring environments that host representatives of the earliest evolving SRO lineage (e.g., MV2-Eury), is not known. To begin to overcome these gaps in our understanding of the evolution of DsrAB, structural models from MV2-Eury were generated and evolutionary sequence co-variance analyses were conducted on a curated DsrAB database. Phylogenetically diverse DsrAB harbor many conserved functional residues including those that ligate active siro-heme(s). However, evolutionary co-variance analysis of monomeric DsrAB subunits revealed several False Positive Evolutionary Couplings (FPEC) that correspond to residues that have co-evolved despite being too spatially distant in the monomeric structure to allow for direct contact. One set of FPECs corresponds to residues that form a structural path between the two active siro-heme moieties across the interface between heterodimers, suggesting the potential for allostery or electron transfer within the enzyme complex. Other FPECs correspond to structural loops and gaps that may have been selected to stabilize enzyme function in different environments.
These structural bioinformatics results suggest that DsrAB has maintained allosteric communication pathways between subunits as SRO diversified into new environments. The observations outlined here provide a framework for future biochemical and structural analyses of DsrAB to examine potential allosteric control of this enzyme.

Dynamics and recognition of homeodomain containing protein-DNA complex of IRX4

Adil Malik

and 2 more

January 20, 2022

Iroquois Homeobox 4 (IRX4) belongs to a family of homeobox transcription factors (TFs) with roles in embryogenesis, cell specification and organ development. Recently, large-scale genome-wide association studies and epigenetic studies have highlighted the role of IRX4 and its associated variants in prostate cancer. No studies have investigated and characterized the structural aspects of the IRX4 homeodomain and its potential to bind to DNA. The current study uses sequence analysis, homology modelling and molecular dynamics simulations to explore IRX4 homeodomain-DNA recognition mechanisms and the role of somatic mutations affecting these interactions. Using publicly available databases, gene expression of IRX4 was found in different tissues, including prostate, heart, skin and vagina, and protein expression was found in cancer cell lines (HCT166, HEK293), B cells, ascitic fluid and brain. Sequence conservation of the homeodomain shed light on the importance of N- and C-terminal residues involved in DNA binding. The specificity of the IRX4 homodimer bound to a consensus human DNA sequence was confirmed by molecular dynamics simulations, which showed the role of conserved amino acids including R145, A194, N195, S190, R198 and R199 in binding to DNA. Additional N-terminal residues such as T144 and G143 were also found to have specific interactions, highlighting the importance of the N-terminus of the homeodomain in DNA recognition. Additionally, the effects of somatic mutations, including those of the conserved arginine residues (R145, R198 and R199), on DNA binding elucidated the importance of these residues in stabilizing the protein-DNA complex. Secondary structure and hydrogen bonding analysis showed the roles of specific residues (R145, T191, A194, N195, R198 and R199) in maintaining the homogeneity of the structure and its interaction with DNA.
The differences in relative binding free energies of all the mutants shed light on the structural modularity of this protein and the dynamics behind protein-DNA interaction. We also have predicted that the C-terminal sequence of the IRX4 homeodomain could act as a potential cell-penetrating peptide, emphasizing the role these small peptides could play in targeting homeobox TFs.

Protein folding and unfolding: proline cis-trans isomerization at the c subunits of...

Salvatore Nesci

January 12, 2022

The c subunits, which constitute the c-ring apparatus of the F1FO-ATPase, could be the main components of the mitochondrial permeability transition pore (mPTP). The well-known modulator of mPTP formation and opening is cyclophilin D (CyPD), a peptidyl-prolyl cis-trans isomerase. The loop that connects the two hairpin α-helices of the c subunit contains a unique proline residue (Pro 40) that could be a biological target of CyPD. Indeed, proline cis-trans isomerization might provide the switch that interconverts the open/closed states of the pore by pulling out the c-ring lipid plug.

Heterologous Overexpression of Sup35 in E. coli Leads to Both Monomer and Complex Sta...

Mingyang Wang

and 2 more

December 31, 2021

The heterologous overexpression states of prion proteins play a critical role in understanding the mechanisms of prion-related diseases. We report herein the identification of soluble monomer and complex states for a baker's yeast prion, Sup35, when expressed in E. coli. Two peaks are apparent upon elution of His-tagged Sup35 by imidazole from a Ni2+ affinity column. Peak I contains Sup35 in both monomer and aggregated states. The Sup35 aggregate is abbreviated as C-aggregate and includes a non-fibril complex comprising Sup35 aggregate-HSP90-DnaK, ATP synthase β unit (chain D), 30S ribosome subunit, and OmpF. The purified monomer and C-aggregate can remain stable for an extended period of time. Peak II also contains Sup35 in both monomer and aggregated (abbreviated as S-aggregate) states, but the aggregated states are caused by the formation of inter-Sup35 disulfide bonds. This study demonstrates that further assembly of the Sup35 non-fibril C-aggregate can be interrupted by the chaperone repertoire system in E. coli.

Plastics degradation by hydrolytic enzymes: the Plastics-Active Enzymes Database - PA...

Patrick Buchholz

and 6 more

December 11, 2021

Petroleum-based plastics are durable and accumulate in all ecological niches. Knowledge of their enzymatic degradation is sparse. Today, fewer than 50 verified plastics-active enzymes are known. First examples of enzymes acting on the polymers polyethylene terephthalate (PET) and polyurethane (PUR) have been reported together with a detailed biochemical and structural description. Further, very few polyamide (PA) oligomer-active enzymes are known. In this paper, the currently known enzymes acting on the synthetic polymers PET and PUR are briefly summarized, and their published activity data were collected and integrated into a comprehensive open access database. The Plastics-Active Enzymes Database (PAZy) represents an inventory of known and experimentally verified plastics-active enzymes. Almost 3000 homologues of PET-active enzymes were identified by profile hidden Markov models. Over 2000 homologues of PUR-active enzymes were identified by BLAST. Based on multiple sequence alignments, conservation analysis identified the most conserved amino acids, and sequence motifs for PET- and PUR-active enzymes were derived.

IGF-dependent dynamic modulation of a protease cleavage site in the intrinsically dis...

Garima Jaipuria

and 10 more

November 24, 2021

Functional regulation via conformational dynamics is well known in structured proteins, but less well characterized in intrinsically disordered proteins and their complexes. Using NMR spectroscopy we have identified a dynamic regulatory mechanism in the human insulin-like growth factor (IGF) system involving the central, intrinsically disordered linker domain of human IGF-binding protein-2 (hIGFBP2). The bioavailability of IGFs is regulated by the proteolysis of IGF-binding proteins. In the case of hIGFBP2, the linker domain (L-hIGFBP2) retains its intrinsic disorder upon binding IGF-1 but its dynamics are significantly altered, both in the IGF binding region and distantly located protease cleavage sites. The increase in flexibility of the linker domain upon IGF-1 binding may explain the IGF-dependent modulation of proteolysis of IGFBP2 in this domain. As IGF homeostasis is important for cell growth and function, and its dysregulation is a key contributor to several cancers, our findings open up new avenues for the design of IGFBP analogs inhibiting IGF-dependent tumors.