Alex Carlin edited Introduction_Protein_stability_data_is__.md  almost 8 years ago

Commit id: 0a85f8a2164cd0ee694d0060bc2bcd03b0e16537


We have previously shown the ability to predict the effects of single point mutations on the enzyme kinetic parameters kcat, KM, and kcat/KM using a combination of molecular models and machine learning. Here, we expand on our previous work by characterizing the functional denaturation of 120 single point mutants of β-glucosidase B (BglB), a family 1 glycoside hydrolase.

To understand how an enzyme functions, it is necessary to tally the contribution of each residue in the sequence to a variety of functional parameters. Questions such as whether most residues in a protein do not contribute to catalysis can only be answered by obtaining data on the functional effects of mutations at those residues. Each residue plays a role in determining the enzyme’s functional parameters and also contributes to protein stability. However, experimentally characterizing all possible single point mutations alone (over 8880) is an endeavour too costly to undertake for any ordinary enzyme, let alone for the hundreds of thousands with known crystal structures for which these determinations would be feasible. Therefore, it will be necessary to build computational models that predict the effect of mutations on an enzyme’s binding affinity for a particular ligand, its efficiency in transforming that substrate into product, and its overall catalytic efficiency (kcat/KM).

Furthermore, a large number of human diseases are caused by missense mutations in crucial metabolic enzymes that result in non-functional protein. Predicting the effect of a single point mutation would thus be useful in clinical medicine for the diagnosis of disease.

Here, we produce and purify 120 mutants of BglB and assess soluble expression for each protein. For those that are expressed and have enzymatic activity above our assay’s limit of detection (1 turnover per min), we determine the melting temperature using an activity-based assay (64 mutants).
We use structural features from Rosetta to train an SVM classifier and achieve a prediction accuracy of 0.84 (Pearson correlation coefficient, PCC), allowing comparison to other machine-learning-based algorithms for predicting protein thermal stability.
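
For readers unfamiliar with the metric, the following is a minimal sketch of how a Pearson correlation coefficient between measured and predicted values can be computed. The function name `pearson_r` and the two short lists of numbers are hypothetical and purely illustrative; they are not data or code from this study, and the sketch does not reproduce the Rosetta feature generation or the SVM training.

```python
import math


def pearson_r(measured, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(measured) != len(predicted) or len(measured) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_m = sum(measured) / len(measured)
    mean_p = sum(predicted) / len(predicted)
    # Covariance and variances (unnormalized; the 1/n factors cancel in the ratio).
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_m == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_m * var_p)


# Made-up values for illustration only; NOT data from this study.
measured_tm = [39.5, 41.0, 38.2, 40.1, 42.3]
predicted_tm = [39.9, 40.6, 38.8, 40.0, 41.7]
print(round(pearson_r(measured_tm, predicted_tm), 2))
```

A value near 1 indicates that predictions rise and fall with the measurements; it does not by itself show that the predicted values are close in absolute terms.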