Measurements of protein expression and thermal stability for 114 single mutants of a glycoside hydrolase allow evaluation of stability predictions

Author contributions (alphabetical by last name)

  • Dylan Alexander Carlin [2]: molecular cloning, designed experiments, wrote software used in analysis, analyzed data, Rosetta modeling, FoldX modeling, machine learning, wrote paper
  • Ryan Caster [1]: characterized expression for mutants
  • Bill Chan [1]: characterized Tm for mutants, analyzed data, contributed to paper
  • Natalie Damrau [1]: characterized mutants
  • Siena Hapig-Ward [1]: characterized Tm and kinetic constants, analyzed data, drew figures, contributed to paper
  • Mary Riley [1]: characterized mutants
  • Justin B. Siegel [1,3,4]: PI

Author affiliations:

  1. Genome Center, University of California, Davis CA, USA
  2. Biophysics Graduate Group, University of California, Davis CA, USA
  3. Department of Chemistry, University of California, Davis CA, USA
  4. Department of Biochemistry & Molecular Medicine, University of California, Davis CA, USA

Subject areas: biochemistry, computational biology, machine learning

Keywords: enzyme, Rosetta, thermal stability



Introduction

  • background

  • current data sets and problems gathering them

  • current computational approaches

  • summary of what we did

  • Figure 1: all positions selected in study


Methods

  • cloning and mutagenesis
  • production and purification
  • assay and data analysis
  • computational modeling and machine learning


Results

  • summary of all mutants, wild type values, limits of detection

  • mutants less thermostable, mutants more thermostable

  • Figure 2: heatmap with expression, Tm, kcat, Km, kcat/Km for each mutant

  • Figure 3: drawings of discussed mutants W120F, N404C, H178A, E222H

  • Figure 4: crystal structure with residues colored per change in Tm compared to wild type

  • Rosetta structural features predict expression, but not Tm

  • Figure 5: machine learning model evaluation


Discussion

  • summary of what we showed and connections to background
  • implications for biotechnology
  • implications for human health
  • conclusion

Supplemental information

  • data table with columns name, expression, tm, kcat, km, kcat/km, plus error estimates for the measured values where an error applies
  • gel image of each protein used in study
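
A minimal sketch of how the supplemental data table might be used to compute each mutant's change in Tm relative to wild type (the quantity colored in Figure 4). The column names name, expression, and tm come from the outline above. Everything else is an assumption made for illustration: the CSV layout, the "WT" label for the wild-type row, the blank tm for a mutant with no measurable Tm, and every value in the table. The rows are placeholders, not measurements from this study.

```python
import csv
import io

# Hypothetical placeholder rows, NOT data from the study.
# Assumed layout: one row per protein, blank tm when no Tm could be measured.
TABLE = """name,expression,tm
WT,1,40.0
mutant_1,1,42.5
mutant_2,1,37.0
mutant_3,0,
"""


def delta_tm(rows, wild_type="WT"):
    """Return {name: Tm - wild-type Tm} for every mutant with a Tm value."""
    # Keep only rows whose tm field is non-empty.
    tms = {row["name"]: float(row["tm"]) for row in rows if row["tm"]}
    reference = tms[wild_type]
    return {
        name: tm - reference
        for name, tm in tms.items()
        if name != wild_type
    }


rows = list(csv.DictReader(io.StringIO(TABLE)))
changes = delta_tm(rows)

# List mutants from most destabilized to most stabilized.
for name, change in sorted(changes.items(), key=lambda item: item[1]):
    print(f"{name}: {change:+.1f}")
```

Mutants without a Tm value are left out of the result rather than assigned a placeholder, so they can be reported separately (for example, alongside the limits of detection).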