Discussion (needs a complete rewrite)

Summary of our findings

Here, we report the thermal stability and soluble protein expression of 120 mutants of BglB. This data set was enabled by high-throughput, robotically automated molecular biology tools for site-directed mutagenesis, as well as open-source laboratory protocols. We describe the use of this data set to train machine learning models that predict soluble protein expression with an accuracy of X and thermal melting temperature with a Pearson correlation coefficient (PCC) of X.
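For reference, the two evaluation metrics named above can be computed as in the following minimal sketch. It assumes that expression is encoded as a binary label (1 = soluble, 0 = not soluble) and that melting temperatures are plain floats. The function names are illustrative and are not taken from our analysis code.

```python
import math

def accuracy(y_true, y_pred):
    """Fraction of predicted labels that match the observed labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def pearson_r(x, y):
    """Pearson correlation coefficient (PCC) between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Covariance and variances, left unnormalized because the 1/n factors cancel.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("PCC is undefined when either input is constant")
    return cov / math.sqrt(var_x * var_y)
```

Accuracy applies to the binary expression labels, while the PCC compares predicted and measured melting temperatures and is insensitive to a constant offset or scale between them.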

Implications of our findings for biotechnology

In biotechnology applications, modest predictive power is often sufficient. Because large numbers of mutations are tried, it is enough to generate hypotheses distributed around the correct mutation, as long as the predictions are accurate enough to be better than random. Still, the best we can do is better than current approaches X, Y, and Z. Describe how what we did is better.
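The argument that better-than-random predictions pay off when many mutations are tried can be made concrete with a short calculation. The sketch below assumes that each tested mutation is an independent trial with a fixed probability of being beneficial. That is a simplification, and the hit rates in the usage note are hypothetical, not measured values.

```python
def p_at_least_one_hit(hit_rate, n_tried):
    """Probability that at least one of n_tried mutations is beneficial,
    assuming independent trials with the same per-mutation hit rate."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must lie in [0, 1]")
    if n_tried < 0:
        raise ValueError("n_tried must be non-negative")
    return 1.0 - (1.0 - hit_rate) ** n_tried

# Hypothetical comparison: random selection versus model-guided selection.
random_rate, guided_rate, n = 0.02, 0.10, 20
p_random = p_at_least_one_hit(random_rate, n)
p_guided = p_at_least_one_hit(guided_rate, n)
```

Under these assumed rates, a modest improvement in the per-mutation hit rate compounds over the number of mutations screened, which is why mild predictive power can still be useful in this setting.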

Implications of our findings for human health

In predicting disease state from genomic sequence, which is relevant to human health, it is advisable to minimize both false positives, which create unnecessary cost and alarm patients, and false negatives, which cause important diagnoses to be missed. Unfortunately, our current approach falls short of this standard, but we propose that it could be extended in the future to give more accurate predictions through larger data sets and improved predictive modeling.
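The two error types discussed above can be quantified as a false positive rate and a false negative rate. The following is a minimal sketch, assuming binary labels where 1 denotes the disease-associated state. The function name and the example labels are illustrative only.

```python
def error_rates(y_true, y_pred):
    """Return (false_positive_rate, false_negative_rate) for binary labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("inputs must have equal length")
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1  # healthy case flagged as disease
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1  # disease case that was missed
    # A rate is undefined (NaN) when its reference class is absent.
    fpr = fp / (fp + tn) if (fp + tn) else float("nan")
    fnr = fn / (fn + tp) if (fn + tp) else float("nan")
    return fpr, fnr

# Illustrative labels only; these are not data from this study.
fpr, fnr = error_rates([1, 1, 0, 0, 0], [1, 0, 1, 0, 0])
```

Reporting both rates separately, rather than a single accuracy figure, keeps the cost of each error type visible.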