- (add anything from the above section we didn't finish in time for this report submission)
- Evotuning loss does not correlate with top-model fitting performance. (Note: top-model fitting performance does not necessarily correlate with DE performance either, because the landscapes of our train and test sets are NOT going to be the same as the landscape we face once sequences carry multiple mutations.) TODO: run many of these trials across a range of evotuned weights for 2MS2, evaluating performance with the DOUBLE MUTANT DATA as the test set and summarizing the results as box-and-whisker plots.
- Try retraining the weights from scratch with jax-unirep to see whether it makes a difference.
- Further optimize the jax-unirep model. Right now we train with non-fixed batch sizes, which could be optimized.
- The mLSTM could be replaced by a transformer!
- Most importantly, we need to do wet lab testing of our proposed mutant sequences.
- Add insertions and deletions as possible mutations in the directed evolution script. These happen all the time in evolution, so why shouldn't they be simulated as part of our algorithm? This would be a simple change, but it is impossible to validate the impact of adding this feature without experimental testing.
- Conclusion: we believe the biggest contribution of this work is our clean, open-source pipeline, and we hope researchers will find it helpful and want to use it in their work. If you would like to see anything modified or added, please reach out!
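
The insertion and deletion moves suggested in the list above could look roughly like the sketch below. It is only an illustration under stated assumptions: `propose_mutant`, `p_insert`, and `p_delete` are hypothetical names and defaults chosen for this example, not part of the existing directed evolution script, and the move probabilities are placeholders that would need tuning.

```python
import random

# The 20 canonical amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def propose_mutant(sequence, rng, p_insert=0.1, p_delete=0.1):
    """Return `sequence` with one substitution, insertion, or deletion applied.

    Hypothetical helper: the name and the default move probabilities are
    assumptions made for illustration only.
    """
    if not sequence:
        raise ValueError("sequence must not be empty")
    roll = rng.random()
    if roll < p_insert:
        # Insertion: place a random residue at any position, including the ends.
        position = rng.randrange(len(sequence) + 1)
        return sequence[:position] + rng.choice(AMINO_ACIDS) + sequence[position:]
    position = rng.randrange(len(sequence))
    if roll < p_insert + p_delete and len(sequence) > 1:
        # Deletion: drop one residue, but never empty the sequence.
        return sequence[:position] + sequence[position + 1:]
    # Substitution: swap in a different amino acid at `position`.
    choices = [aa for aa in AMINO_ACIDS if aa != sequence[position]]
    return sequence[:position] + rng.choice(choices) + sequence[position + 1:]
```

Passing in a seeded `random.Random` keeps runs reproducible. Because these proposals change the sequence length, whatever scores the mutants has to accept variable-length input.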
Software Repository