3. Machine learning: A data-driven approach
Fermentation is a multivariate system in which many parameters can influence the process outcome [24]. As outlined in the previous section, mechanistic models (e.g., CBM in this review) can help fine-tune some fermentation parameters, such as medium composition. Nevertheless, mechanistic approaches cannot capture the effect of every fermentation parameter on productivity. Strictly experimental trial-and-error methods, on the other hand, are time-intensive and often costly. Despite these limitations of traditional techniques, the large amount of data generated by previous fermentation studies provides a suitable basis for data-driven modeling approaches that search for optimal sets of fermentation parameters. Moreover, a rational analysis of large and complex datasets generated from experiments, measurements, and simulations can contribute significantly to an in-depth understanding of the system of interest [74].
Machine learning (ML) is a data-driven approach that uses statistics and probability theory to analyze a dataset, discover hidden relationships in the data that explain a phenomenon, and build a predictive model from the patterns it has learned. In the past, researchers did not distinguish between ML and artificial intelligence (AI), but ML is now recognized as a subfield of AI [75]. In this view, AI is concerned with developing the tools and techniques for ML, while ML applies these tools in fields such as engineering and science [76]. The ML process consists of four steps. First, a problem is defined on a dataset. Second, a set of preprocessing operations is applied to the dataset according to the defined problem. Third, the ML model is built with a user-selected estimator. Finally, the model is validated and evaluated with standard techniques. Figure 2 shows the general scheme of the machine learning workflow.
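The four workflow steps described above can be illustrated with a minimal sketch. It is not taken from any of the reviewed studies: the dataset is synthetic, the feature names (temperature, pH, glucose concentration) and the response surface are invented for illustration, and the choice of scikit-learn, standard scaling, and a random forest regressor is an assumption rather than a recommendation.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def make_synthetic_fermentation_data(n_samples=200, seed=0):
    """Hypothetical data: temperature (deg C), pH, glucose (g/L) -> titer."""
    rng = np.random.default_rng(seed)
    temperature = rng.uniform(25.0, 40.0, n_samples)
    ph = rng.uniform(4.0, 8.0, n_samples)
    glucose = rng.uniform(5.0, 50.0, n_samples)
    X = np.column_stack([temperature, ph, glucose])
    # Invented response surface (optimum near 33 deg C and pH 6) plus noise.
    y = (
        10.0
        - 0.05 * (temperature - 33.0) ** 2
        - 0.8 * (ph - 6.0) ** 2
        + 0.1 * glucose
        + rng.normal(0.0, 0.3, n_samples)
    )
    return X, y


def run_workflow(seed=0):
    # Step 1: define the problem (regression: predict titer from parameters).
    X, y = make_synthetic_fermentation_data(seed=seed)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )

    # Steps 2 and 3: preprocessing (feature scaling) followed by a
    # user-selected estimator, combined in one pipeline.
    model = make_pipeline(
        StandardScaler(),
        RandomForestRegressor(n_estimators=100, random_state=seed),
    )

    # Step 4: validate with 5-fold cross-validation on the training set,
    # then evaluate once on the held-out test set.
    cv_r2 = cross_val_score(model, X_train, y_train, cv=5, scoring="r2")
    model.fit(X_train, y_train)
    test_r2 = r2_score(y_test, model.predict(X_test))
    return cv_r2.mean(), test_r2


if __name__ == "__main__":
    mean_cv_r2, test_r2 = run_workflow()
    print(f"mean CV R^2: {mean_cv_r2:.2f}, test R^2: {test_r2:.2f}")
```

Wrapping the scaler and the estimator in a single pipeline means the scaling statistics are recomputed inside each cross-validation fold, so no information from the validation folds leaks into preprocessing. With real fermentation data, the synthetic generator would be replaced by the experimental dataset, and the estimator would be chosen to suit the problem.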