Figure 2. The basic concept of probability P. Black dots are known data, red dots are unknown data, the black line shows the values predicted by the GPR method, and the blue lines are the normal distributions for the unknown data.
Proposed method
We propose a process design method based on Bayesian optimization, as outlined in Figure 3. First, the EO process is designed, and one million candidates of X are generated from random numbers. Then, 50 candidates of X are selected by D-optimal design, and the EO plant is simulated with those candidates, yielding Y values. A GPR model is constructed between X and Y from these 50 samples, and P, defined by Eq. (12), is calculated from the model output. Because there are multiple Y variables, the acquisition function could be taken as the product of P over all Y variables. However, a product of values less than one becomes very small; therefore, the acquisition function is instead defined as the sum of the log(P) values. In addition, because the target ranges of the Y variables differ, each P is scaled using the minimum and maximum of the probabilities of falling in the target range computed for the GPR training samples, \(P(\mathbf{x}_{\text{train}})\). The resulting acquisition function \(P_{\text{all}}\) is calculated as follows:
\(P_{\text{all}}\left(\mathbf{x}_{\text{new}}\left(i\right)\right)=\log\left(\frac{P\left(\mathbf{x}_{\text{new}}\left(i\right)\right)_{1}-\min\left(P\left(\mathbf{x}_{\text{train}}\right)_{1}\right)}{\max\left(P\left(\mathbf{x}_{\text{train}}\right)_{1}\right)-\min\left(P\left(\mathbf{x}_{\text{train}}\right)_{1}\right)}\right)+\cdots+\log\left(\frac{P\left(\mathbf{x}_{\text{new}}\left(i\right)\right)_{k}-\min\left(P\left(\mathbf{x}_{\text{train}}\right)_{k}\right)}{\max\left(P\left(\mathbf{x}_{\text{train}}\right)_{k}\right)-\min\left(P\left(\mathbf{x}_{\text{train}}\right)_{k}\right)}\right)\) (13)
where k is the number of Y variables. The EO plant is simulated with the candidates of X having the highest values of \(P_{\text{all}}\), and the Y values are obtained. If the Y values do not achieve the targets, the GPR model is updated using the simulation results, and this flow is repeated until all Y values fall within their target ranges.
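The scaling and summation in Eq. (13) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that P for each Y variable is the probability that a normal predictive distribution (GPR mean and standard deviation) falls inside a target range, as suggested by Figure 2. The function names `prob_in_range` and `p_all`, the small floor `eps` that keeps the logarithm and the division defined, the choice of a single top candidate, and all numbers are hypothetical and introduced only for illustration.

```python
import math
from statistics import NormalDist


def prob_in_range(mean, std, lower, upper):
    """Probability that a normal prediction N(mean, std^2) lies in [lower, upper].

    Assumption: this is how P is obtained from the GPR mean and standard deviation.
    """
    dist = NormalDist(mean, max(std, 1e-12))  # floor avoids a zero-width distribution
    return dist.cdf(upper) - dist.cdf(lower)


def p_all(p_new, p_train, eps=1e-12):
    """Sum of the logs of min-max-scaled probabilities over k objectives (Eq. 13).

    p_new   : k probabilities for one candidate, one per Y variable
    p_train : k lists, each holding the probabilities of the training samples
    eps     : assumed floor so that log() and the division stay defined
    """
    total = 0.0
    for p, p_tr in zip(p_new, p_train):
        lo, hi = min(p_tr), max(p_tr)
        scaled = (p - lo) / max(hi - lo, eps)
        total += math.log(max(scaled, eps))
    return total


# Hypothetical numbers for k = 2 objectives and three training samples.
p_train = [[0.1, 0.5, 0.3], [0.2, 0.4, 0.6]]
candidates = {"A": [0.5, 0.6], "B": [0.3, 0.4]}

# Score every candidate and pick the one with the highest acquisition value.
scores = {name: p_all(p, p_train) for name, p in candidates.items()}
best = max(scores, key=scores.get)
```

In this toy example, candidate A matches the best training probability for both objectives, so each scaled term equals one and its score is zero. Candidate B sits halfway between the training minimum and maximum for both objectives and scores lower, so A would be the next point to simulate. A candidate whose probability does not exceed the training minimum makes the logarithm in Eq. (13) undefined; the sketch clips such terms at `eps`, which is an assumption and not something stated in the text.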