Figure 2. The basic concept of probability P. Black dots are known
data, red dots are unknown data, the black line shows the values predicted by
the GPR method, and the blue lines are normal distributions for the unknown
data.
Proposed method
We propose a process design method based on Bayesian optimization, as
outlined in Figure 3. First, the EO process is designed, and
one million candidates of X are generated from random numbers. Then, 50
candidates of X are selected using D-optimal design, and the EO plant
is simulated with those candidates, yielding Y values. The GPR model is
constructed between X and Y with the 50 samples, and P, defined by Eq.
(12), is calculated from the model output. Because there are
multiple Y variables, a natural acquisition function is the product of P over all Y variables. However, multiplying values less than one
yields very small values; therefore, the sum of the log(P) values is
used instead. In addition, because the target ranges
of Y differ, each P is scaled by the minimum and maximum of the
probabilities in the target range for the GPR model training samples
[P(x_train)]. This acquisition
function, P_all, is calculated as follows:
\(P_{\text{all}}\left(\mathbf{x}_{\text{new}}(i)\right)=\log\left(\frac{P\left(\mathbf{x}_{\text{new}}(i)\right)_{1}-\min\left(P\left(\mathbf{x}_{\text{train}}\right)_{1}\right)}{\max\left(P\left(\mathbf{x}_{\text{train}}\right)_{1}\right)-\min\left(P\left(\mathbf{x}_{\text{train}}\right)_{1}\right)}\right)+\cdots+\log\left(\frac{P\left(\mathbf{x}_{\text{new}}(i)\right)_{k}-\min\left(P\left(\mathbf{x}_{\text{train}}\right)_{k}\right)}{\max\left(P\left(\mathbf{x}_{\text{train}}\right)_{k}\right)-\min\left(P\left(\mathbf{x}_{\text{train}}\right)_{k}\right)}\right)\) (13)
where k is the number of Y variables. The EO plant is then simulated with the
candidates of X having the highest values of P_all, and Y values are obtained. If the Y values
do not achieve the targets, the GPR model is updated using the simulation
results, and this flow is repeated until all Y values fall
within their target ranges.
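
As an illustration of Eq. (13), the following is a minimal Python sketch of how the scaled log-sum acquisition function could be evaluated and used to rank candidates. It assumes that the probabilities P have already been computed from the GPR model for each candidate and each training sample; the function name `p_all`, the array layout, and the small floor `eps` (used to keep the logarithm finite when a scaled probability is zero or negative) are assumptions made for this sketch and are not part of the method as described.

```python
import numpy as np


def p_all(p_new, p_train, eps=1e-12):
    """Sum of log min-max-scaled probabilities, following Eq. (13).

    p_new:   array (n_candidates, k), P for each candidate and Y variable
    p_train: array (n_train, k), P for the GPR training samples
    eps:     hypothetical floor (an assumption of this sketch) so that
             log() stays finite when a scaled value is <= 0
    """
    p_new = np.asarray(p_new, dtype=float)
    p_train = np.asarray(p_train, dtype=float)

    # Per-Y-variable minimum and maximum over the training samples.
    # This sketch assumes max > min for every Y variable.
    p_min = p_train.min(axis=0)
    p_max = p_train.max(axis=0)

    # Min-max scaling of each candidate's P, one column per Y variable.
    scaled = (p_new - p_min) / (p_max - p_min)

    # Sum of logs over the k Y variables gives one score per candidate.
    return np.log(np.clip(scaled, eps, None)).sum(axis=1)


# Toy values for illustration only (k = 2 Y variables).
p_train = np.array([[0.10, 0.20], [0.50, 0.60], [0.30, 0.40]])
p_new = np.array([[0.30, 0.40], [0.50, 0.60], [0.10, 0.60]])

scores = p_all(p_new, p_train)
best = int(np.argmax(scores))  # index of the candidate to simulate next
print(scores, best)
```

In this toy case the third candidate has a high P for the second Y variable but the lowest possible scaled P for the first, so the log-sum penalizes it heavily. This reflects the intent of combining the probabilities so that a candidate must be promising for every Y variable at once.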