Figure 5: Elbow method for choosing the number of clusters with k-means
clustering (left hand side); median and slope plot with resulting
clusters (right hand side).
Implementation
As shown in Figure 2, the distillation PEA is equipped with a
PLC as its internal control unit. An OPC UA server runs on this
PLC and publishes all process-relevant variables. The modular concept
with OPC UA thus provides a standardized server interface to which the
ML algorithm can be adapted. The ML tool, developed in Python, uses an
OPC UA client from the Python package freeopcua, which reads the process
variables every second and collects them in a data frame. This data
frame is then processed as described above: the data is pre-processed
according to chapter 3.1, and the developed models are applied to it.
The models provide two outputs: a prediction of the pressure for the
next 20 seconds and, derived from this prediction, a classification of
the current operating status. The structure of the ML forecast
implementation and the results plotted in a real-time diagram are
shown in Figure 2.
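The one-second polling step described above could look roughly like the following minimal sketch. It assumes the `opcua` module of the freeopcua project (python-opcua). The endpoint URL, the node identifiers, the buffer length of 600 rows and the helper names `make_opcua_reader` and `poll` are hypothetical placeholders, not taken from the actual implementation.

```python
import time
from collections import deque


def make_opcua_reader(endpoint_url, node_ids):
    """Connect to an OPC UA server and return (read, close) callables.

    endpoint_url and node_ids (name -> node id string) are placeholders;
    the real values depend on the PLC of the PEA.
    """
    from opcua import Client  # lazy import: only needed against a real server

    client = Client(endpoint_url)
    client.connect()
    nodes = {name: client.get_node(node_id) for name, node_id in node_ids.items()}

    def read():
        # One synchronous read per process variable.
        return {name: node.get_value() for name, node in nodes.items()}

    return read, client.disconnect


def poll(read, n_samples, period_s=1.0, buffer=None,
         sleep=time.sleep, clock=time.time):
    """Call read() n_samples times, period_s apart, storing timestamped rows."""
    # Assumed default: keep the most recent 600 rows (10 minutes at 1 Hz).
    rows = buffer if buffer is not None else deque(maxlen=600)
    for i in range(n_samples):
        row = {"timestamp": clock()}
        row.update(read())
        rows.append(row)
        if i < n_samples - 1:
            sleep(period_s)
    return rows
```

The collected rows can be turned into a data frame with `pandas.DataFrame(list(rows))`. Passing `read`, `sleep` and `clock` as arguments keeps the loop testable without a running server.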
The diagram shows the measured pressure difference, the filtered
pressure curve and the predicted future pressure difference. The
current operating status is displayed as text above the diagram,
informing the operator when flooding occurs.
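One conceivable way to derive the operating status from the predicted window is sketched below. It assumes that the status is assigned by the nearest cluster centre in the (median, slope) plane of Figure 5, as a fitted k-means model would do. The centroid values, the status names and the scaling factors are purely hypothetical and are not taken from the experiments.

```python
from statistics import median


def window_features(values, dt_s=1.0):
    """Median and least-squares slope (per second) of an evenly sampled window."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    t_mean = (n - 1) * dt_s / 2.0
    v_mean = sum(values) / n
    num = sum((i * dt_s - t_mean) * (v - v_mean) for i, v in enumerate(values))
    den = sum((i * dt_s - t_mean) ** 2 for i in range(n))
    return median(values), num / den


def classify_status(values, centroids, scales=(1.0, 1.0), dt_s=1.0):
    """Return the status whose centroid is nearest in scaled (median, slope)."""
    med, slope = window_features(values, dt_s)

    def dist2(c):
        return ((med - c[0]) / scales[0]) ** 2 + ((slope - c[1]) / scales[1]) ** 2

    return min(centroids, key=lambda name: dist2(centroids[name]))


# Hypothetical centroids as (median, slope) in arbitrary units.
EXAMPLE_CENTROIDS = {
    "normal operation": (5.0, 0.0),
    "approaching flooding": (10.0, 0.5),
    "flooding": (20.0, 0.0),
}
```

Because median and slope live on different scales, the `scales` argument would normally be set to the standard deviations used when the clusters were fitted; otherwise one feature dominates the distance.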
Distillation experiments and optimization
For validation, a test procedure is carried out in which the column
is flooded several times. The flooding is deliberately induced by
varying different parameters in order to check whether the influence of
all parameters on the flooding behavior is reliably captured. All three
trained algorithms (gradient boost, extra trees, AdaBoost + extra trees)
were able to detect and reliably display the flooding behavior. The
accuracies of the models on the validation data, expressed as the
coefficient of determination, are R² = 0.878 for gradient boost,
R² = 0.853 for extra trees and R² = 0.857 for AdaBoost + extra trees.
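For reference, the coefficient of determination compares the squared prediction error with the variance of the measured values, R² = 1 − SS_res / SS_tot. The following minimal sketch shows the calculation; the function name `coefficient_of_determination` and the sample values are illustrative only and do not reproduce the validation data.

```python
def coefficient_of_determination(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot for two equally long sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the model.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined for constant y_true")
    return 1.0 - ss_res / ss_tot
```

A value of 1 means a perfect prediction, while 0 means the model is no better than predicting the mean of the measured values. Since R² aggregates the error over the whole validation run, two models with similar values can still produce quite differently shaped prediction curves.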
The results of the flooding detection for the trained and selected
models are shown in Figure 7. All three models have similar accuracies,
but the resulting prediction curves differ significantly. The combined
AdaBoost and extra trees regression model produces a strongly
fluctuating prediction, which makes the forecast difficult to evaluate.
The pure extra trees regression is much smoother, but its prediction is
too flat and struggles to follow the current pressure curve. In
contrast, the gradient boost model shows a significantly more reactive
response while still smoothing the prediction curve sufficiently, which
makes it the most suitable of the three models investigated.