Final prediction (top) and output states (bottom) of the training algorithm on a single batch, after fixing the initialization and exponentiation of the standard deviation parameters sx and sy. The top plot also draws the covariance ellipses at the prediction points, but they are barely visible. In the bottom plot, each row is an output state; rows are stacked in step order, with the state after the first step at the top, followed by the states from the 19 subsequent steps. The plot shows how the states activate to drive particular behavior in the output.