Background
In 2017, Marco worked with Irtiza on social modeling of pedestrian trajectories in crowds using modified LSTMs. They also analyzed an unexpected performance boost obtained by mistakenly duplicating input data dimensions. It is my belief that the gains were illusory, largely because the Vanilla implementation (the baseline experimental setup) was performing very poorly. The code base is a modified version of \cite{vvanirudhsocial-lstm-tf}, an unofficial implementation of the reference methodology in \cite{Alahi_2016}; comments hidden in the code also reveal that it does not perform as well as the official version.
The Vanilla LSTM
Recurrent Neural Networks (RNNs) are used to capture dynamic temporal behavior in time sequences. Unlike feed-forward networks, they have a recurrent connection that feeds the output of the previous step back into the current one. They can also maintain an internal memory state that, together with the current input, informs the network's output at each step.
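As a reference point for what follows, a minimal sketch of this recurrence is the hidden-state update of a plain RNN (the symbols here are illustrative and are not taken from the code base):
\[
h_t = \tanh\left(W_{xh}\, x_t + W_{hh}\, h_{t-1} + b_h\right),
\]
where $x_t$ is the input at step $t$, $h_{t-1}$ is the hidden state from the previous step, and $W_{xh}$, $W_{hh}$, $b_h$ are learned parameters. The LSTM refines this update with gated memory cells that control what is written to, retained in, and read from the internal state.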