Within the field of reinforcement learning with deep neural network estimators, this research proposal is concerned with investigating and then augmenting or replacing the Experience Replay mechanism found in most of the state-of-the art methods with shallow and flexibleaction-conditional models of the environment working not in the pixel space but in at the level of low-dimensional intermediate representations. I plan approaching this task informed by the recent results in conditional generative methods.

Introduction