Work Plan
Running reinforcement learning experiments on medium-sized environments such as ALE is known to be time-consuming because of high sample complexity, high experimental variance, and the difficulty of parallelization. The plan therefore allocates time for developing and testing efficient, distributed code for experimentation.
First semester:
- Finish the in-progress work on the codebase required for this research and generate baselines on a variety of environments.
- Review the literature on deep generative models.
- Review the literature on model-based reinforcement learning.
Second semester:
- Perform the first experiments on learning action-conditional models offline, using, for example, data gathered by a fully trained agent.
- Informed by these experiments, evaluate how well suited the various generative models are to the reinforcement learning setting.