Sven Schmit edited Models.tex  over 9 years ago

Commit id: 35e4a046a88808f27d4c7304f15618bb23126190

Although the above setup leads to a fully deterministic simulation, we want to attack the problems using reinforcement learning algorithms.  The state space, as is, is infinite, and the physics are complicated.  Especially with multiple sheep, modeling the world above as an MDP is problematic, as the new state is difficult to compute given an action.  Instead, we prefer to featurize the state space and work with these approximations.  Note the difference between this approach and methods where features are used to generalize effects of transitions.  In some way, all we have is the featurized state.
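As a rough illustration of what featurizing the state space could look like, the sketch below maps a raw world state (dog, sheep, and target positions in the plane) to a small feature vector. The particular features chosen here (distance to target, mean sheep-to-target distance, flock spread) are illustrative assumptions, not a definitive feature set.

```python
import math

def featurize(dog, sheep, target):
    """Map a raw state to a short feature vector.

    dog, target: (x, y) tuples; sheep: list of (x, y) tuples.
    The features below are hypothetical examples, not a fixed choice.
    """
    feats = []
    # Feature 1: distance from the dog to the target.
    feats.append(math.dist(dog, target))
    # Feature 2: mean distance of the sheep to the target
    # (how close the flock is to being herded home).
    feats.append(sum(math.dist(s, target) for s in sheep) / len(sheep))
    # Feature 3: flock spread, measured as mean distance of the
    # sheep to their own centroid.
    cx = sum(s[0] for s in sheep) / len(sheep)
    cy = sum(s[1] for s in sheep) / len(sheep)
    feats.append(sum(math.dist(s, (cx, cy)) for s in sheep) / len(sheep))
    return feats
```

A learning algorithm would then operate on this fixed-length vector rather than on the full simulation state, which sidesteps the difficulty of computing exact transitions.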