Sven Schmit edited Models.tex
over 9 years ago
Commit id: 35e4a046a88808f27d4c7304f15618bb23126190
diff --git a/Models.tex b/Models.tex
index 90b922b..b0813d0 100644
--- a/Models.tex
+++ b/Models.tex
...
Although the above setup leads to a fully deterministic simulation, we want to attack the problems using reinforcement learning algorithms.
The state space, as is, is infinite, and the physics are complicated.
-Therefore, it is important to simplify things instead of modeling the world directly.
-Learning a dog how to reach a target is trivial, and even moving one sheep to a targes should not be too difficult.
-Things get really interested when there are multiple sheep and dogs, and when dogs try to cooperate.
-We propose to, at the very least, try to use two dogs and see whether they can work together without communication. Hence, both dogs are controlled by their own AI, and they take their own decisions.
-However, they are aware of the other dog's position and can use that to take actions.
+Especially with multiple sheep, modeling the above as an MDP is problematic, as the new state is difficult to compute given an action.
+Instead, we prefer to featurize the state space and work with these approximations.
+Note the difference between this approach and methods where features are used to generalize effects of transitions.
+In some way, all we have is the featurized state.
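
As an illustration of what a featurized state could look like, here is a minimal sketch. The commit does not specify the features, so the function name `featurize` and the chosen features (herd-to-target distance, herd spread, per-dog distance to the herd) are assumptions made for this example only.

```python
import math


def featurize(dogs, sheep, target):
    """Map raw 2D positions to a short feature tuple.

    Hypothetical features, not taken from Models.tex:
      - distance from the sheep centroid to the target
      - herd spread (largest sheep-to-centroid distance)
      - distance from each dog to the sheep centroid, in input order
    """
    # Centroid of the herd
    cx = sum(x for x, _ in sheep) / len(sheep)
    cy = sum(y for _, y in sheep) / len(sheep)

    herd_to_target = math.hypot(target[0] - cx, target[1] - cy)
    spread = max(math.hypot(x - cx, y - cy) for x, y in sheep)
    dog_dists = [math.hypot(x - cx, y - cy) for x, y in dogs]

    return (herd_to_target, spread, *dog_dists)


if __name__ == "__main__":
    # Two dogs, two sheep, one target
    print(featurize([(1, -4), (1, 2)], [(0, 0), (2, 0)], (1, 3)))
```

With a fixed number of dogs the tuple has a fixed length regardless of how many sheep there are, which is one way the infinite raw state could be reduced to something a learning algorithm can work with.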