Transcription of Generative Adversarial Imitation Learning
Generative Adversarial Imitation Learning
Jonathan Ho, Stefano Ermon
Stanford University

Consider learning a policy from example expert behavior, without interaction with the expert or access to a reinforcement signal. One approach is to recover the expert's cost function with inverse reinforcement learning, then extract a policy from that cost function with reinforcement learning. This approach is indirect and can be slow. We propose a new general framework for directly extracting a policy from data, as if it were obtained by reinforcement learning following inverse reinforcement learning. We show that a certain instantiation of our framework draws an analogy between imitation learning and generative adversarial networks, from which we derive a model-free imitation learning algorithm that obtains significant performance gains over existing model-free methods in imitating complex behaviors in large, high-dimensional environments.

Introduction
networks [8], a technique from the deep learning community that has led to recent successes in modeling distributions of natural images: our algorithm harnesses generative adversarial training to fit distributions of states and actions defining expert behavior. We test our algorithm in Section 6, where