Transcription of One-Shot Imitation from Observing Humans via Domain-Adaptive Meta-Learning

One-Shot Imitation from Observing Humans via Domain-Adaptive Meta-Learning

Tianhe Yu*, Chelsea Finn*, Annie Xie, Sudeep Dasari, Tianhao Zhang, Pieter Abbeel, Sergey Levine
University of California, Berkeley (* denotes equal contribution)

Abstract: Humans and animals are capable of learning a new behavior by observing others perform the skill just once. We consider the problem of allowing a robot to do the same: learning from raw video pixels of a human, even when there is substantial domain shift in the perspective, environment, and embodiment between the robot and the observed human.

Prior approaches to this problem have hand-specified how human and robot actions correspond and often relied on explicit human pose detection systems. In this work, we present an approach for one-shot learning from a video of a human by using human and robot demonstration data from a variety of previous tasks to build up prior knowledge through meta-learning. Then, combining this prior knowledge and only a single video demonstration from a human, the robot can perform the task that the human demonstrated.

We show experiments on both a PR2 arm and a Sawyer arm, demonstrating that after meta-learning, the robot can learn to place, push, and pick-and-place new objects using just one video of a human performing the manipulation.

I. INTRODUCTION

Demonstrations provide a descriptive medium for specifying robotic tasks. Prior work has shown that robots can acquire a range of complex skills through demonstration, such as table tennis [28], lane following [34], pouring water [31], drawer opening [38], and multi-stage manipulation tasks [62].

However, the most effective methods for robot imitation differ significantly from how humans and animals might imitate behaviors: while robots typically need to receive demonstrations in the form of kinesthetic teaching [32, 1] or teleoperation [8, 35, 62], humans and animals can acquire the gist of a behavior simply by watching someone else. In fact, we can adapt to variations in morphology, context, and task details effortlessly, compensating for whatever domain shift may be present and recovering a skill that we can use in new situations [6].

Additionally, we can do this from a very small number of demonstrations, often only one. How can we endow robots with the same ability to learn behaviors from raw third-person observations of human demonstrators?

Acquiring skills from raw camera observations presents two major challenges. First, the difference in appearance and morphology of the human demonstrator from the robot introduces a systematic domain shift, namely the correspondence problem [29, 6]. Second, learning from raw visual inputs typically requires a substantial amount of data, with modern deep learning vision systems using hundreds of thousands to millions of images [57, 18].

In this paper, we demonstrate that we can begin to address both of these challenges through an approach based on meta-learning. Instead of manually specifying the correspondence between human and robot, which can be particularly complex for skills where different morphologies require different strategies, we propose a data-driven approach.

[Fig. 1: meta-learning with human and robot demonstration data; the robot learns to recognize and push a new object from one video of a human.]

Our approach can acquire new skills from only one video of a human. To enable this, it builds a rich prior over tasks during a meta-training phase, where both human demonstrations and teleoperated demonstrations are available for a variety of other, structurally similar tasks. In essence, the robot learns how to learn from humans using this data. After the meta-training phase, the robot can acquire new skills by combining its learned prior knowledge with one video of a human performing the new task.

The main contribution of this paper is a system for learning robotic manipulation skills from a single video of a human by leveraging large amounts of prior meta-training data, collected for different tasks.
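To make the meta-training phase concrete, below is a minimal sketch of one way such a "learning to learn from humans" loop could be structured. It assumes a MAML-style gradient-based meta-learner in which the inner step adapts the policy using a learned loss computed from the human video (which has no robot action labels), and the outer step is behavioral cloning on the paired robot demonstration. All module names, shapes, and hyperparameters are illustrative placeholders, not the architecture used in the paper.

```python
# Sketch of meta-training on tasks that each pair a human video with a robot demo.
import torch
import torch.nn as nn
import torch.nn.functional as F


class Policy(nn.Module):
    """Maps pre-extracted observation features to a robot action."""
    def __init__(self, obs_dim=64, act_dim=7):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(),
                                 nn.Linear(128, act_dim))

    def forward(self, obs, params=None):
        if params is None:
            return self.net(obs)
        # Functional forward pass with task-adapted parameters
        # (params = [w1, b1, w2, b2], matching self.net's layer order).
        h = F.relu(F.linear(obs, params[0], params[1]))
        return F.linear(h, params[2], params[3])


class AdaptationLoss(nn.Module):
    """Learned inner-loop loss over per-frame features and predicted actions
    on the human video; it must supply the learning signal itself because
    the human video carries no robot action labels."""
    def __init__(self, obs_dim=64, act_dim=7):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim + act_dim, 64),
                                 nn.ReLU(), nn.Linear(64, 1))

    def forward(self, obs, act):
        return self.net(torch.cat([obs, act], dim=-1)).pow(2).mean()


def sample_task(T=20, obs_dim=64, act_dim=7):
    """Placeholder for drawing one meta-training task: per-frame features of a
    human video plus a paired robot demonstration (observations and actions)."""
    return (torch.randn(T, obs_dim), torch.randn(T, obs_dim),
            torch.randn(T, act_dim))


def inner_adapt(policy, adapt_loss, human_obs, lr=0.01):
    """One gradient step on the learned loss, yielding task-adapted parameters."""
    params = list(policy.net.parameters())
    loss = adapt_loss(human_obs, policy(human_obs))
    grads = torch.autograd.grad(loss, params, create_graph=True)
    return [p - lr * g for p, g in zip(params, grads)]


policy, adapt_loss = Policy(), AdaptationLoss()
meta_opt = torch.optim.Adam(
    list(policy.parameters()) + list(adapt_loss.parameters()), lr=1e-3)

for step in range(1000):  # meta-training over many prior tasks
    human_obs, robot_obs, robot_acts = sample_task()
    adapted = inner_adapt(policy, adapt_loss, human_obs)
    # Outer objective: behavioral cloning on the robot demo using the adapted
    # parameters, so gradients also train the learned adaptation loss.
    bc_loss = (policy(robot_obs, params=adapted) - robot_acts).pow(2).mean()
    meta_opt.zero_grad()
    bc_loss.backward()
    meta_opt.step()
```

Because the outer behavioral-cloning gradient flows back through the inner update, the learned adaptation loss is trained precisely so that one inner step on a human video produces parameters that imitate the robot demonstration well.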

When deployed, the robot can adapt to a particular task with novel objects using just a single video of a human performing the task with those objects (e.g., see Figure 1). The video of the human need not be from the same perspective as the robot, or even be in the same scene. The robot is trained using videos of humans performing tasks with various objects along with demonstrations of the robot performing the same task. Our experiments on two real robotic platforms demonstrate the ability to learn directly from RGB videos of humans, and to handle novel objects, novel humans, and videos of humans in novel scenes.
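Continuing the sketch above (and reusing its `policy`, `adapt_loss`, `inner_adapt`, and `sample_task` placeholders), deployment would then look roughly as follows: a single human video of a new task, with no robot demonstration and no action labels, drives one inner adaptation step, and the adapted policy maps the robot's own observations to actions.

```python
# Deployment sketch: one-shot adaptation from a single human video of a NEW task.
new_human_obs, _, _ = sample_task()                      # stands in for real human-video features
deployed_params = inner_adapt(policy, adapt_loss, new_human_obs)
robot_obs_now = torch.randn(1, 64)                       # placeholder robot camera features
action = policy(robot_obs_now, params=deployed_params)   # action for the new task
```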

Videos of the results can be found in the supplementary video, which is available at [ ].

II. RELATED WORK

Most imitation learning and learning from demonstration methods operate at the level of configuration-space trajectories [44, 2], which are typically collected using kinesthetic teaching [32, 1], teleoperation [8, 35, 62], or sensors on the demonstrator [11, 9, 7, 21]. Instead, can we allow robots to imitate just by watching the demonstrator perform the task?

We focus on this problem of learning from one video demonstration of a human performing a task, in combination with human and robot demonstration data collected on other tasks. Prior work has proposed to resolve the correspondence problem by hand, for example, by manually specifying how human grasp poses correspond to robot grasps [20] or by manually defining how human activities or commands translate into robot actions [58, 23, 30, 37, 40]. By utilizing demonstration data of how humans and robots perform each task, our approach learns the correspondence between the human and robot implicitly.

