Abstract - arXiv

Offline Reinforcement Learning as One Big Sequence Modeling Problem

Michael Janner, Qiyang Li, Sergey Levine
University of California at Berkeley

Reinforcement learning (RL) is typically concerned with estimating stationary policies or single-step models, leveraging the Markov property to factorize problems in time. However, we can also view RL as a generic sequence modeling problem, with the goal being to produce a sequence of actions that leads to a sequence of high rewards. Viewed in this way, it is tempting to consider whether high-capacity sequence prediction models that work well in other domains, such as natural-language processing, can also provide effective solutions to the RL problem.
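To make the sequence view concrete, the sketch below lays a trajectory out as one flat sequence and builds next-element prediction pairs from it. The layout (state, then action, then reward at each step), the helper names `flatten_trajectory` and `next_element_pairs`, and the toy data are all assumptions made for illustration; the excerpt above does not specify any of them.

```python
from typing import List, Sequence, Tuple

# One environment step: (state vector, action vector, scalar reward).
Step = Tuple[Sequence[float], Sequence[float], float]


def flatten_trajectory(steps: Sequence[Step]) -> List[float]:
    """Lay out a trajectory as one flat sequence: s_1, a_1, r_1, s_2, a_2, r_2, ...

    The ordering is an assumed convention for this sketch.
    """
    sequence: List[float] = []
    for state, action, reward in steps:
        sequence.extend(state)
        sequence.extend(action)
        sequence.append(reward)
    return sequence


def next_element_pairs(sequence: Sequence[float]) -> List[Tuple[List[float], float]]:
    """Build (prefix, next element) pairs, the usual autoregressive training target."""
    return [(list(sequence[:i]), sequence[i]) for i in range(1, len(sequence))]


# Hypothetical two-step trajectory with 2-D states and 1-D actions.
trajectory = [
    ([0.0, 1.0], [0.5], 1.0),
    ([0.1, 0.9], [-0.5], 0.0),
]
flat = flatten_trajectory(trajectory)
pairs = next_element_pairs(flat)
```

Any sequence model that predicts the next element from a prefix could in principle be trained on such pairs. States, actions, and rewards are treated uniformly here, with no per-step Markov factorization.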

... learning, goal-conditioned RL, and offline RL. Further, we show that this approach can be combined with existing model-free algorithms to yield a state-of-the-art planner in sparse-reward, long-horizon tasks.

1 Introduction

The standard treatment of reinforcement learning relies on decomposing a long-horizon problem into smaller, more local ...
