Lecture 14: Reinforcement Learning

Markov Decision Process

- Mathematical formulation of the RL problem
- Markov property: the current state completely characterises the state of the world

Defined by (S, A, R, P):

- S: set of possible states
- A: set of possible actions
- R: distribution of reward given a (state, action) pair
- P: transition probability, i.e. distribution over the next state given a (state, action) pair
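The definition above can be made concrete with a small sketch. The two-state, two-action MDP below is hypothetical and exists only for illustration: the names STATES, ACTIONS, P, R, sample and step, and all the probabilities, are assumptions and do not come from the lecture. Each (state, action) pair maps to a distribution over next states and a distribution over rewards, matching the four components listed in the definition.

```python
import random

# Hypothetical toy MDP (all names and numbers are illustrative assumptions).
STATES = ["s0", "s1"]        # set of possible states
ACTIONS = ["stay", "move"]   # set of possible actions

# Transition probability: (state, action) -> {next_state: probability}
P = {
    ("s0", "stay"): {"s0": 0.9, "s1": 0.1},
    ("s0", "move"): {"s0": 0.2, "s1": 0.8},
    ("s1", "stay"): {"s0": 0.1, "s1": 0.9},
    ("s1", "move"): {"s0": 0.8, "s1": 0.2},
}

# Reward distribution: (state, action) -> {reward: probability}
R = {
    ("s0", "stay"): {0.0: 1.0},
    ("s0", "move"): {0.0: 0.5, 1.0: 0.5},
    ("s1", "stay"): {1.0: 1.0},
    ("s1", "move"): {0.0: 1.0},
}


def sample(dist, rng):
    """Draw one outcome from a {outcome: probability} dictionary."""
    outcomes = list(dist.keys())
    weights = list(dist.values())
    return rng.choices(outcomes, weights=weights, k=1)[0]


def step(state, action, rng):
    """Sample (next_state, reward) using only the current state and action.

    No history is passed in, which is the Markov property in code form.
    """
    reward = sample(R[(state, action)], rng)
    next_state = sample(P[(state, action)], rng)
    return next_state, reward


if __name__ == "__main__":
    rng = random.Random(0)
    state = "s0"
    for _ in range(5):
        action = rng.choice(ACTIONS)  # a uniformly random policy, for demonstration
        next_state, reward = step(state, action, rng)
        print(state, action, reward, next_state)
        state = next_state
```

Because step takes only the current state and action, the sampled next state and reward cannot depend on anything that happened earlier in the trajectory.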
