MDP Definition
Terminology and Notation
Let's begin with the Markov Decision Process (MDP).
At time $t$ we see observation $o_t$ and choose action $a_t$; the latent state then transitions to $s_{t+1}$, and we receive reward $r(s_t, a_t)$ from the environment.
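To make this interaction loop concrete, here is a minimal sketch of one episode. The `env` object with a Gym-like `reset()`/`step()` interface and the `policy` callable are hypothetical; they only illustrate the flow of observations, actions, and rewards described above.

```python
def run_episode(env, policy, horizon=100):
    """Run one episode with a hypothetical Gym-like environment."""
    o = env.reset()                  # initial observation o_1
    total_reward = 0.0
    for t in range(horizon):
        a = policy(o)                # choose action a_t after seeing o_t
        o, r, done = env.step(a)     # latent state moves to s_{t+1};
                                     # we only see the next observation and reward
        total_reward += r
        if done:
            break
    return total_reward
```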
Definitions

Fully Observed

Markov decision process $\mathcal{M} = \{\mathcal{S}, \mathcal{A}, \mathcal{T}, r\}$

$\mathcal{S}$: state space; states $s \in \mathcal{S}$ (discrete or continuous)
$\mathcal{A}$: action space; actions $a \in \mathcal{A}$ (discrete or continuous)
$\mathcal{T}$: transition operator, a tensor; $\mathcal{T}_{i,j,k} = p(s_{t+1} = i \mid s_t = j, a_t = k)$
$r$: reward function; $r: \mathcal{S} \times \mathcal{A} \rightarrow \mathbb{R}$
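As a sketch of how these objects can be written down in code, the following builds a small tabular MDP in NumPy; the state/action counts, transition probabilities, and rewards are made-up numbers for illustration only.

```python
import numpy as np

# Toy fully observed MDP with |S| = 3 states and |A| = 2 actions (illustrative).
n_states, n_actions = 3, 2

# Transition operator T as a tensor: T[i, j, k] = p(s_{t+1} = i | s_t = j, a_t = k).
T = np.zeros((n_states, n_states, n_actions))
T[:, 0, 0] = [0.9, 0.1, 0.0]
T[:, 0, 1] = [0.0, 0.8, 0.2]
T[:, 1, 0] = [0.5, 0.5, 0.0]
T[:, 1, 1] = [0.0, 0.1, 0.9]
T[:, 2, 0] = [0.0, 0.0, 1.0]
T[:, 2, 1] = [1.0, 0.0, 0.0]
assert np.allclose(T.sum(axis=0), 1.0)  # each (s_t, a_t) column is a distribution over s_{t+1}

# Reward function r: S x A -> R, stored as a table r[s, a].
r = np.array([[0.0, 1.0],
              [0.5, 0.0],
              [0.0, 2.0]])

rng = np.random.default_rng(0)
s, a = 1, 1
s_next = rng.choice(n_states, p=T[:, s, a])  # sample s_{t+1} ~ p(. | s_t, a_t)
print(s_next, r[s, a])
```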
Partially Observed

Partially observed Markov decision process $\mathcal{M} = \{\mathcal{S}, \mathcal{A}, \mathcal{O}, \mathcal{T}, \mathcal{E}, r\}$

$\mathcal{S}$: state space; states $s \in \mathcal{S}$ (discrete or continuous)
$\mathcal{A}$: action space; actions $a \in \mathcal{A}$ (discrete or continuous)
$\mathcal{O}$: observation space; observations $o \in \mathcal{O}$ (discrete or continuous)
$\mathcal{T}$: transition operator, a tensor; $\mathcal{T}_{i,j,k} = p(s_{t+1} = i \mid s_t = j, a_t = k)$
$\mathcal{E}$: emission probability $p(o_t \mid s_t)$
$r$: reward function; $r: \mathcal{S} \times \mathcal{A} \rightarrow \mathbb{R}$
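The only new ingredient relative to the fully observed MDP is the emission probability, so a sketch only needs an emission matrix on top of the toy MDP above; the numbers are again illustrative assumptions.

```python
import numpy as np

# Toy emission probability for a POMDP: E[o, s] = p(o_t = o | s_t = s).
# The agent never sees s_t directly, only an observation o_t drawn from E.
n_states, n_obs = 3, 2
E = np.array([[0.9, 0.2, 0.5],
              [0.1, 0.8, 0.5]])
assert np.allclose(E.sum(axis=0), 1.0)  # each state's column is a distribution over o_t

rng = np.random.default_rng(0)
s = 1                               # latent state, hidden from the agent
o = rng.choice(n_obs, p=E[:, s])    # the agent only receives o_t ~ p(o_t | s_t)
print(o)
```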