
Reinforcement Learning 101 – A two-minute read

Introduction

Reinforcement learning (RL) is a subfield of artificial intelligence (AI) in which an agent learns which actions to take from the numeric rewards it receives from its environment. As an example, suppose an agent is learning to play chess. To learn how to play, the agent needs feedback after each move on whether that move was a good one or a bad one.

An example of a good move is one in which the agent captures one of its opponent's pieces without losing any of its own. This kind of feedback is called reinforcement, hence the name reinforcement learning: the agent learns from such reinforcements.

In reinforcement learning, the agent explores a state space, has a set of actions available in each state, and receives a reward associated with the next state it lands in. The goal of reinforcement learning is to learn an optimal or near-optimal policy, that is, a mapping from states to actions, based on the rewards received. Figure 1 provides an overview of reinforcement learning.

Figure 1: Overview of Reinforcement Learning
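
To make the state, action, and reward loop concrete, here is a minimal sketch in Python. The five-state corridor, the step and run_episode helpers, and the random policy are all hypothetical, invented for illustration only; they are not taken from any particular library.

```python
import random

N_STATES = 5          # hypothetical corridor: states 0..4, goal at state 4
ACTIONS = (-1, +1)    # move left or move right


def step(state, action):
    """Return (next_state, reward, done) for the toy corridor."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0  # the reward is tied to the state landed in
    return next_state, reward, done


def run_episode(policy, rng, max_steps=100):
    """Run one episode and return the visited (state, action, reward) triples."""
    state, trajectory = 0, []
    for _ in range(max_steps):
        action = policy(state, rng)
        next_state, reward, done = step(state, action)
        trajectory.append((state, action, reward))
        state = next_state
        if done:
            break
    return trajectory


def random_policy(state, rng):
    """A policy maps a state to an action; this one ignores the state."""
    return rng.choice(ACTIONS)


trajectory = run_episode(random_policy, random.Random(0))
total_reward = sum(reward for _, _, reward in trajectory)
```

Here the policy is just a function from a state to an action. Learning, in this picture, means replacing random_policy with one that collects more reward.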

Types of Reinforcement Learning

At the highest level, there are two types of reinforcement learning: passive learning and active learning.

In passive learning, the agent's policy is fixed, so the action taken in each state is predetermined, and the agent's goal is to determine the utility of being in each state.
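
As a rough illustration of passive learning, the sketch below estimates state utilities under a fixed policy using temporal-difference (TD) updates. TD learning is one common technique for this; the article does not prescribe a method, so treat the choice as an assumption. The corridor environment, the always-move-right policy, and the values of GAMMA and ALPHA are likewise hypothetical.

```python
GAMMA = 0.9   # discount factor (assumed value)
ALPHA = 0.5   # learning rate (assumed value)
GOAL = 4      # hypothetical corridor with states 0..4


def step(state, action):
    """Return (next_state, reward, done) for the toy corridor."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL


def fixed_policy(state):
    """The policy is given, not learned: always move right."""
    return +1


def td_policy_evaluation(episodes=200):
    """Estimate the utility of each state under fixed_policy."""
    utilities = [0.0] * (GOAL + 1)  # the terminal state keeps utility 0
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            next_state, reward, done = step(state, fixed_policy(state))
            # Nudge the estimate toward reward + discounted next utility.
            target = reward + GAMMA * utilities[next_state]
            utilities[state] += ALPHA * (target - utilities[state])
            state = next_state
    return utilities


utilities = td_policy_evaluation()
```

Under these assumptions the estimates approach 0.729, 0.81, 0.9, and 1.0 for states 0 through 3, so states closer to the goal are worth more. The agent never changes what it does; it only learns how good each state is.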

Active learning is the more challenging approach, in which the agent must also learn what to do. Because the agent is learning the best policy to adopt, it must explore its environment.
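
A minimal sketch of active learning follows, using Q-learning with epsilon-greedy exploration. Q-learning is one well-known active method and is an assumption here, since the article does not name an algorithm. The corridor environment, the hyperparameter values, and the helper names are all hypothetical.

```python
import random

GAMMA, ALPHA, EPSILON = 0.9, 0.5, 0.2  # assumed hyperparameters
GOAL = 4                                # hypothetical corridor, states 0..4
ACTIONS = (-1, +1)                      # move left, move right


def step(state, action):
    """Return (next_state, reward, done) for the toy corridor."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL


def greedy_action(q, state, rng):
    """Pick the index of the highest-valued action, breaking ties at random."""
    best = max(q[state])
    return rng.choice([i for i, v in enumerate(q[state]) if v == best])


def q_learning(episodes=500, seed=0, max_steps=1000):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(GOAL + 1)]  # q[state][action index]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            # Explore with probability EPSILON, otherwise exploit.
            if rng.random() < EPSILON:
                a = rng.randrange(len(ACTIONS))
            else:
                a = greedy_action(q, state, rng)
            next_state, reward, done = step(state, ACTIONS[a])
            target = reward + (0.0 if done else GAMMA * max(q[next_state]))
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
            if done:
                break
    return q


q = q_learning()
# Read the learned policy off the table: the best action in each state.
learned_policy = [ACTIONS[q[s].index(max(q[s]))] for s in range(GOAL)]
```

Unlike the passive case, no policy is supplied. The agent occasionally tries a random action, and that exploration is what lets it discover that moving right leads to the reward.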