Reinforcement Learning

So, the first question that arises is: what is RL, and how is it different from supervised and unsupervised learning? Anyone who owns a pet knows that the best strategy to train a pet is to reward it for desirable behavior and punish it for bad behavior. RL, also called learning with a critic, is a learning paradigm where the agent learns in the same manner. The agent here corresponds to our network (program); it can perform a set of Actions (a), which bring about a change in the State (s) of the environment, and in turn the agent receives a reward or punishment from the environment.

For example, consider the case of training a dog to fetch a ball: here, the dog is our agent, the voluntary muscle movements that the dog makes are the actions, and the ground (including the person and the ball) is the environment; the dog perceives our reaction to its action as a reward when we give it a bone. RL can be defined as a computational approach to goal-directed learning and decision making, from interaction with the environment, under some idealized conditions. The Agent can sense the state of the Environment, and the Agent can perform specific well-defined actions on the Environment. This causes two things: first, the state of the environment changes, and second, a reward is generated (under ideal conditions). This cycle continues, and in theory the agent learns over time to generate rewards more frequently.
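The sense, act, reward cycle described above can be sketched in a few lines of plain Python. This is only an illustrative sketch under stated assumptions, not the book's implementation: `FetchEnv` is a hypothetical five-cell track with the ball in the last cell, `run` is a hypothetical helper, and the learning rule is tabular Q-learning with epsilon-greedy exploration, which the text above does not name. The values of `alpha`, `gamma`, and `epsilon` are arbitrary choices for the example.

```python
import random


class FetchEnv:
    """Hypothetical 1-D world: the agent starts in cell 0, the ball is in the last cell."""

    def __init__(self, size=5):
        self.size = size
        self.state = 0

    def reset(self):
        # Put the agent back at the start and return the state it can sense.
        self.state = 0
        return self.state

    def step(self, action):
        # Action 0 moves left, action 1 moves right; the position is clamped to the track.
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.size - 1)
        done = self.state == self.size - 1
        reward = 1.0 if done else 0.0  # reward only when the ball is reached
        return self.state, reward, done


def run(episodes=200, max_steps=20, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Run the agent-environment loop; return the value table and per-episode rewards."""
    rng = random.Random(seed)
    env = FetchEnv()
    # q[state][action] is the agent's current estimate of how good an action is.
    q = [[0.0, 0.0] for _ in range(env.size)]
    rewards = []
    for _ in range(episodes):
        state = env.reset()
        total = 0.0
        for _ in range(max_steps):
            # Explore at random with probability epsilon (or on a tie), else act greedily.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            # The action changes the state of the environment and generates a reward.
            next_state, reward, done = env.step(action)
            # Q-learning update (an assumed learning rule): move the estimate toward
            # the reward plus the discounted value of the best next action.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            total += reward
            if done:
                break
        rewards.append(total)
    return q, rewards


if __name__ == "__main__":
    q, rewards = run()
    print("mean reward, first 50 episodes:", sum(rewards[:50]) / 50)
    print("mean reward, last 50 episodes:", sum(rewards[-50:]) / 50)
```

Note that the agent is never told which action is correct. It only observes states and rewards, and the comparison of early and late episodes is one simple way to check whether rewards become more frequent as the cycle repeats.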

Unlike supervised learning, the Agent is not presented with any training examples; it does not know what the correct action is.
