Saturday, January 11, 2014

Reinforcement learning:
In reinforcement learning a teacher is available, but instead of directly providing the desired action for a given perception, the teacher returns a reward or a punishment to the learner for the action it took in response to that perception.

Examples include a robot in an unknown terrain, where it gets a punishment when it hits
an obstacle and a reward when it moves smoothly.
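
The robot example can be sketched as a feedback function. This is only an illustration: the values +1 and -1 and the name `reward` are assumptions made here, not part of any particular system.

```python
def reward(hit_obstacle: bool) -> float:
    """Teacher's feedback for one step: punish a collision, reward smooth motion.

    The magnitudes (+1.0 / -1.0) are assumed values chosen for illustration.
    """
    return -1.0 if hit_obstacle else 1.0


# A hypothetical run of four steps; True means the robot hit an obstacle.
steps = [False, False, True, False]

# The learner never sees the "correct" action, only the accumulated feedback.
total = sum(reward(hit) for hit in steps)
print(total)
```

Note that the teacher only scores the action that was taken; it never tells the robot which action would have been right.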

In order to design a learning system the designer has to make the following choices
based on the application.

Active Reinforcement learning:

Here, not only is a teacher available, but the learner also has the freedom to ask
the teacher for suitable perception-action example pairs that will help it
improve its performance.
Consider a news recommender system that tries to learn a user's preferences and categorize news articles as interesting or uninteresting to the user.
The system may present a particular article (one it is not sure about) to the user and ask
whether it is interesting or not.
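
One simple way to pick the article "of which it is not sure" is to query the one whose predicted probability of being interesting is closest to 0.5. The sketch below assumes the system already has such probability estimates; the titles, the scores, and the name `most_uncertain` are hypothetical.

```python
from typing import Dict


def most_uncertain(scores: Dict[str, float]) -> str:
    """Return the article whose estimated P(interesting) is closest to 0.5."""
    return min(scores, key=lambda title: abs(scores[title] - 0.5))


# Hypothetical estimates of how likely the user is to find each article interesting.
scores = {
    "election results": 0.92,
    "new phone review": 0.55,
    "local weather": 0.10,
}

query = most_uncertain(scores)
print(f"Is '{query}' interesting to you?")
```

The user's answer to this question becomes a new labelled example for the learner.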

Passive Reinforcement learning:

By instantiating subsets of the variables, we can break loops in the graph. Unfortunately, when the cutset is large, this is very slow. By instantiating only a subset of values of the cutset, we can compute lower bounds on the probabilities of interest. Alternatively, we can sample the cutsets jointly, a technique known as block Gibbs sampling.

Generalization in Reinforcement Learning
Example to generalize Reinforcement Learning

Training Examples:

D : The set of training examples.
D is a set of pairs { (x, c(x)) }, where x is an instance and c is the target concept. Equivalently, c is a subset of the universe
of discourse, that is, of the set of all possible instances.

Example of D:
((red,small,round,humid,low,smooth), poisonous)
((red,small,elongated,humid,low,smooth), poisonous)
((gray,large,elongated,humid,low,rough), not-poisonous)
((red,small,elongated,humid,high,rough), poisonous)

Hypothesis Representation
Any hypothesis h is a function from X to Y, h: X → Y
We will explore the space of conjunctions.

Special symbols:
  ? Any value is acceptable
  0 no value is acceptable

Consider the following hypotheses:
(?,?,?,?,?,?): all mushrooms are poisonous
(0,0,0,0,0,0): no mushroom is poisonous
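
A conjunctive hypothesis of this kind can be evaluated attribute by attribute. The following is a minimal sketch, assuming hypotheses and examples are stored as 6-tuples of strings; the names `ANY`, `NONE` and `h_value` are chosen here for illustration.

```python
ANY = "?"   # any value is acceptable
NONE = "0"  # no value is acceptable


def h_value(h, x):
    """Return 1 if every constraint in h accepts the matching attribute of x, else 0."""
    ok = all(c == ANY or (c != NONE and c == v) for c, v in zip(h, x))
    return 1 if ok else 0


x = ("red", "small", "round", "humid", "low", "smooth")

print(h_value((ANY,) * 6, x))                          # every mushroom is accepted
print(h_value((NONE,) * 6, x))                         # no mushroom is accepted
print(h_value(("red", ANY, ANY, ANY, ANY, ANY), x))    # only the colour is constrained
```

Because the hypothesis is a conjunction, a single failing attribute is enough to reject the example.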

Hypotheses Space:
The space of all hypotheses is represented by H
Let h be a hypothesis in H.
Let x be an example of a mushroom.

If h(x) = 1 then x is poisonous, otherwise x is not-poisonous.
Our goal is to find the hypothesis, h*, that is very “close” to the target concept c.
A hypothesis is said to “cover” those examples it classifies as positive.
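
To make "cover" and "close" concrete, the sketch below checks hypotheses against the four training examples listed earlier. Measuring closeness as the fraction of examples on which h agrees with the label is an assumption made for this illustration, and the names `covers` and `agreement` are hypothetical.

```python
# The training set D from above; True stands for "poisonous".
D = [
    (("red", "small", "round", "humid", "low", "smooth"), True),
    (("red", "small", "elongated", "humid", "low", "smooth"), True),
    (("gray", "large", "elongated", "humid", "low", "rough"), False),
    (("red", "small", "elongated", "humid", "high", "rough"), True),
]


def covers(h, x):
    """True if h classifies x as positive.

    "?" accepts any value; "0" never equals an attribute value in D,
    so a hypothesis containing it covers nothing.
    """
    return all(c == "?" or c == v for c, v in zip(h, x))


def agreement(h, data):
    """Fraction of examples on which h gives the same answer as the label (assumed measure)."""
    return sum(covers(h, x) == label for x, label in data) / len(data)


h_all = ("?", "?", "?", "?", "?", "?")
h_red = ("red", "?", "?", "?", "?", "?")

print(agreement(h_all, D))  # covers all four, but the gray mushroom is not poisonous
print(agreement(h_red, D))  # covers exactly the three poisonous examples
```

On this small set the hypothesis that constrains only the colour agrees with every example, while the all-"?" hypothesis wrongly covers the gray mushroom.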

