Assignment 6: Reinforcement Learning

Spring 2004

CMPT 882

Instructor: Oliver Schulte

Due: March 25, 2004

Topic: Reinforcement Learning.

Paper and Pencil Exercise: In Mitchell, do Ex. 13.2 (a) and (c) (not b).

Reinforcement Learning Implementations

I haven't had much luck finding reinforcement learning software, especially Java demos. The University of Massachusetts has a page with various RL software. The pole-balancing simulator should be nice (C code available from this page). Two Java applets worth checking out are the following.

A robotics problem, where a robot has to learn to coordinate its "limbs" so that its body moves forward. You should read the documentation too, which relates the general concepts of reinforcement learning to this particular task.
Check out the program that learns to play blackjack. A great thing about this demo is that you can play yourself with the program. Playing a few rounds of blackjack is fun - James Bond does it in every movie - and you can watch the program learn (does it improve? How quickly?).

There are a number of other demos that you might want to take a look at, though they don't allow much experimentation and it's pretty hard to see what is going on. Still, they'll give you a sense of the kind of tasks that people currently use RL for. Tor's Bachelor's Thesis has software that learns to control an inverted pendulum. The graphic state applet shows bugs trying to gather food and bring it home. Finally, we have the successful program that learns to allocate channels to cell phones - I don't find the interface very clear, but it's a great task.

So, in all, for the assignment you should run a reinforcement learner on two tasks. Try to get some sense of how the learning works, i.e., what the states, actions and rewards are.
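
If it helps to see states, actions and rewards spelled out in code, here is a minimal sketch in Python. It is not taken from any of the demos above. The five-cell corridor world and the names step, train and greedy_policy are made up for illustration, and the update is the deterministic Q-learning rule from Mitchell, Ch. 13, with a purely random exploration strategy assumed for simplicity.

```python
import random

# Hypothetical toy task: a corridor of 5 cells. The agent starts somewhere
# to the left and gets a reward only when it steps into the rightmost cell.
N_STATES = 5          # states 0..4; state 4 is the absorbing goal
ACTIONS = (-1, +1)    # actions: move left, move right
GAMMA = 0.9           # discount factor (an assumed value)


def step(state, action):
    """Deterministic environment: return (next_state, reward)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def train(episodes=200, seed=0):
    """Learn a Q-table with the deterministic Q-learning update."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = rng.randrange(N_STATES - 1)      # random non-goal start
        while state != N_STATES - 1:             # episode ends at the goal
            action = rng.choice(ACTIONS)         # explore at random
            next_state, reward = step(state, action)
            # Q(s, a) <- r + gamma * max over a' of Q(s', a')
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] = reward + GAMMA * best_next
            state = next_state
    return q


def greedy_policy(q):
    """Best action in each non-goal state according to the Q-table."""
    return [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    q_table = train()
    print(greedy_policy(q_table))
```

In this toy task the states are the cell indices, the actions are the two moves, and the only reward is for entering the goal cell. After training, the greedy policy should move right in every cell. When you look at the demos, try to identify the same three ingredients.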