Giving advice about preferred actions to reinforcement learners via knowledge-based kernel regression

Richard Maclin, Jude Shavlik, Lisa Torrey, Trevor Walker, Edward Wild

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

55 Scopus citations

Abstract

We present a novel formulation for providing advice to a reinforcement learner that employs support-vector regression as its function approximator. Our new method extends a recent advice-giving technique, called Knowledge-Based Kernel Regression (KBKR), that accepts advice concerning a single action of a reinforcement learner. In KBKR, users can say that in some set of states, an action's value should be greater than some linear expression of the current state. In our new technique, which we call Preference KBKR (Pref-KBKR), the user can provide advice in a more natural manner by recommending that some action is preferred over another in the specified set of states. Specifying preferences essentially means that users are giving advice about policies rather than Q values, which is a more natural way for humans to present advice. We present the motivation for preference advice and a proof of the correctness of our extension to KBKR. In addition, we show empirical results that our method can make effective use of advice on a novel reinforcement-learning task, based on the RoboCup simulator, which we call Breakaway. Our work demonstrates the significant potential of advice-giving techniques for addressing complex reinforcement learning problems, while further demonstrating the use of support-vector regression for reinforcement learning.
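To make the difference between value advice and preference advice concrete, the following is a minimal sketch, not the authors' formulation. It assumes, purely for illustration, that each action has a linear Q model and that the advised set of states is described by linear inequalities B x <= d. The names `LinearQ`, `PreferenceAdvice`, `violation`, the action names, and the two-feature state are hypothetical. The abstract describes advice being built into the support-vector regression itself; this sketch only measures how far a given pair of models is from satisfying one piece of preference advice at a sampled state.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence

State = Sequence[float]


@dataclass
class LinearQ:
    """Hypothetical linear Q model for one action: Q(x) = w . x + b."""
    weights: List[float]
    bias: float

    def value(self, x: State) -> float:
        return sum(w * xi for w, xi in zip(self.weights, x)) + self.bias


@dataclass
class PreferenceAdvice:
    """Advice: in states where B x <= d (row-wise), the value of
    `preferred` should exceed the value of `other` by at least `margin`.
    The linear-inequality description of the state set is an assumption."""
    B: List[List[float]]
    d: List[float]
    preferred: str
    other: str
    margin: float = 0.0

    def applies_to(self, x: State) -> bool:
        # The advice applies only if every inequality row holds at x.
        return all(
            sum(b * xi for b, xi in zip(row, x)) <= limit
            for row, limit in zip(self.B, self.d)
        )

    def violation(self, models: Dict[str, LinearQ], x: State) -> float:
        """Shortfall of the preference at x; 0.0 if satisfied or not applicable."""
        if not self.applies_to(x):
            return 0.0
        gap = models[self.preferred].value(x) - models[self.other].value(x)
        return max(0.0, self.margin - gap)


# Hypothetical state: (distance_to_goal, angle_to_goal).
models = {
    "shoot": LinearQ(weights=[-0.5, 0.0], bias=4.0),
    "pass": LinearQ(weights=[0.0, 0.0], bias=2.0),
}

# "When distance_to_goal <= 10, prefer shoot over pass by at least 1."
advice = PreferenceAdvice(B=[[1.0, 0.0]], d=[10.0],
                          preferred="shoot", other="pass", margin=1.0)

for state in [(2.0, 0.0), (4.0, 0.0), (20.0, 0.0)]:
    print(state, advice.applies_to(state), advice.violation(models, state))
```

The advice never mentions a target Q value for either action, only the ordering between them, which is the sense in which preference advice speaks about policies rather than Q values. A learner could, under these assumptions, penalize nonzero `violation` terms during training; how the penalty enters the regression is left to the paper itself.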

Original language: English (US)
Title of host publication: Proceedings of the 20th National Conference on Artificial Intelligence and the 17th Innovative Applications of Artificial Intelligence Conference, AAAI-05/IAAI-05
Pages: 819-824
Number of pages: 6
Volume: 2
State: Published - Dec 1 2005
Event: 20th National Conference on Artificial Intelligence and the 17th Innovative Applications of Artificial Intelligence Conference, AAAI-05/IAAI-05 - Pittsburgh, PA, United States
Duration: Jul 9 2005 → Jul 13 2005

Other

Other: 20th National Conference on Artificial Intelligence and the 17th Innovative Applications of Artificial Intelligence Conference, AAAI-05/IAAI-05
Country: United States
City: Pittsburgh, PA
Period: 7/9/05 → 7/13/05
