Non-stationary policy learning in 2-player zero sum games

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Abstract

A key challenge in multiagent environments is the construction of agents that are able to learn while acting in the presence of other agents that are simultaneously learning and adapting. These domains require on-line learning methods without the benefit of repeated training examples, as well as the ability to adapt to the evolving behavior of other agents in the environment. The difficulty is further exacerbated when the agents are in an adversarial relationship, demanding that a robust (i.e., winning) non-stationary policy be rapidly learned and adapted. We propose an on-line sequence learning algorithm, ELPH, based on a straightforward entropy pruning technique that is able to rapidly learn and adapt to non-stationary policies. We demonstrate the performance of this method in a non-stationary learning environment of adversarial zero-sum matrix games.
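The abstract does not describe ELPH's internals, but the idea it names (an on-line sequence learner that prunes high-entropy hypotheses) can be sketched roughly as follows. This Python sketch is illustrative only, not the authors' implementation: the class name, the subset-of-window hypothesis space, and the window and threshold values are all assumptions made here for concreteness.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations


class EntropyPrunedPredictor:
    """Sketch of an entropy-pruned on-line sequence learner.

    Every non-empty subset of the recent history window is a
    hypothesis; each hypothesis keeps a histogram of the symbols
    that followed it. Predictions come from the lowest-entropy
    matching hypothesis, and hypotheses whose histograms grow too
    uncertain (entropy above a threshold) are pruned.
    """

    def __init__(self, window=4, entropy_threshold=1.0):
        # window and entropy_threshold are illustrative guesses,
        # not values taken from the paper.
        self.window = window
        self.entropy_threshold = entropy_threshold
        self.history = []                       # most recent observations
        self.hypotheses = defaultdict(Counter)  # pattern -> next-symbol counts

    @staticmethod
    def _entropy(counts):
        # Shannon entropy (in bits) of a histogram of next-symbol counts.
        total = sum(counts.values())
        return -sum((c / total) * math.log2(c / total) for c in counts.values())

    def _patterns(self):
        # Enumerate hypotheses: every non-empty subset of the window,
        # keyed by how many steps ago each symbol was seen, so that
        # "opponent played R two steps ago" is its own hypothesis.
        indexed = list(enumerate(reversed(self.history)))
        for size in range(1, len(indexed) + 1):
            yield from combinations(indexed, size)

    def observe(self, symbol):
        # Credit the new symbol to every hypothesis matching the current
        # window, then prune hypotheses that have become unpredictable.
        for pattern in self._patterns():
            histogram = self.hypotheses[pattern]
            histogram[symbol] += 1
            if self._entropy(histogram) > self.entropy_threshold:
                del self.hypotheses[pattern]
        self.history = (self.history + [symbol])[-self.window:]

    def predict(self):
        # Trust the matching hypothesis whose next-symbol distribution
        # has the lowest entropy; return its most frequent successor.
        best_symbol, best_entropy = None, float("inf")
        for pattern in self._patterns():
            histogram = self.hypotheses.get(pattern)
            if histogram:
                h = self._entropy(histogram)
                if h < best_entropy:
                    best_symbol = histogram.most_common(1)[0][0]
                    best_entropy = h
        return best_symbol
```

As a usage illustration (again assuming the hypothetical names above), such a predictor could be pitted against a patterned opponent in a zero-sum matrix game like rock-paper-scissors:

```python
# Hypothetical usage: exploiting a repeating rock-paper-scissors policy.
beats = {"R": "P", "P": "S", "S": "R"}
learner = EntropyPrunedPredictor()
for opponent_move in "RRPRRPRRP" * 5:    # a patterned, non-random opponent
    prediction = learner.predict()       # guess the opponent's next move
    response = beats.get(prediction)     # counter it once predictions exist
    learner.observe(opponent_move)       # update hypotheses with the truth
```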

Original language: English (US)
Title of host publication: Proceedings of the 20th National Conference on Artificial Intelligence and the 17th Innovative Applications of Artificial Intelligence Conference, AAAI-05/IAAI-05
Pages: 789-794
Number of pages: 6
Volume: 2
State: Published - Dec 1 2005
Event: 20th National Conference on Artificial Intelligence and the 17th Innovative Applications of Artificial Intelligence Conference, AAAI-05/IAAI-05 - Pittsburgh, PA, United States
Duration: Jul 9 2005 - Jul 13 2005

Other

Other: 20th National Conference on Artificial Intelligence and the 17th Innovative Applications of Artificial Intelligence Conference, AAAI-05/IAAI-05
Country/Territory: United States
City: Pittsburgh, PA
Period: 7/9/05 - 7/13/05
