dc.contributor.author | Widener, Michael D. | |
dc.date.accessioned | 2014-04-15T18:33:21Z | |
dc.date.available | 2014-04-15T18:33:21Z | |
dc.date.issued | 2010-07-01 | |
dc.identifier.uri | https://hdl.handle.net/11244/8264 | |
dc.description.abstract | This paper evaluates a new off-policy multiagent reinforcement learning algorithm called Soft Friend or Foe. The algorithm modifies the Friend or Foe algorithm [1] by using the correlation in returns between two agents to soften the distinction between friend and foe. The goal is to achieve results similar to the Nash-Q algorithm [3] without its computational complexity and convergence issues. Three multiagent reinforcement learning algorithms are compared on three simple grid world environments: Michael Littman's Friend or Foe algorithm [1], Soft Friend or Foe, and the Q-Learning algorithm [6] adapted to a multiagent environment. Soft Friend or Foe converged faster than the other two algorithms and obtained returns equal to or greater than those of Q-Learning. It also obtained returns as good as those of Friend or Foe in all environments. | |
dc.format | application/pdf | |
dc.language | en_US | |
dc.publisher | Oklahoma State University | |
dc.rights | Copyright is held by the author who has granted the Oklahoma State University Library the non-exclusive right to share this material in its institutional repository. Contact Digital Library Services at lib-dls@okstate.edu or 405-744-9161 for the permission policy on the use, reproduction or distribution of this material. | |
dc.title | Analysis of Soft Friend or Foe Reinforcement Learning Algorithm in Multiagent Environment | |
dc.type | text | |
osu.filename | Widener_okstate_0664M_11024.pdf | |
osu.college | Arts and Sciences | |
osu.accesstype | Open Access | |
dc.description.department | Computer Science Department | |
dc.type.genre | Thesis | |
dc.subject.keywords | artificial | |
dc.subject.keywords | correlate | |
dc.subject.keywords | intelligence | |
dc.subject.keywords | learning | |
dc.subject.keywords | markov | |
dc.subject.keywords | reinforcement | |