On model-based online inverse reinforcement learning
Abstract
Based on the premise that the most succinct representation of an entity's behavior is its reward structure, inverse reinforcement learning (IRL) aims to recover the reward (or cost) function by observing an agent perform a task and monitoring the state and control trajectories of the observed agent. In general, it is easier to demonstrate how to perform a task than to describe how to perform it. Autonomous agents can exploit this same principle to develop a mathematical representation, called a reward function, that inherently describes the overall task objective. IRL is thus a process by which machines learn to perform complex tasks by analyzing state and control trajectories. Most IRL research to date has been offline, which restricts its use to repetitive tasks in unchanging environments. Real-time IRL techniques, by allowing an autonomous agent to update its reward function online, would help autonomous entities adapt to changes in the environment by correcting previously inaccurate information, and would allow a more dynamic response to unforeseen alterations in task objectives. In this dissertation, data-driven model-based inverse reinforcement learning techniques are developed that facilitate reward function estimation in real time. The dissertation then builds on that foundation to explore techniques that address sub-optimal trajectories, data sparsity, and partial or imperfect measurements, which are challenges inherent to IRL. Applications are then discussed, including a novel pilot behavior modeling approach.
Collections
- OSU Dissertations