Enabling Markovian representations under imperfect information
File(s)
EPOMGs_2021.pdf (170.99 KB)
Accepted version
Author(s)
Belardinelli, Francesco
Leon, Borja G
Malvone, Vadim
Type
Conference Paper
Abstract
Markovian systems are widely used in reinforcement learning (RL) when the successful completion of a task
depends exclusively on the last interaction between an autonomous agent and its environment. Unfortunately,
real-world instructions are typically complex and often better described as non-Markovian. In this paper we
present an extension method that allows solving partially-observable non-Markovian reward decision
processes (PONMRDPs) by solving equivalent Markovian models. This potentially enables state-of-the-art
Markovian techniques, including RL, to find optimal behaviours for problems best described as PONMRDPs.
We provide formal optimality guarantees of our extension methods together with a counterexample
illustrating that naive extensions of existing techniques in fully-observable environments cannot provide
such guarantees.
Editor(s)
Rocha, AP
Steels, L
VandenHerik, J
Date Acceptance
2022-02-01
Citation
ICAART: PROCEEDINGS OF THE 14TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE - VOL 2, pp.450-457
ISSN
2184-433X
Publisher
SCITEPRESS
Start Page
450
End Page
457
Journal / Book Title
ICAART: PROCEEDINGS OF THE 14TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE - VOL 2
Copyright Statement
© 2022 The Author(s). This work is published under CC BY-NC-ND 4.0 International licence.
Identifier
http://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcApp=PARTNER_APP&SrcAuth=LinksAMR&KeyUT=WOS:000774441800041&DestLinkType=FullRecord&DestApp=ALL_WOS&UsrCustomerID=1ba7043ffcc86c417c072aa74d649202
Source
14th International Conference on Agents and Artificial Intelligence (ICAART)
Subjects
Science & Technology
Technology
Computer Science, Artificial Intelligence
Computer Science, Interdisciplinary Applications
Computer Science, Software Engineering
Computer Science, Theory & Methods
Computer Science
Markov Decision Processes
Partial Observability
Extended Partially Observable Decision Process
non-Markovian Rewards
Publication Status
Published
Start Date
2022-02-03
Finish Date
2022-02-05
Coverage Spatial
ELECTR NETWORK