A kernel-based approach to non-stationary reinforcement learning in metric spaces.

Authors

DOMINGUES Omar
MENARD Pierre
PIROTTA Matteo
KAUFMANN Emilie
VALKO Michal

Publication date

2021

Publication type

Proceedings Article

Summary

In this work, we propose KeRNS, an algorithm for episodic reinforcement learning in non-stationary Markov Decision Processes (MDPs) whose state-action set is endowed with a metric. Using a non-parametric model of the MDP built with time-dependent kernels, we prove a regret bound that scales with the covering dimension of the state-action space and with the total variation of the MDP over time, which quantifies its level of non-stationarity. Our method generalizes previous approaches to changing environments that are based on sliding windows and exponential discounting. We further propose a practical implementation of KeRNS, analyze its regret, and validate it experimentally.
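The idea of a time-dependent kernel can be illustrated with a minimal sketch. This is not the authors' implementation: the Gaussian spatial kernel, the one-dimensional state-action points, the regularizer `beta`, and all function names below are assumptions chosen for illustration. The sketch weights each past sample by its metric distance to a query point and by its age, and shows that a sliding window and exponential discounting are two choices of the same temporal weight.

```python
import math

# Hypothetical helper names; none of these come from the publication.

def spatial_kernel(dist, bandwidth):
    """Gaussian kernel on a metric distance (an assumed, common choice)."""
    return math.exp(-((dist / bandwidth) ** 2) / 2.0)

def sliding_window_weight(window):
    """Temporal weight: 1 for samples younger than `window` episodes, else 0."""
    return lambda age: 1.0 if 0 <= age < window else 0.0

def exponential_discount_weight(eta):
    """Temporal weight: eta**age, with 0 < eta <= 1 (eta = 1 never forgets)."""
    return lambda age: eta ** age if age >= 0 else 0.0

def kernel_reward_estimate(samples, query, now, bandwidth, temporal_weight, beta=0.0):
    """Kernel-weighted average of past rewards at `query`.

    samples: list of (episode, point, reward) with points on the real line,
             so the metric is simply |point - query| (an assumption).
    beta:    optional regularizer added to the denominator.
    """
    num, den = 0.0, beta
    for episode, point, reward in samples:
        w = spatial_kernel(abs(point - query), bandwidth) * temporal_weight(now - episode)
        num += w * reward
        den += w
    return num / den if den > 0 else 0.0

# Toy non-stationary data: the reward at point 0.5 jumps from 0 to 1 at episode 50.
samples = [(ep, 0.5, 0.0 if ep < 50 else 1.0) for ep in range(100)]

no_forgetting = kernel_reward_estimate(samples, 0.5, 100, 0.1, exponential_discount_weight(1.0))
windowed = kernel_reward_estimate(samples, 0.5, 100, 0.1, sliding_window_weight(10))
discounted = kernel_reward_estimate(samples, 0.5, 100, 0.1, exponential_discount_weight(0.9))
```

In this toy setting, the estimate without forgetting averages over both regimes and returns 0.5, while the windowed estimate returns 1.0 and the discounted estimate lies close to 1, both tracking the current reward.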

