Recurrent Soft Actor Critic Reinforcement Learning For Demand Response Problems
Demand response problems are typically solved with rule-based or model predictive control solutions. While rule-based controls do not account for the future development of uncertain variables, model predictive control solutions often require highly accurate predictions. Deep reinforcement learning methods, which incorporate the inaccuracy of predictions into their decision making, have therefore become popular for solving demand response problems. To implement them, current literature defines the demand response problem as a fully observable Markov decision process. In reality, however, the assumption of full observability is usually not satisfied. An alternative is to describe the problem as partially observable and to use recurrence in the policy function. In this paper, we adopt this idea and propose a novel deep reinforcement learning control solution for demand response problems based on the soft actor critic framework. Controlling a heat pump with a mixed discrete-continuous action space, we show that a recurrent policy achieves a significant performance improvement over a non-recurrent policy function.
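To make the core idea concrete, the following minimal sketch (in PyTorch) illustrates a recurrent actor for a partially observable control problem with a hybrid action space: one discrete head (e.g., a heat-pump operating mode) and one continuous head (e.g., a modulation level). This is not the authors' implementation; the class name, network sizes, and action dimensions are illustrative assumptions, and the tanh squashing of the continuous action used in standard soft actor critic is omitted for brevity.

    # A minimal sketch, not the paper's implementation: a recurrent actor
    # whose LSTM summarizes the observation history, standing in for the
    # unobserved state of the partially observable MDP.
    import torch
    import torch.nn as nn

    class RecurrentHybridActor(nn.Module):
        def __init__(self, obs_dim, hidden_dim, n_discrete, cont_dim):
            super().__init__()
            self.lstm = nn.LSTM(obs_dim, hidden_dim, batch_first=True)
            self.discrete_logits = nn.Linear(hidden_dim, n_discrete)  # categorical head
            self.cont_mean = nn.Linear(hidden_dim, cont_dim)          # Gaussian head
            self.cont_log_std = nn.Linear(hidden_dim, cont_dim)

        def forward(self, obs_seq, hidden=None):
            # obs_seq: (batch, time, obs_dim) observation history
            out, hidden = self.lstm(obs_seq, hidden)
            h = out[:, -1]  # features for the most recent time step
            logits = self.discrete_logits(h)
            mean = self.cont_mean(h)
            log_std = self.cont_log_std(h).clamp(-20, 2)  # common SAC std bounds
            return logits, mean, log_std, hidden

    # Sampling both action components; the dimensions below are assumptions.
    actor = RecurrentHybridActor(obs_dim=8, hidden_dim=64, n_discrete=2, cont_dim=1)
    obs_seq = torch.randn(1, 24, 8)  # e.g., 24 past observations
    logits, mean, log_std, _ = actor(obs_seq)
    mode = torch.distributions.Categorical(logits=logits).sample()
    power = torch.distributions.Normal(mean, log_std.exp()).sample()

A non-recurrent baseline would replace the LSTM with a feed-forward layer over the most recent observation only; the comparison between these two policy classes is what the paper evaluates.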