Recent advancements in model-free control strategies, such as reinforcement learning (RL), have led to more practical and scalable solutions for building energy system controls. These strategies do not require complex models of building dynamics and rely exclusively on data to learn the control policy. Applications of these techniques in heating, ventilation, and air-conditioning (HVAC) systems are being studied under different operational scenarios, including demand response programs. Conventional (unconstrained) reinforcement learning controllers often address indoor comfort constraints by incorporating a comfort violation penalty in the reward function. While this approach can result in good performance in terms of energy cost, it often leads to significant constraint violations when a small penalty factor is used. Conversely, effective enforcement of constraints can be achieved, but at the cost of degraded economic performance. Hence, a clear trade-off between economic performance and constraint satisfaction poses a challenge to overcome. Motivated by this challenge, this thesis presents a constrained RL-based control strategy for building demand response. The proposed strategy handles the constraints explicitly, avoiding the use of arbitrarily set penalty factors that can significantly impact control performance. To demonstrate its efficacy, simulation tests of the proposed strategy, as well as baseline model predictive control (MPC) and conventional (unconstrained) policy optimization methods, were conducted. The simulation tests showed that the constrained RL strategy achieved utility cost savings of up to 16.1%, similar to the MPC baselines, without requiring any model of the building and with minimal constraint violation. In contrast, the unconstrained RL controllers led to either high utility costs or constraint violations, depending on the penalty factor setting.
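The trade-off described above can be made concrete with a minimal sketch. The first function shows the conventional penalized-reward formulation, where a hand-tuned `penalty_factor` folds comfort violations into the reward; the second shows the Lagrangian-style idea behind constrained RL, where the multiplier is adapted from observed violations rather than fixed in advance. All names, bounds, and values here are illustrative assumptions, not the thesis's actual formulation.

```python
def penalized_reward(energy_cost, temp, t_min=20.0, t_max=24.0,
                     penalty_factor=10.0):
    """Unconstrained RL reward: negative energy cost minus a
    hand-tuned penalty on degrees outside the comfort band.
    A small penalty_factor invites violations; a large one
    sacrifices cost performance."""
    violation = max(t_min - temp, 0.0) + max(temp - t_max, 0.0)
    return -energy_cost - penalty_factor * violation


def update_dual_variable(lmbda, avg_violation, limit=0.0, lr=0.1):
    """Constrained (Lagrangian) RL sketch: instead of fixing the
    penalty, a dual variable grows while the average comfort
    violation exceeds its limit and decays toward zero otherwise,
    so the effective penalty is learned, not hand-set."""
    return max(lmbda + lr * (avg_violation - limit), 0.0)
```

For example, with the hypothetical defaults, a comfortable state at 22 °C incurs only the energy cost, while 25 °C (one degree above the band) adds a penalty of 10; the dual update would instead raise the multiplier gradually only while violations persist.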