Bachelor's thesis Johannes Kern


Indoor Comfort and Energy Optimization in Smart Buildings with Deep Reinforcement Learning

Figure: Reinforcement Learning. Copyright: EBC

The automated evaluation of data from complex building energy systems is becoming increasingly important in the course of the energy transition, because it helps to ensure efficiency in this sector. This Bachelor's thesis presents a reinforcement learning algorithm that can control the ventilation system of a room on the basis of recorded data. In addition to indoor comfort, the energy consumption of the ventilation system is taken into account. The aim of the work is to develop evaluation functions for indoor comfort and for the energy consumption of the ventilation system for such an algorithm, and to draw conclusions about the algorithm's behaviour from time series data. Indoor comfort is evaluated in terms of thermal comfort, described by the room air temperature and the relative humidity, and indoor air quality, described by the CO2 and VOC (volatile organic compound) concentrations. The energy consumption of the ventilation system is approximated by the supply air volume flow, and the energy consumption of the heating and cooling system is evaluated on the basis of its part-load demand and the cooling water outlet temperature of the cooling system. The reinforcement learning algorithm used is Deep Deterministic Policy Gradient (DDPG), which considers all recorded data points separately. For this reason, a separate reward function is developed for each parameter considered, so that the condition of a room can be evaluated. The resulting reward functions are tested on a meeting room in an office building complex in Munich. The examples are a particularly hot summer week from 22 to 28 July 2019 and a cold winter week from 26 November to 3 December 2018. From the course of the rewards assigned to the conditions in these weeks, conclusions can be drawn as to how a reinforcement learning algorithm could learn and control.
It is noticeable that the reward functions follow a clearly more predictable course in the winter week than in the summer week, which is probably mainly due to incorrect user behaviour such as opening windows. In the long term, the reinforcement learning algorithm should be adapted and extended so that it can be used for an entire building in addition to individual room control.
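
To make the idea of one reward function per parameter more concrete, the following Python sketch shows how such functions might look. It is not the formulation used in the thesis: the comfort bands, the linear penalty shape, the scaling constants and all names (`Band`, `band_reward`, `energy_reward`, `rewards`, the sample keys) are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Band:
    """Hypothetical comfort band: no penalty between low and high."""
    low: float
    high: float
    scale: float  # deviation from the band at which the penalty reaches -1


def band_reward(value: float, band: Band) -> float:
    """Return 0 inside the band, otherwise a linear penalty clipped at -1."""
    if value < band.low:
        deviation = band.low - value
    elif value > band.high:
        deviation = value - band.high
    else:
        return 0.0
    return -min(deviation / band.scale, 1.0)


# Assumed limits for illustration only; the thesis does not state these values.
BANDS = {
    "temperature_c": Band(low=21.0, high=25.0, scale=5.0),
    "humidity_pct": Band(low=30.0, high=60.0, scale=30.0),
    "co2_ppm": Band(low=0.0, high=1000.0, scale=1000.0),
    "voc_ppb": Band(low=0.0, high=300.0, scale=300.0),
}


def energy_reward(supply_air_flow: float, max_flow: float) -> float:
    """Penalise ventilation energy, approximated by the supply air volume flow."""
    fraction = min(max(supply_air_flow / max_flow, 0.0), 1.0)
    return -fraction


def rewards(sample: dict, max_flow: float) -> dict:
    """Evaluate every parameter of one recorded data point separately."""
    result = {name: band_reward(sample[name], band) for name, band in BANDS.items()}
    result["energy"] = energy_reward(sample["supply_air_flow_m3h"], max_flow)
    return result


if __name__ == "__main__":
    # Made-up data point, not taken from the Munich measurements.
    sample = {
        "temperature_c": 27.5,
        "humidity_pct": 45.0,
        "co2_ppm": 1500.0,
        "voc_ppb": 120.0,
        "supply_air_flow_m3h": 250.0,
    }
    for name, value in rewards(sample, max_flow=500.0).items():
        print(f"{name}: {value:+.2f}")
```

Keeping the rewards separate, rather than summing them immediately, mirrors the abstract's point that each parameter is evaluated on its own. Plotting the individual values over a week of time series data would then show which parameter drives the penalty at a given time.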