Paper Conference

Proceedings of SimBuild Conference 2004: 1st conference of IBPSA-USA


Investigation of Reinforcement Learning for Building Thermal Mass Control

Simeng Liu, Gregor P. Henze
University of Nebraska – Lincoln, Architectural Engineering, 1110 South 67th Street, Peter Kiewit Institute, Omaha, Nebraska 68182-0681 U.S.A.

Abstract: This paper describes a simulation-based investigation of machine-learning control for the supervisory control of building thermal mass. Model-free reinforcement learning control is investigated for the operation of electrically driven chilled water systems in heavy-mass commercial buildings. The reinforcement learning controller learns to precool the building at night, before the onset of occupancy, based on the feedback it receives from past control actions. The learning agent interacts with its environment by commanding the global zone temperature setpoints and extracts cues about the environment solely from the reinforcement feedback it receives, which in this study is the monetary cost of each control action. No prediction or system model is required. Over time, by exploring the environment, the controller builds a statistical summary of plant operation, which it updates continuously as operation continues. The controller learns to account for the time-dependent cost of electricity, the availability of passive thermal storage inventory, and weather conditions. The study shows that learning control is a feasible methodology for finding a near-optimal setpoint profile that exploits the passive building thermal storage capacity. Freedom from a building model makes it especially attractive for real-time control problems, and in theory it can eventually reach the "true" optimum for any building, provided the environment can be sampled for an infinite period of time. The analysis showed that the learning controller's performance is affected by the dimensions of the state and action spaces, the utility rate differential between on- and off-peak periods, the learning rate, and several other factors. Moreover, learning is relatively slow for problems with large state and action spaces.
Pages: 1 - 11
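
The learning loop summarized in the abstract (an agent that chooses setpoint actions, observes only the monetary cost of each action, and gradually builds a table of expected costs) can be illustrated with tabular Q-learning. The sketch below is not the authors' simulation. The building model, prices, loads, storage capacity, and all function names are hypothetical toy assumptions, chosen only to show how a model-free controller might discover night precooling from cost feedback alone.

```python
import random

# Toy assumptions (not from the paper): a 24-hour day, a passive storage
# inventory of 0..3 units with no thermal losses, and three setpoint actions.
HOURS = 24
MAX_STORAGE = 3
ACTIONS = (0, 1, 2)  # 0 = precool (charge), 1 = hold, 2 = float (discharge)


def price(hour):
    """Hypothetical electricity rate: on-peak from 12:00 to 18:00."""
    return 4.0 if 12 <= hour < 18 else 1.0


def load(hour):
    """Hypothetical cooling load in energy units: occupied 8:00 to 18:00."""
    return 2 if 8 <= hour < 18 else 0


def step(hour, storage, action):
    """Apply one setpoint action; return (next_storage, monetary_cost)."""
    energy = load(hour)
    if action == 0 and storage < MAX_STORAGE:
        energy += 1   # extra chiller energy is stored in the building mass
        storage += 1
    elif action == 2 and storage > 0 and energy > 0:
        energy -= 1   # stored cooling offsets part of the load
        storage -= 1
    return storage, price(hour) * energy


def greedy(q, hour, storage):
    """Pick the action with the lowest estimated cost-to-go."""
    return min(ACTIONS, key=lambda a: q[(hour, storage, a)])


def train(episodes=5000, alpha=0.5, epsilon=0.2, seed=0):
    """Undiscounted tabular Q-learning on cost, one episode per day."""
    rng = random.Random(seed)
    q = {(h, s, a): 0.0
         for h in range(HOURS)
         for s in range(MAX_STORAGE + 1)
         for a in ACTIONS}
    for _ in range(episodes):
        storage = 0
        for hour in range(HOURS):
            # Epsilon-greedy exploration of the setpoint actions.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = greedy(q, hour, storage)
            next_storage, cost = step(hour, storage, action)
            # Bootstrap from the best estimate at the next hour, if any.
            if hour + 1 < HOURS:
                future = min(q[(hour + 1, next_storage, a)] for a in ACTIONS)
            else:
                future = 0.0
            key = (hour, storage, action)
            q[key] += alpha * (cost + future - q[key])
            storage = next_storage
    return q


def daily_cost(policy):
    """Total cost of following policy(hour, storage) for one day."""
    storage, total = 0, 0.0
    for hour in range(HOURS):
        storage, cost = step(hour, storage, policy(hour, storage))
        total += cost
    return total


q_table = train()
baseline_cost = daily_cost(lambda hour, storage: 1)  # never use the mass
learned_cost = daily_cost(lambda hour, storage: greedy(q_table, hour, storage))
print("baseline:", baseline_cost, "learned:", learned_cost)
```

In this toy setting the table has only 24 x 4 x 3 entries, so learning is fast. Enlarging the state or action space, or shrinking the on-peak/off-peak price difference, would be expected to slow learning, in line with the sensitivities noted in the abstract.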