Advances in multi-objective reinforcement learning with applications to energy management
Lu, Junlin
Files
2025Junlin LuPhD.pdf
Adobe PDF, 13.02 MB
Publication Date
2025-09-24
Type
doctoral thesis
Abstract
Multi-Objective Reinforcement Learning (MORL) has become increasingly important for real-world applications such as energy management, where multiple, often conflicting objectives must be optimized simultaneously. However, the deployment of MORL in practical systems faces three key challenges: (1) user preferences over objectives are often unknown or difficult to elicit explicitly; (2) environments are frequently non-stationary, requiring agents to adapt rapidly to changing conditions; and (3) exploration in sparse or high-dimensional objective spaces remains inefficient.
This thesis addresses these challenges through three major contributions. First, a Dynamic-Weight Preference Inference (DWPI) algorithm is proposed, which infers user preferences from expert demonstrations without requiring interactive feedback. By framing preference inference as a supervised learning problem, DWPI enables efficient and robust preference estimation, even in the presence of suboptimal demonstrations. Second, to handle environmental non-stationarity, a meta-learning extension to the Generalized Policy Improvement framework (R-GPI-LS/PD) is introduced. Leveraging Reptile meta-learning, this method allows MORL agents to adapt to new environmental dynamics with minimal retraining, significantly improving sample efficiency. Third, a Demonstration-Guided MORL (DG-MORL) framework is developed to enhance exploration efficiency. By incorporating and progressively refining demonstrations, DG-MORL facilitates effective exploration in sparse-reward and non-ergodic environments.
The proposed methods are evaluated on standard MORL benchmarks and realistic energy management environments, demonstrating substantial improvements in preference inference accuracy, adaptation speed, and exploration efficiency compared to existing baselines. This thesis thus contributes to making MORL more practical, adaptable, and user-aligned for complex real-world applications, particularly in energy management systems.
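The abstract names Reptile meta-learning as the mechanism for adapting to new environmental dynamics. As a rough illustration of the general Reptile update only, and not of the thesis's R-GPI-LS/PD method, the sketch below applies it to a toy one-parameter regression family. The function names, the toy tasks, and all hyperparameters are assumptions made for this example.

```python
import numpy as np


def inner_sgd(theta, task_slope, rng, steps=5, lr=0.1):
    """Adapt a copy of theta to one toy task, y = task_slope * x, with a few SGD steps."""
    phi = theta.copy()
    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0, size=16)
        y = task_slope * x
        # Gradient of the mean squared error of the model phi[0] * x.
        grad = np.mean(2.0 * (phi[0] * x - y) * x)
        phi[0] -= lr * grad
    return phi


def reptile(task_slopes, meta_steps=200, meta_lr=0.1, seed=0):
    """Reptile outer loop: nudge the meta-parameters toward each task-adapted solution."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(1)  # meta-parameters (hypothetical single-weight model)
    for _ in range(meta_steps):
        slope = rng.choice(task_slopes)          # sample one task
        phi = inner_sgd(theta, slope, rng)       # task-specific adaptation
        theta = theta + meta_lr * (phi - theta)  # Reptile interpolation step
    return theta


if __name__ == "__main__":
    theta = reptile([1.0, 3.0])
    print("meta-initialisation:", theta)
```

In this toy setting the meta-parameter settles between the two task slopes, so a handful of inner steps suffices to reach either task. That is the property that makes such an initialisation attractive when the environment changes.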
Publisher
University of Galway
Rights
CC BY-NC-ND