Adaptive reinforcement learning: From time series anomaly detection to continual learning
Yang, Xue
Files
2025xueyangphd.pdf
Adobe PDF, 52.9 MB
Publication Date
2025-12-12
Type
doctoral thesis
Abstract
The complexity and scale of IT systems are increasing dramatically, posing many challenges to real-world time series anomaly detection. Many existing methods emphasize feature learning and anomaly scoring but give limited attention to thresholding. Static or manually defined thresholds are commonly used, but they lack adaptability to complex data. Reinforcement learning (RL) is well known for providing optimal control in complex, dynamic systems. This thesis investigates its application to dynamic threshold control in anomaly detection and validates the proposed methods in real-world cyber-physical systems (CPS).
First, an agent-based dynamic thresholding (ADT) method is proposed, which models thresholding in time series anomaly detection as a Markov decision process (MDP). ADT employs a deep Q-network and integrates it with an autoencoder-based anomaly scorer to set detection thresholds adaptively. ADT consistently outperforms traditional static and statistical thresholding methods, achieving significantly improved detection performance and robustness even when trained with limited data.
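The abstract does not give implementation details, but the core idea of casting thresholding as an MDP can be illustrated with a toy sketch. Everything below is assumed for illustration (it is not the thesis's ADT implementation): thresholds are discretized into a few levels, anomaly scores and labels are synthetic, and tabular Q-learning stands in for the deep Q-network. The state is the current threshold level, the actions lower, keep, or raise it, and the reward counts correct detections in the next window of scores.

```python
import random

# Toy sketch: thresholding as an MDP (illustrative, not the thesis's ADT).
# State: current threshold index; actions: lower / keep / raise the threshold.
# Reward: +1 per correctly classified point in the next window, -1 otherwise.

random.seed(0)

THRESHOLDS = [0.2, 0.4, 0.6, 0.8]   # discretized threshold levels
ACTIONS = [-1, 0, +1]               # move threshold down / keep / up

def make_window(n=20, anomaly_rate=0.1):
    """Synthetic anomaly scores with ground-truth labels (0 normal, 1 anomaly)."""
    window = []
    for _ in range(n):
        if random.random() < anomaly_rate:
            window.append((random.uniform(0.7, 1.0), 1))  # anomalous: high score
        else:
            window.append((random.uniform(0.0, 0.5), 0))  # normal: low score
    return window

def reward(threshold, window):
    """Net count of correct detection decisions at this threshold."""
    return sum(1 if (score > threshold) == bool(label) else -1
               for score, label in window)

# Tabular Q-learning over (threshold index, action); a DQN would replace Q here.
Q = [[0.0] * len(ACTIONS) for _ in THRESHOLDS]
alpha, gamma, eps = 0.5, 0.9, 0.2
state = 0
for _ in range(2000):
    a = (random.randrange(len(ACTIONS)) if random.random() < eps
         else max(range(len(ACTIONS)), key=lambda i: Q[state][i]))
    nxt = min(max(state + ACTIONS[a], 0), len(THRESHOLDS) - 1)
    r = reward(THRESHOLDS[nxt], make_window())
    Q[state][a] += alpha * (r + gamma * max(Q[nxt]) - Q[state][a])
    state = nxt
```

With the synthetic scores fully separable around 0.6, the agent's value estimates favour staying near that level; in the thesis's setting, the agent instead learns from real anomaly scores and environmental feedback.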
Next, ADT is extended and evaluated in real-world CPS under additional challenges. Experimental results show that ADT significantly outperforms baseline anomaly detection methods while retaining its dynamic thresholding capability, data efficiency, and fast training. It remains robust even when the environmental feedback is noisy, partial, or delayed. Moreover, when integrated as a dynamic thresholding controller into existing anomaly detection methods, ADT significantly enhances their detection performance.
Traditional RL methods typically assume a static MDP, an assumption that rarely holds in practice. This thesis therefore investigates the continual reinforcement learning (CRL) problem, in which RL agents continually learn and adapt to evolving tasks without catastrophic forgetting. Existing methods apply prior knowledge mainly through optimization, offering limited direct guidance on the agent's behavior. In contrast, this thesis introduces a novel demonstration-guided continual reinforcement learning (DGCRL) framework, which stores prior knowledge in an external, self-evolving demonstration repository and, following a dynamic curriculum-based strategy, uses selected demonstrations to directly guide the agent's exploration and adaptation across tasks. DGCRL demonstrates superior average performance, knowledge transfer, mitigation of forgetting, and learning efficiency.
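The notion of an external, self-evolving demonstration repository can be sketched minimally. All names and design choices below (`DemoRepository`, per-task capacity, retaining the highest-return trajectories, highest-return-first selection) are invented for illustration and are not the thesis's DGCRL implementation.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of an external demonstration store; names are invented.

@dataclass
class Demonstration:
    task_id: str
    trajectory: list   # e.g. a list of (state, action) pairs
    ret: float         # return achieved by this trajectory

@dataclass
class DemoRepository:
    """Self-evolving store: keeps only the best demonstrations per task."""
    capacity_per_task: int = 3
    store: dict = field(default_factory=dict)

    def add(self, demo: Demonstration) -> None:
        demos = self.store.setdefault(demo.task_id, [])
        demos.append(demo)
        # "Self-evolving" here means: retain the highest-return demonstrations
        # and evict the rest as better ones arrive.
        demos.sort(key=lambda d: d.ret, reverse=True)
        del demos[self.capacity_per_task:]

    def select(self, task_id: str, k: int = 1) -> list:
        """Curriculum-style selection: highest-return demonstrations first."""
        return self.store.get(task_id, [])[:k]

# Usage: the repository accumulates trajectories and keeps the strongest ones,
# which a learner could then use to guide exploration on a related task.
repo = DemoRepository(capacity_per_task=2)
repo.add(Demonstration("task-A", [(0, 1)], ret=5.0))
repo.add(Demonstration("task-A", [(0, 0)], ret=8.0))
repo.add(Demonstration("task-A", [(1, 1)], ret=2.0))  # evicted: lowest return
best = repo.select("task-A", k=1)[0]
```

How selected demonstrations then shape exploration (e.g. via behavior cloning losses or reward shaping) is a separate design question the sketch leaves open.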
Publisher
University of Galway
Rights
CC BY-NC-ND