Contact: Sadegh Khorasani
In this project, we implement and study Strict Subgoal Execution (SSE), the hierarchical RL framework introduced in Strict Subgoal Execution: Reliable Long-Horizon Planning in Hierarchical Reinforcement Learning. SSE enforces single-step subgoal reachability by structurally constraining the high-level policy, augments exploration with a decoupled explorer that targets under-visited regions of the goal space, and refines planning with failure-aware edge costs. The goal is to reproduce the core method, extend it with targeted causal interventions, and build a clean, modular codebase suitable for future course projects and research extensions. A small sketch of the failure-aware planning component appears below.
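To make the failure-aware planning component concrete, the following is a minimal sketch of how edge costs on a subgoal graph could be penalized by observed low-level failure rates and then used in shortest-path planning. The class name, cost formula, and penalty weight are our own illustrative assumptions, not the SSE authors' implementation.

```python
import heapq
from collections import defaultdict

class FailureAwareGraph:
    """Sketch of a subgoal graph whose edge costs grow with observed
    low-level failure rates. Names and the cost formula are assumptions,
    not the SSE authors' implementation."""

    def __init__(self, penalty: float = 10.0):
        self.penalty = penalty                 # assumed hyperparameter weighting failures
        self.neighbors = defaultdict(dict)     # u -> {v: base traversal cost}
        self.attempts = defaultdict(int)       # (u, v) -> subgoal attempts on this edge
        self.failures = defaultdict(int)       # (u, v) -> attempts the low-level policy failed

    def add_edge(self, u, v, cost: float) -> None:
        self.neighbors[u][v] = cost

    def record(self, u, v, success: bool) -> None:
        self.attempts[(u, v)] += 1
        if not success:
            self.failures[(u, v)] += 1

    def edge_cost(self, u, v) -> float:
        # Penalize edges the low-level policy often fails to traverse.
        fail_rate = self.failures[(u, v)] / max(self.attempts[(u, v)], 1)
        return self.neighbors[u][v] + self.penalty * fail_rate

    def shortest_path(self, start, goal):
        # Plain Dijkstra over the failure-adjusted edge costs.
        dist, prev = {start: 0.0}, {}
        frontier = [(0.0, start)]
        while frontier:
            d, u = heapq.heappop(frontier)
            if u == goal:
                break
            if d > dist.get(u, float("inf")):
                continue
            for v in self.neighbors[u]:
                nd = d + self.edge_cost(u, v)
                if nd < dist.get(v, float("inf")):
                    dist[v], prev[v] = nd, u
                    heapq.heappush(frontier, (nd, v))
        if goal not in prev and goal != start:
            return None                        # goal unreachable from start
        path = [goal]
        while path[-1] != start:
            path.append(prev[path[-1]])
        return path[::-1]
```

In this sketch, the high-level planner would call record() after each subgoal attempt and re-plan with shortest_path(), so edges the low-level policy cannot reliably traverse are gradually avoided.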
For evaluation, we will benchmark the reproduced SSE agent on the latent-variable benchmark described in the paper, stress-testing subgoal reliability and long-horizon planning under partially hidden structure. We will compare against the baselines referenced by the authors (standard GCRL/HRL/HAC methods) and report success rate, sample efficiency (area under the learning curve and episodes-to-threshold), and path-reliability metrics; we will also run ablations removing (a) the decoupled explorer and (b) failure-aware path refinement. Deliverables include a reproducible implementation, experiment scripts and configs for the latent-variable benchmark, and a short report analyzing results and limitations. A sketch of how we plan to compute the reported metrics follows.
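The sketch below shows how we intend to derive the reported metrics from per-episode success indicators. The smoothing window, threshold, and function name are illustrative choices on our part, not values fixed by the paper.

```python
import numpy as np

def summarize_learning_curve(successes, threshold=0.9, window=20):
    """Sketch of the planned metrics. `successes` is a per-episode array of
    0/1 outcomes; window and threshold are assumed hyperparameters."""
    successes = np.asarray(successes, dtype=float)
    assert successes.size >= window, "need at least one full smoothing window"
    # Smoothed success rate over a sliding window of episodes.
    kernel = np.ones(window) / window
    smoothed = np.convolve(successes, kernel, mode="valid")
    # Sample efficiency: normalized area under the smoothed learning curve
    # (a policy that succeeds from episode 0 scores 1.0).
    auc = float(smoothed.mean())
    # Episodes-to-threshold: first episode at which the smoothed rate crosses it.
    above = np.nonzero(smoothed >= threshold)[0]
    episodes_to_threshold = int(above[0] + window) if above.size else None
    return {
        "final_success_rate": float(smoothed[-1]),
        "auc": auc,
        "episodes_to_threshold": episodes_to_threshold,
    }
```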
- Khorasani S, Salehkaleybar S, Kiyavash N, Grossglauser M. Hierarchical Reinforcement Learning with Targeted Causal Interventions.
- Hwang J, Lee S, Kim J, Han S. Strict Subgoal Execution: Reliable Long-Horizon Planning in Hierarchical Reinforcement Learning. arXiv preprint arXiv:2506.21039, 2025.