Causal Subgoal Discovery for Hierarchical Reinforcement Learning

Contact: Sadegh Khorasani

In this project, we implement and study Strict Subgoal Execution (SSE), the hierarchical RL framework introduced in "Strict Subgoal Execution: Reliable Long-Horizon Planning in Hierarchical Reinforcement Learning" (Hwang et al., 2025). SSE enforces single-step subgoal reachability by structurally constraining the high-level policy, augments exploration with a decoupled explorer that targets under-visited regions of the goal space, and refines planning with failure-aware edge costs. The goal is to reproduce the core method, add targeted causal intervention techniques to improve it, and build a clean, modular codebase suitable for future course projects and research extensions.
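To make the idea of failure-aware edge costs concrete, here is a minimal sketch of planning over a subgoal graph in which edges that have often failed become more expensive. The cost formula, the `penalty` parameter, and the names `failure_aware_cost` and `plan_path` are assumptions made for illustration; they are not taken from the SSE paper, whose exact cost definition may differ.

```python
import heapq
from typing import Dict, Hashable, List, Optional, Tuple

Graph = Dict[Hashable, Dict[Hashable, float]]            # node -> {neighbor: base cost}
EdgeStats = Dict[Tuple[Hashable, Hashable], Tuple[int, int]]  # edge -> (failures, attempts)


def failure_aware_cost(base_cost: float, failures: int, attempts: int,
                       penalty: float = 10.0) -> float:
    """Inflate the base cost by the empirical failure rate (assumed formula)."""
    failure_rate = failures / attempts if attempts > 0 else 0.0
    return base_cost * (1.0 + penalty * failure_rate)


def plan_path(graph: Graph, stats: EdgeStats, start: Hashable, goal: Hashable,
              penalty: float = 10.0) -> Optional[List[Hashable]]:
    """Dijkstra search over failure-aware costs; returns None if goal is unreachable."""
    dist = {start: 0.0}
    prev: Dict[Hashable, Hashable] = {}
    heap = [(0.0, 0, start)]  # (distance, tie-breaker, node)
    tie = 0
    visited = set()
    while heap:
        d, _, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == goal:
            break
        for nxt, base in graph.get(node, {}).items():
            failures, attempts = stats.get((node, nxt), (0, 0))
            nd = d + failure_aware_cost(base, failures, attempts, penalty)
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                prev[nxt] = node
                tie += 1
                heapq.heappush(heap, (nd, tie, nxt))
    if goal not in dist:
        return None
    # Walk predecessors back from the goal to recover the path.
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    path.reverse()
    return path
```

With no recorded failures the planner returns the shortest path by base cost. Once an edge accumulates failures, a longer but more reliable route can win, which is the behaviour the path-refinement ablation is meant to isolate.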

For evaluation, we will benchmark the reproduced SSE agent on the latent-variable benchmark described in the paper (the benchmark with latent/hidden factors) to stress-test subgoal reliability and long-horizon planning under partially hidden structure. We will compare against the baselines referenced by the authors (standard goal-conditioned RL (GCRL) and HRL methods, including HAC) and report success rate, sample efficiency (area under the learning curve, episodes-to-threshold), and path reliability metrics. We will also run ablations that remove (a) decoupled exploration and (b) failure-aware path refinement. Deliverables include a reproducible implementation, experiment scripts/configs for the latent-variable benchmark, and a short report analyzing results and limitations.
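The two sample-efficiency metrics named above can be computed from a logged learning curve roughly as follows. This is a sketch under assumptions: the curve is given as parallel lists of episode counts and success rates, the area is normalised by the episode span, and the function names are hypothetical.

```python
from typing import Optional, Sequence


def area_under_learning_curve(episodes: Sequence[float],
                              success_rates: Sequence[float]) -> float:
    """Trapezoidal area under the curve, divided by the episode span.

    With success rates in [0, 1], the result is also in [0, 1].
    """
    if len(episodes) != len(success_rates) or len(episodes) < 2:
        raise ValueError("need two or more matching (episode, rate) points")
    area = 0.0
    for i in range(1, len(episodes)):
        width = episodes[i] - episodes[i - 1]
        area += 0.5 * width * (success_rates[i] + success_rates[i - 1])
    return area / (episodes[-1] - episodes[0])


def episodes_to_threshold(episodes: Sequence[float],
                          success_rates: Sequence[float],
                          threshold: float) -> Optional[float]:
    """First logged episode count at which the success rate reaches the threshold."""
    for ep, rate in zip(episodes, success_rates):
        if rate >= threshold:
            return ep
    return None  # threshold never reached within the logged budget
```

Normalising the area makes curves logged over different training budgets easier to compare, and returning `None` rather than a sentinel number keeps runs that never reach the threshold from being averaged in silently.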

  • Learn a structural causal model (SCM) over key state factors from exploration data.
  • Automatically propose subgoals by detecting causal bottlenecks and controllable variables.
  • Train a hierarchical policy (manager/worker) that selects and executes these causal subgoals.
  • Compare against HRL baselines (e.g., HIRO, HAC, Option-Critic) that do not use causal structure. We also want to explore HAC transitions.
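As one possible starting point for the "controllable variables" step in the list above, the sketch below scores each discrete state factor by how strongly the chosen action influences whether that factor changes. It is a simple heuristic written for illustration, not the method of either referenced paper. It assumes transitions are stored as (factors before, action, factors after) and that actions were chosen at random during exploration, so that they approximate interventions. The name `controllability_scores` is hypothetical.

```python
from collections import defaultdict
from typing import Hashable, List, Sequence, Tuple

Transition = Tuple[Sequence[int], Hashable, Sequence[int]]


def controllability_scores(transitions: Sequence[Transition],
                           num_factors: int) -> List[float]:
    """Score each factor by max minus min change rate across actions.

    A score near 1 means some action reliably changes the factor while
    another leaves it alone; a score near 0 means the action choice has
    little visible effect on that factor.
    """
    changes = defaultdict(lambda: [0] * num_factors)  # action -> change counts
    totals = defaultdict(int)                         # action -> times taken
    for before, action, after in transitions:
        totals[action] += 1
        for i in range(num_factors):
            if before[i] != after[i]:
                changes[action][i] += 1
    scores = []
    for i in range(num_factors):
        rates = [changes[a][i] / totals[a] for a in totals]
        scores.append(max(rates) - min(rates) if rates else 0.0)
    return scores
```

Factors with high scores are candidates for subgoal variables. A full SCM would additionally need the dependencies between factors, which this heuristic does not estimate.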

- Khorasani S, Salehkaleybar S, Kiyavash N, Grossglauser M. Hierarchical Reinforcement Learning with Targeted Causal Interventions.
- Hwang J, Lee S, Kim J, Han S. Strict Subgoal Execution: Reliable Long-Horizon Planning in Hierarchical Reinforcement Learning. arXiv preprint arXiv:2506.21039. 2025 Jun 26.