XAI

Mental Modeling of Reinforcement Learning Agents by Language Models

This study explores whether LLMs can mentally model decision-making agents by reasoning over their behavior and state transitions from interaction histories. Evaluated on reinforcement learning tasks, results show that while LLMs offer some insight, they fall short of fully modeling agents without further innovation, highlighting both their potential and current limitations for explainable RL.

Causal State Distillation for Explainable Reinforcement Learning

We propose reward decomposition methods to improve the explainability of an agent's decision-making.
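The core idea behind reward decomposition can be illustrated with a small sketch: the reward is split into named components, one Q-table is learned per component, and the total Q-value is their sum, so an action's value can be attributed back to individual components. The environment, component names, and hyperparameters below are illustrative toys, not the paper's actual setup.

```python
import numpy as np

# Toy sketch of reward decomposition (illustrative, not the paper's method).
# The reward is a sum of named components; we learn one Q-table per component
# so the total Q-value can be attributed back to each component.

N_STATES, N_ACTIONS = 4, 2
COMPONENTS = ["progress", "energy"]  # hypothetical component names

rng = np.random.default_rng(0)
# One Q-table per component. Each is updated with its own TD error:
#   Q_c(s,a) += alpha * (r_c + gamma * Q_c(s', a*) - Q_c(s,a)),
# where a* is greedy w.r.t. the *summed* Q, keeping components aligned.
q = {c: np.zeros((N_STATES, N_ACTIONS)) for c in COMPONENTS}

def total_q(s):
    return sum(q[c][s] for c in COMPONENTS)

def step(s, a):
    # Toy dynamics: action 1 makes progress but costs energy.
    s_next = (s + a) % N_STATES
    rewards = {"progress": float(a), "energy": -0.5 * a}
    return s_next, rewards

alpha, gamma = 0.1, 0.9
s = 0
for _ in range(2000):
    # Epsilon-greedy over the summed Q-value.
    if rng.random() > 0.1:
        a = int(np.argmax(total_q(s)))
    else:
        a = int(rng.integers(N_ACTIONS))
    s_next, rewards = step(s, a)
    a_star = int(np.argmax(total_q(s_next)))  # greedy w.r.t. summed Q
    for c in COMPONENTS:
        td = rewards[c] + gamma * q[c][s_next, a_star] - q[c][s, a]
        q[c][s, a] += alpha * td
    s = s_next

# Explanation: per-component contribution to the chosen action's value,
# e.g. "progress argues for this action, energy argues against it".
a = int(np.argmax(total_q(0)))
contributions = {c: q[c][0, a] for c in COMPONENTS}
```

After training, the greedy action's total value splits exactly into a positive "progress" term and a negative "energy" term, which is the raw material for a decision-level explanation.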

A Closer Look at Reward Decomposition for High-Level Robotic Explanations

Explainable Q-Map improves the transparency of RL agents by combining reward decomposition with abstract action spaces, enabling clear, high-level explanations based on task-relevant object properties. We demonstrate visual and textual explanations in robotic scenarios and show how they can be used with LLMs for reasoning and interactive querying.
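A minimal sketch of how per-component value contributions could be rendered as a textual explanation suitable for LLM-based querying; the action name, component names, and numbers below are hypothetical placeholders, not output from the actual system.

```python
# Hypothetical sketch: turning per-component Q-value contributions into a
# short textual explanation that could be handed to an LLM for reasoning
# or interactive querying.

def explain_action(action_name, contributions):
    """Rank reward components by their contribution to the chosen action
    and render them as a single explanatory sentence."""
    ranked = sorted(contributions.items(), key=lambda kv: kv[1], reverse=True)
    parts = [f"{name} ({value:+.2f})" for name, value in ranked]
    return (f"Chose '{action_name}' because the expected return decomposes "
            "into: " + ", ".join(parts))

# Illustrative call with made-up component values.
msg = explain_action(
    "pick_up_red_block",
    {"task_progress": 1.8, "collision_risk": -0.3, "energy": -0.2},
)
```

The ranked, signed contributions make it explicit which task-relevant factors argued for and against the chosen high-level action.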