We present the Matcha agent, an interactive perception framework that uses LLMs to guide robots in gathering multimodal sensory data (vision, sound, haptics, and proprioception) before executing tasks. Matcha enables high-level reasoning and planning in partially observable environments, showing that LLMs can effectively control robot behavior when grounded with multimodal context.
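The control loop can be pictured as an LLM repeatedly choosing epistemic actions (look, knock, touch, weigh) and receiving verbalized sensor feedback until it has enough evidence to act. The sketch below illustrates that pattern under simplifying assumptions: `SimulatedScene`, `query_llm`, and the action names are placeholders invented for illustration, not the Matcha implementation or its prompts.

```python
# Minimal sketch of an interactive-perception loop in the spirit of Matcha:
# an LLM-style policy picks sensory actions to gather multimodal evidence
# before committing to the task action. Everything here is a stand-in.

from dataclasses import dataclass, field


@dataclass
class SimulatedScene:
    """Toy environment that answers sensory queries with text descriptions."""
    properties: dict = field(default_factory=lambda: {
        "look": "a red cube on the table",
        "knock": "a dull, muffled sound (suggests soft material)",
        "touch": "low stiffness, slightly deformable surface",
        "weigh": "approximately 0.2 kg",
    })

    def perceive(self, action: str) -> str:
        return self.properties.get(action, "no reading available")


def query_llm(context: list[str]) -> str:
    """Placeholder for an LLM call: given the verbalized feedback so far,
    return the next action. A hand-written heuristic stands in for the model."""
    gathered = {line.split(":")[0] for line in context}
    for action in ("look", "knock", "touch", "weigh"):
        if action not in gathered:
            return action
    return "pick_up"  # enough evidence gathered; act on the task


def interactive_perception(task: str, max_steps: int = 6) -> list[str]:
    scene = SimulatedScene()
    context: list[str] = []  # multimodal feedback, verbalized for the LLM
    for _ in range(max_steps):
        action = query_llm(context)
        if action == "pick_up":
            context.append(f"pick_up: executing '{task}' with gathered context")
            break
        context.append(f"{action}: {scene.perceive(action)}")
    return context


if __name__ == "__main__":
    for line in interactive_perception("put the soft red cube in the bin"):
        print(line)
```

Replacing `query_llm` with a real model call (and the scene with a robot's sensor pipeline) recovers the general structure of grounding an LLM in multimodal context.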
Lafite-RL is a framework that leverages Large Language Models to provide natural language feedback for guiding reinforcement learning in robotic tasks. Tested on RLBench, it improves learning efficiency and success rates without requiring costly human supervision.
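The core idea is to turn the LLM's judgment of each transition into a shaping bonus added to the sparse task reward. The following sketch shows that pattern on a toy 1-D task; `ToyReachEnv` and the heuristic `rate_transition` stand in for RLBench and a real LLM call, and are assumptions made for illustration rather than the paper's setup.

```python
# Minimal sketch of LLM-feedback reward shaping in the spirit of Lafite-RL:
# each transition is described in text, a (mocked) LLM rates it, and the
# rating is added to the environment reward as a shaping bonus.

import random


class ToyReachEnv:
    """1-D 'reach the goal' task standing in for a manipulation task."""
    def __init__(self, goal: float = 1.0):
        self.goal, self.pos = goal, 0.0

    def reset(self) -> float:
        self.pos = 0.0
        return self.pos

    def step(self, action: float):
        self.pos += action
        done = abs(self.pos - self.goal) < 0.05
        reward = 1.0 if done else 0.0  # sparse task reward
        return self.pos, reward, done


def rate_transition(description: str) -> int:
    """Placeholder for an LLM judging a verbalized transition.
    Returns +1 (helpful), 0 (neutral), or -1 (counterproductive)."""
    if "closer to the goal" in description:
        return 1
    if "farther from the goal" in description:
        return -1
    return 0


def shaped_episode(env: ToyReachEnv, bonus_scale: float = 0.1, steps: int = 50):
    pos, total = env.reset(), 0.0
    for _ in range(steps):
        action = random.uniform(-0.2, 0.2)  # stand-in for the RL policy
        new_pos, reward, done = env.step(action)
        closer = abs(new_pos - env.goal) < abs(pos - env.goal)
        description = ("The gripper moved closer to the goal." if closer
                       else "The gripper moved farther from the goal.")
        reward += bonus_scale * rate_transition(description)  # shaping bonus
        total, pos = total + reward, new_pos
        if done:
            break
    return total


if __name__ == "__main__":
    print("shaped return:", round(shaped_episode(ToyReachEnv()), 3))
```

Because the feedback comes from a model rather than a human observer, the dense shaping signal is available at every step without additional supervision cost.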
Explainable Q-Map improves the transparency of RL agents by combining reward decomposition with abstract action spaces, enabling clear, high-level explanations based on task-relevant object properties. We demonstrate visual and textual explanations in robotic scenarios and show how they can be used with LLMs for reasoning and interactive querying.
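The mechanism can be sketched as keeping a separate Q-value per reward component for each abstract action, choosing greedily on their sum, and turning the component contributions into a textual explanation. The example below is a hypothetical, hand-filled instance of that idea; the action names, reward components, and values are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch of explanation via reward decomposition over abstract
# actions: per-component Q-values stay separate, the greedy action is chosen
# on their sum, and the components are reported as a high-level explanation.

# Decomposed Q-values: abstract action -> reward component -> value.
Q_DECOMPOSED = {
    "pick(red_cube)":  {"reach_goal": 0.8, "avoid_collision": -0.1, "energy": -0.05},
    "pick(blue_ball)": {"reach_goal": 0.2, "avoid_collision": -0.3, "energy": -0.05},
    "push(red_cube)":  {"reach_goal": 0.4, "avoid_collision":  0.0, "energy": -0.15},
}


def greedy_action(q_decomposed: dict) -> str:
    """Select the abstract action whose summed component Q-values are highest."""
    return max(q_decomposed, key=lambda a: sum(q_decomposed[a].values()))


def explain(q_decomposed: dict, action: str) -> str:
    """Build a textual explanation from the per-component contributions."""
    components = q_decomposed[action]
    ranked = sorted(components.items(), key=lambda kv: kv[1], reverse=True)
    parts = ", ".join(f"{name} ({value:+.2f})" for name, value in ranked)
    return (f"Chose '{action}' because its expected return "
            f"{sum(components.values()):+.2f} is highest; "
            f"component contributions: {parts}.")


if __name__ == "__main__":
    best = greedy_action(Q_DECOMPOSED)
    print(explain(Q_DECOMPOSED, best))
```

An explanation string like this, phrased over task-relevant object properties and abstract actions, is also the kind of summary that can be handed to an LLM for further reasoning or interactive querying.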