Publications

We propose an LLM-driven framework that enables robots to autonomously discover useful skills from scratch. By generating tasks, …

PepBridge jointly designs receptor-complementary protein surfaces and full 3D structures from a receptor’s point-cloud surface. It uses …

We propose PersRM-R1, a reasoning-based reward model that learns personal preferences from just a few examples. Using synthetic data …

We propose REAL (Response Embedding-based Alignment for LLMs), a method to improve alignment efficiency by selecting less ambiguous, …

This study explores whether LLMs can mentally model decision-making agents by reasoning over their behavior and state transitions from …

We propose Curriculum-RLAIF, a data-centric framework that improves reward model generalizability by training on preference pairs of …

LLM+MAP is a bimanual planning framework that combines GPT-4o with multi-agent task planning to enable efficient and logically …

We propose LoT (Logical Thoughts), a framework that improves large language models’ reasoning at inference time by applying symbolic …

LABOR uses LLMs to orchestrate control policies for long-horizon bimanual manipulation tasks. By leveraging task reasoning and …

We introduce OSSA (Object State-Sensitive Agent), a task-planning agent using pre-trained LLMs and VLMs to generate plans sensitive to …

We propose reward decomposition methods for better decision-making explainability.

We present the Matcha agent, an interactive perception framework that uses LLMs to guide robots in gathering multimodal sensory data …

Lafite-RL is a framework that leverages large language models to provide natural language feedback for guiding reinforcement learning …

We introduce Internally Rewarded Reinforcement Learning (IRRL), where rewards are generated by a jointly learned internal model rather …

Explainable Q-Map improves the transparency of RL agents by combining reward decomposition with abstract action spaces, enabling clear, …

We propose the Intrinsic Sound Curiosity Module (ISCM) to use sound as an informative modality for unsupervised reinforcement learning. …

DWDS is a density-weighted diversity strategy for active learning in deep learning. It selects informative and representative samples …