Publications

Agentic Skill Discovery

We propose an LLM-driven framework that enables robots to autonomously discover useful skills from scratch. By generating tasks, …

Xufeng Zhao, Cornelius Weber, Stefan Wermter

Joint Design of Protein Surface and Backbone Using a Diffusion Bridge Model

PepBridge jointly designs receptor-complementary protein surfaces and full 3D structures from a receptor’s point-cloud surface. It uses …

Guanlve Li, Xufeng Zhao, Fang Wu, Sören Laue

PersRM-R1: Enhance Personalized Reward Modeling with Reinforcement Learning

We propose PersRM-R1, a reasoning-based reward model that learns personal preferences from just a few examples. Using synthetic data …

Mengdi Li*, Guanqiao Chen*, Xufeng Zhao, Haochen Wen, Shu Yang, Di Wang

REAL: Response Embedding-Based Alignment for LLMs

We propose REAL (Response Embedding-based Alignment for LLMs), a method to improve alignment efficiency by selecting less ambiguous, …

Honggen Zhang, Xufeng Zhao, Igor Molybog, June Zhang

Mental Modeling of Reinforcement Learning Agents by Language Models

This study explores whether LLMs can mentally model decision-making agents by reasoning over their behavior and state transitions from …

Wenhao Lu, Xufeng Zhao, Josua Spisak, Jae Hee Lee, Stefan Wermter

Curriculum-RLAIF: Curriculum Alignment with Reinforcement Learning from AI Feedback

We propose Curriculum-RLAIF, a data-centric framework that improves reward model generalizability by training on preference pairs of …

Mengdi Li*, Jiaye Lin*, Xufeng Zhao, Wenhao Lu, Peilin Zhao, Stefan Wermter, Di Wang

LLM+MAP: Bimanual Robot Task Planning Using Large Language Models and Planning Domain Definition Language

LLM+MAP is a bimanual planning framework that combines GPT-4o with multi-agent task planning to enable efficient and logically …

Kun Chu, Xufeng Zhao, Cornelius Weber, Stefan Wermter

Enhancing Zero-Shot Chain-of-Thought Reasoning in Large Language Models through Logic

We propose LoT (Logical Thoughts), a framework that improves large language models’ reasoning at inference time by applying symbolic …

Xufeng Zhao, Mengdi Li, Wenhao Lu, Cornelius Weber, Jae Hee Lee, Kun Chu, Stefan Wermter

Large Language Models for Orchestrating Bimanual Robots

LABOR uses LLMs to orchestrate control policies for long-horizon bimanual manipulation tasks. By leveraging task reasoning and …

Kun Chu, Xufeng Zhao, Cornelius Weber, Mengdi Li, Wenhao Lu, Stefan Wermter

Details Make a Difference: Object State-Sensitive Neurorobotic Task Planning

We introduce OSSA (Object State-Sensitive Agent), a task-planning agent using pre-trained LLMs and VLMs to generate plans sensitive to …

Xiaowen Sun, Xufeng Zhao, Jae Hee Lee, Wenhao Lu, Matthias Kerzel, Stefan Wermter

Causal State Distillation for Explainable Reinforcement Learning

We propose reward decomposition methods to improve the explainability of decision-making.

Wenhao Lu, Xufeng Zhao, Thilo Fryen, Jae Hee Lee, Mengdi Li, Sven Magg, Stefan Wermter

Chat with the Environment: Interactive Multimodal Perception Using Large Language Models

We present the Matcha agent, an interactive perception framework that uses LLMs to guide robots in gathering multimodal sensory data …

Xufeng Zhao, Mengdi Li, Cornelius Weber, Muhammad Burhan Hafez, Stefan Wermter

Accelerating Reinforcement Learning of Robotic Manipulations via Feedback from Large Language Models

Lafite-RL is a framework that leverages Large Language Models to provide natural language feedback for guiding reinforcement learning …

Kun Chu, Xufeng Zhao, Cornelius Weber, Mengdi Li, Stefan Wermter

Internally Rewarded Reinforcement Learning

We introduce Internally Rewarded Reinforcement Learning (IRRL), where rewards are generated by a jointly learned internal model rather …

Xufeng Zhao*, Mengdi Li*, Jae Hee Lee, Cornelius Weber, Stefan Wermter

A Closer Look at Reward Decomposition for High-Level Robotic Explanations

Explainable Q-Map improves the transparency of RL agents by combining reward decomposition with abstract action spaces, enabling clear, …

Wenhao Lu, Xufeng Zhao, Sven Magg, Martin Gromniak, Mengdi Li, Stefan Wermter

Impact Makes a Sound and Sound Makes an Impact: Sound Guides Representations and Explorations

We propose the Intrinsic Sound Curiosity Module (ISCM) to use sound as an informative modality for unsupervised reinforcement learning. …

Xufeng Zhao, Cornelius Weber, Muhammad Burhan Hafez, Stefan Wermter

Density Weighted Diversity based Query Strategy for Active Learning

DWDS is a density-weighted diversity strategy for active learning in deep learning. It selects informative and representative samples …

Tingting Wang, Xufeng Zhao, Qiujian Lv, Bo Hu, Degang Sun