Xufeng Zhao
Publications
Agentic Skill Discovery
We propose an LLM-driven framework that enables robots to autonomously discover useful skills from scratch. By generating tasks, …
Xufeng Zhao, Cornelius Weber, Stefan Wermter
Joint Design of Protein Surface and Backbone Using a Diffusion Bridge Model
PepBridge jointly designs receptor-complementary protein surfaces and full 3D structures from a receptor’s point-cloud surface. It uses …
Guanlve Li, Xufeng Zhao, Fang Wu, Sören Laue
PersRM-R1: Enhance Personalized Reward Modeling with Reinforcement Learning
We propose PersRM-R1, a reasoning-based reward model that learns personal preferences from just a few examples. Using synthetic data …
Mengdi Li*, Guanqiao Chen*, Xufeng Zhao, Haochen Wen, Shu Yang, Di Wang
REAL: Response Embedding-Based Alignment for LLMs
We propose REAL (Response Embedding-based Alignment for LLMs), a method to improve alignment efficiency by selecting less ambiguous, …
Honggen Zhang, Xufeng Zhao, Igor Molybog, June Zhang
Mental Modeling of Reinforcement Learning Agents by Language Models
This study explores whether LLMs can mentally model decision-making agents by reasoning over their behavior and state transitions from …
Wenhao Lu, Xufeng Zhao, Josua Spisak, Jae Hee Lee, Stefan Wermter
Curriculum-RLAIF: Curriculum Alignment with Reinforcement Learning from AI Feedback
We propose Curriculum-RLAIF, a data-centric framework that improves reward model generalizability by training on preference pairs of …
Mengdi Li*, Jiaye Lin*, Xufeng Zhao, Wenhao Lu, Peilin Zhao, Stefan Wermter, Di Wang
LLM+MAP: Bimanual Robot Task Planning Using Large Language Models and Planning Domain Definition Language
LLM+MAP is a bimanual planning framework that combines GPT-4o with multi-agent task planning to enable efficient and logically …
Kun Chu, Xufeng Zhao, Cornelius Weber, Stefan Wermter
Enhancing Zero-Shot Chain-of-Thought Reasoning in Large Language Models through Logic
We propose LoT (Logical Thoughts), a framework that improves large language models’ reasoning at inference time by applying symbolic …
Xufeng Zhao, Mengdi Li, Wenhao Lu, Cornelius Weber, Jae Hee Lee, Kun Chu, Stefan Wermter
Large Language Models for Orchestrating Bimanual Robots
LABOR uses LLMs to orchestrate control policies for long-horizon bimanual manipulation tasks. By leveraging task reasoning and …
Kun Chu, Xufeng Zhao, Cornelius Weber, Mengdi Li, Wenhao Lu, Stefan Wermter
Details Make a Difference: Object State-Sensitive Neurorobotic Task Planning
We introduce OSSA (Object State-Sensitive Agent), a task-planning agent using pre-trained LLMs and VLMs to generate plans sensitive to …
Xiaowen Sun, Xufeng Zhao, Jae Hee Lee, Wenhao Lu, Matthias Kerzel, Stefan Wermter
Causal State Distillation for Explainable Reinforcement Learning
We propose reward decomposition methods for more explainable decision-making.
Wenhao Lu, Xufeng Zhao, Thilo Fryen, Jae Hee Lee, Mengdi Li, Sven Magg, Stefan Wermter
Chat with the Environment: Interactive Multimodal Perception Using Large Language Models
We present the Matcha agent, an interactive perception framework that uses LLMs to guide robots in gathering multimodal sensory data …
Xufeng Zhao, Mengdi Li, Cornelius Weber, Muhammad Burhan Hafez, Stefan Wermter
Accelerating Reinforcement Learning of Robotic Manipulations via Feedback from Large Language Models
Lafite-RL is a framework that leverages Large Language Models to provide natural language feedback for guiding reinforcement learning …
Kun Chu, Xufeng Zhao, Cornelius Weber, Mengdi Li, Stefan Wermter
Internally Rewarded Reinforcement Learning
We introduce Internally Rewarded Reinforcement Learning (IRRL), where rewards are generated by a jointly learned internal model rather …
Xufeng Zhao*, Mengdi Li*, Jae Hee Lee, Cornelius Weber, Stefan Wermter
A Closer Look at Reward Decomposition for High-Level Robotic Explanations
Explainable Q-Map improves the transparency of RL agents by combining reward decomposition with abstract action spaces, enabling clear, …
Wenhao Lu, Xufeng Zhao, Sven Magg, Martin Gromniak, Mengdi Li, Stefan Wermter
Impact Makes a Sound and Sound Makes an Impact: Sound Guides Representations and Explorations
We propose the Intrinsic Sound Curiosity Module (ISCM) to use sound as an informative modality for unsupervised reinforcement learning. …
Xufeng Zhao, Cornelius Weber, Muhammad Burhan Hafez, Stefan Wermter
Density Weighted Diversity based Query Strategy for Active Learning
DWDS is a density-weighted, diversity-based query strategy for active learning in deep learning. It selects informative and representative samples …
Tingting Wang, Xufeng Zhao, Qiujian Lv, Bo Hu, Degang Sun