PersRM-R1: Enhance Personalized Reward Modeling with Reinforcement Learning

RL, LLMs, RM, Personalization