RLAC: Reinforcement Learning with Adversarial Critic for Free-Form Generation Tasks
Mian Wu, Gavin Zhang, Sewon Min, Sergey Levine, Aviral Kumar
arXiv preprint, 2025
We propose a post-training methodology for improving language models on open-ended generation tasks. Rather than relying on a static reward model, RLAC trains a dynamic LLM critic alongside the generator through an adversarial game: the critic identifies the most likely failure modes in each output, which are then checked by an external validator, and the critic adapts as the generator improves. We demonstrate improvements in factual accuracy for text generation and correctness for code generation across multiple benchmarks.
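To make the training dynamic concrete, here is a minimal sketch of one adversarial round as the abstract describes it: the generator produces a response, the critic proposes a likely failure mode, and an external validator adjudicates, with rewards split accordingly. All function names (generate, propose_flaw, verify, rlac_step) and the exact reward scheme are hypothetical placeholders assumed for illustration, not the paper's actual interface.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    prompt: str
    response: str
    flaw: str           # critic's claimed failure mode (e.g., an unsupported fact)
    flaw_is_real: bool  # external validator's verdict on that claim


def generate(prompt: str) -> str:
    """Hypothetical generator call (sampling from the current policy)."""
    return "draft response for: " + prompt


def propose_flaw(prompt: str, response: str) -> str:
    """Hypothetical critic call: name the most likely failure mode in the response."""
    return "claim X in the response may be unsupported"


def verify(flaw: str, response: str) -> bool:
    """Hypothetical external validator (fact checker, test suite, etc.)."""
    return False  # stub verdict: the claimed flaw is not confirmed


def rlac_step(prompt: str) -> Episode:
    """One adversarial round: generator writes, critic attacks, validator adjudicates.

    Reward sketch (assumed): the generator is rewarded when no flaw is confirmed;
    the critic is rewarded when its proposed flaw is confirmed externally.
    """
    response = generate(prompt)
    flaw = propose_flaw(prompt, response)
    flaw_is_real = verify(flaw, response)

    generator_reward = 0.0 if flaw_is_real else 1.0
    critic_reward = 1.0 if flaw_is_real else 0.0

    # In a full pipeline these rewards would feed policy-gradient updates for
    # both models; this sketch only records the episode.
    _ = (generator_reward, critic_reward)
    return Episode(prompt, response, flaw, flaw_is_real)


if __name__ == "__main__":
    print(rlac_step("Summarize the history of the Riemann hypothesis."))
```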