Mian Wu

Modeling the World
Hi 👋
I'm a senior undergraduate student majoring in Electrical and Computer Engineering at Shanghai Jiao Tong University. I am currently a visiting student and a member of the Robotic Artificial Intelligence and Learning Lab at UC Berkeley, advised by Sergey Levine.
My recent research centers on scalable reinforcement learning with large language models and vision-language models. I am particularly interested in developing RL methods that improve the reasoning capabilities of agents and generalize to more complex free-form generation tasks.

Research

RLAC: Reinforcement Learning with Adversarial Critic for Free-Form Generation Tasks
Mian Wu, Gavin Zhang, Sewon Min, Sergey Levine, Aviral Kumar
arXiv preprint, 2025
We propose a post-training methodology for improving language models on open-ended generation tasks. Instead of relying on static reward models, RLAC trains a dynamic LLM critic alongside the generator through adversarial gameplay. The critic identifies the generator's most likely failure modes, which are then validated externally, and it adapts as the generator improves. We demonstrate improvements in factual accuracy for text generation and in correctness for code generation across multiple benchmarks.

Research Interests

Machine Learning, Natural Language Processing, Computer Vision, AI Safety, Reinforcement Learning

Education

B.S. in Electrical and Computer Engineering, Shanghai Jiao Tong University
2021 - 2024, 2025 - 2026

Art

I'm also a drummer who enjoys progressive, metal, and emo rock. I founded the band QaQqvq0.0.