Wenxuan Zhou

I am a Research Scientist at Meta GenAI building Large Language Models. I am broadly interested in sequential decision-making problems in AI and robotics.

I obtained my Ph.D. in Robotics from Carnegie Mellon University, advised by Prof. David Held. My thesis research focused on enabling robots to perform complex interactions using reinforcement learning. I have also spent time at Google DeepMind Robotics and FAIR Embodied AI.

GitHub  /  Google Scholar  /  LinkedIn  /  Twitter

Research
HACMan: Learning Hybrid Actor-Critic Maps for 6D Non-Prehensile Manipulation
Wenxuan Zhou, Bowen Jiang, Fan Yang, Chris Paxton*, David Held*
Conference on Robot Learning 2023 (Oral)

We propose a spatially-grounded and temporally-abstracted action representation with a hybrid discrete-continuous reinforcement learning framework.

Keywords: RL with 3D Vision, Action Representation, Contact-rich manipulation

[Paper] [Code] [Website]
Learning to Grasp the Ungraspable with Emergent Extrinsic Dexterity
Wenxuan Zhou, David Held
Conference on Robot Learning 2022 (Oral)
ICRA 2022 Workshop on Reinforcement Learning for Contact-Rich Manipulation
Press Coverage: IEEE Spectrum - Robots Grip Better When They Grip Smarter

We present a system that applies reinforcement learning to extrinsic dexterity in order to solve an occluded grasping task with a simple gripper.

Keywords: Contact-rich manipulation, Sim2Real

[Paper] [Code] [Website]
Forgetting and Imbalance in Robot Lifelong Learning with Off-policy Data
Wenxuan Zhou, Steven Bohez, Jan Humplik, Abbas Abdolmaleki, Dushyant Rao, Markus Wulfmeier, Tuomas Haarnoja, Nicolas Heess
Conference on Lifelong Learning Agents (CoLLAs) 2022

We identify two challenges, forgetting and imbalance, that arise from off-policy data in robot lifelong learning with non-stationary dynamics.

Keywords: Lifelong Learning, Offline RL, Off-Policy RL

[Paper] [Website]
Learning Off-Policy with Online Planning
Harshit Sikchi, Wenxuan Zhou, David Held
Conference on Robot Learning 2021 (Oral, Best Paper Finalist)

A novel instantiation of H-step lookahead policies with a learned model and a terminal value from a model-free off-policy algorithm.

Keywords: Model-Based RL, Model-Free RL

[Paper] [Code] [Website]
PLAS: Latent Action Space for Offline Reinforcement Learning
Wenxuan Zhou, Sujay Bajracharya, David Held
Conference on Robot Learning 2020 (Plenary Talk)

Learning a policy in the latent action space to naturally avoid out-of-distribution actions.

Keywords: Offline RL, Off-Policy RL, Deformable Object Manipulation

[Paper] [Code] [Website]
Lyapunov Barrier Policy Optimization
Harshit Sikchi, Wenxuan Zhou, David Held
NeurIPS Deep RL Workshop 2020

Safe reinforcement learning with a Lyapunov-based barrier function.

Keywords: Safe RL

[Paper] [Code]
EPI: Environment Probing Interaction Policies
Wenxuan Zhou, Lerrel Pinto, Abhinav Gupta
ICLR 2019

Learning to "probe" the environment before task execution.

Keywords: System Identification, Multi-Task RL

[Paper] [Code]

Last update: Mar 2024
Source