Wenxuan Zhou

I am a Ph.D. student at the Robotics Institute (RI) at Carnegie Mellon University, advised by Prof. David Held. I am also a Visiting Researcher at FAIR Pittsburgh, collaborating with Chris Paxton. My research goal is to equip robots with complex and intelligent behaviors through reinforcement learning.

I was a Research Scientist Intern at DeepMind with Team REAL and the robotics team during Summer 2021. I received my master's degree from RI, where I was mentored by Lerrel Pinto and advised by Prof. Abhinav Gupta. During my undergraduate studies, I worked with Prof. Gabor Orosz on ground robot experiments with connected cruise control. I also interned in the brake control systems group at ZF TRW.

Contact  /  GitHub  /  Google Scholar

Research
Learning to Grasp the Ungraspable with Emergent Extrinsic Dexterity
Wenxuan Zhou, David Held
ICRA 2022 Workshop on Reinforcement Learning for Contact-Rich Manipulation

We present a system that applies reinforcement learning to extrinsic dexterity and solves an occluded grasping task with a simple gripper.

#Manipulation #Sim2Real

[Paper] [Website]
Forgetting and Imbalance in Robot Lifelong Learning with Off-policy Data
Wenxuan Zhou, Steven Bohez, Jan Humplik, Abbas Abdolmaleki, Dushyant Rao, Markus Wulfmeier, Tuomas Haarnoja, Nicolas Heess
Conference on Lifelong Learning Agents (CoLLAs) 2022

We identify two challenges, forgetting and imbalance, that arise from off-policy data in robot lifelong learning with non-stationary dynamics.

#Lifelong_Learning #Offline_RL #Off_Policy_RL

[Paper] [Website]
Learning Off-Policy with Online Planning
Harshit Sikchi, Wenxuan Zhou, David Held
Conference on Robot Learning (CoRL) 2021 (Oral, Best Paper Finalist)

A novel instantiation of H-step lookahead policies with a learned model and a terminal value from a model-free off-policy algorithm.

#Model_Based_RL #Model_Free_RL

[Paper] [Code] [Website]
PLAS: Latent Action Space for Offline Reinforcement Learning
Wenxuan Zhou, Sujay Bajracharya, David Held
CoRL 2020 (Plenary Talk)

Learning a policy in a latent action space to naturally avoid out-of-distribution actions.

#Offline_RL #Off_Policy_RL #Cloth_Manipulation

[Paper] [Code] [Website]
Lyapunov Barrier Policy Optimization
Harshit Sikchi, Wenxuan Zhou, David Held
NeurIPS Deep RL Workshop 2020

Safe reinforcement learning with a Lyapunov-based barrier function.

#Safe_RL

[Paper]
EPI: Environment Probing Interaction Policies
Wenxuan Zhou, Lerrel Pinto, Abhinav Gupta
ICLR 2019

Learning to "probe" the environment before task execution.

#System_Identification #Environment_Generalization #Multi_Task_Learning

[Paper] [Code]

Last update: Aug 2022