Hi! I am a postdoc at UC Berkeley, where I work with Pieter Abbeel on developing intelligent agents
that can interact with the physical
world.
Previously, I was a Research Scientist at Dyson Robot Learning Lab working with Stephen James.
I received my PhD from KAIST, where I was advised by Jinwoo Shin.
During my PhD, I was a visiting scholar at UC Berkeley working with Pieter Abbeel and Kimin Lee, collaborated with Honglak Lee at the University of Michigan, and interned at
Microsoft Research Asia.
Feel free to send me an e-mail if you want to have a chat! Contact: mail AT younggyo.me
My research interest lies in reinforcement learning and representation learning, with the goal of
developing intelligent agents that can interact with the physical world and continually improve
their skills. To this end, my research has focused on designing (i) sample-efficient reinforcement
learning (RL) algorithms to enable agents to improve their skills and (ii) visual representation
learning methods from videos to allow agents to understand the world.
Papers on (i) sample-efficient RL algorithms are highlighted in blue and (ii) representation
learning methods are highlighted in red.
We present Coarse-to-fine Q-Network with Action Sequence (CQN-AS), a value-based RL
algorithm that trains a critic network to output Q-values over a sequence of actions.
We present Coarse-to-fine Reinforcement Learning (CRL), which trains RL agents to zoom into
the continuous action space in a coarse-to-fine manner. Within this framework, we present
Coarse-to-fine Q-Network (CQN), a
value-based RL algorithm for continuous control.
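The coarse-to-fine idea can be illustrated with a minimal numpy sketch (not the paper's implementation; `q_fn`, `n_bins`, and `n_levels` are placeholder names here): discretize the current interval into a few bins, pick the bin with the highest Q-value, then zoom into that bin and repeat.

```python
import numpy as np

def coarse_to_fine_action(q_fn, low, high, n_bins=3, n_levels=3):
    """Pick a continuous action by repeatedly discretizing the current
    interval [low, high] into n_bins, choosing the bin whose center has
    the highest Q-value, and zooming into that bin at the next level."""
    for _ in range(n_levels):
        # Midpoints of n_bins equal-width bins over [low, high].
        centers = np.linspace(low, high, n_bins * 2 + 1)[1::2]
        best = int(np.argmax([q_fn(c) for c in centers]))
        width = (high - low) / n_bins
        low = low + best * width
        high = low + width
    return (low + high) / 2.0
```

With L levels and B bins per level, this evaluates only L*B Q-values yet resolves the action to a precision of (high - low) / B^L, which is the appeal of the coarse-to-fine scheme over a single fine discretization.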
We present BiGym, a new benchmark and learning environment for mobile bi-manual demo-driven
robotic manipulation. BiGym consists of 40 diverse tasks in a home environment and provides
human-collected
demonstrations.
We present RSP, a framework for visual representation learning from videos that learns
representations capturing temporal information between frames by training a stochastic future
frame prediction model.
We introduce Render and Diffuse (R&D), a method that unifies low-level robot actions and RGB
observations within the image space using virtual renders of
the 3D model of the robot.
We introduce a new exploration technique that maximizes value-conditional state entropy, which
takes into account the value estimates of states when computing the intrinsic bonus.
We introduce RE3, an exploration technique for visual RL that utilizes a k-NN state entropy
estimate in the representation space of a randomly initialized & fixed CNN encoder.
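The core of RE3 can be sketched in a few lines of numpy (a simplified stand-in, not the paper's code: a fixed random linear projection replaces the random CNN, and `re3_bonus` is a placeholder name): embed observations with a frozen random encoder, then reward each state by the distance to its k-th nearest neighbor in that space.

```python
import numpy as np

rng = np.random.default_rng(0)

# Randomly initialized and fixed encoder (a linear projection standing
# in for the random CNN); it is never trained.
W = rng.normal(size=(64, 16))

def encode(obs):
    return obs @ W

def re3_bonus(obs_batch, k=3):
    """Intrinsic bonus: log distance to each state's k-th nearest
    neighbor in the fixed random representation space."""
    z = encode(obs_batch)
    dists = np.linalg.norm(z[:, None] - z[None, :], axis=-1)
    # Column k after sorting (column 0 is the self-distance of zero).
    knn = np.sort(dists, axis=1)[:, k]
    return np.log(knn + 1.0)
```

Because the encoder is never updated, the bonus requires no auxiliary representation-learning objective, which is what makes the estimate cheap to compute during training.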
We introduce T-MCL, which learns a multi-headed dynamics model in which each prediction head
specializes in environments with similar dynamics, effectively clustering environments.
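At test time, a multi-headed model of this kind can select the head that best explains recent transitions. A minimal sketch under assumed interfaces (`heads` as a list of per-head prediction functions; `select_head` is a placeholder name, not T-MCL's API):

```python
import numpy as np

def select_head(heads, s, a, s_next):
    """Pick the head whose prediction best matches the observed
    transition, i.e., the head specialized to the current
    environment's dynamics."""
    errors = [np.linalg.norm(h(s, a) - s_next) for h in heads]
    return int(np.argmin(errors))
```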
We introduce CaDM, which learns a context encoder that extracts contextual information from
recent history using self-supervised future and backward prediction losses.