Younggyo Seo

I am a Research Scientist at the Dyson Robot Learning Lab in London, UK. I received my Ph.D. in Artificial Intelligence from KAIST under the supervision of Jinwoo Shin. I have also been closely collaborating with the Robot Learning Lab at UC Berkeley, where I was fortunate to be advised by Kimin Lee and Pieter Abbeel.

Previously, I collaborated with Honglak Lee at the University of Michigan. I also interned in the Deep and Reinforcement Learning Group at Microsoft Research Asia, hosted by Tao Qin. Before that, I graduated with a B.A. in Economics from Seoul National University.

Feel free to send me an e-mail if you would like to chat or collaborate!

Email  /  Google Scholar  /  Twitter  /  CV


July '23  

I will attend RSS 2023 in person to present two workshop papers (MV-MWM and M3L). Please let me know if you would like to meet or chat there!

June '23  

I successfully defended my PhD thesis!

May '23  

Multi-View Masked World Models has been accepted to ICML 2023!

Oct '22  

Masked World Models has been accepted to CoRL 2022. I'm also organizing a workshop on Pre-training Robot Learning at CoRL 2022; hope to see you there!

My research interest lies in reinforcement learning, computer vision, and robot learning. In particular, I focus on developing algorithms to train visual control agents by learning world models that understand how the world works. Relevant topics include (i) pre-training for reinforcement learning, (ii) self-supervised representation learning from images and videos for world model learning, and (iii) improving the dynamics generalization of world models.
I am also broadly interested in diverse areas of reinforcement learning and imitation learning, such as exploration for RL, hierarchical RL, and preference-based RL. Representative papers are highlighted.


Accelerating Reinforcement Learning with Value-Conditional State Entropy Exploration
Dongyoung Kim, Jinwoo Shin, Pieter Abbeel, Younggyo Seo
Neural Information Processing Systems (NeurIPS), 2023.
Paper / Website / Code

We introduce a new exploration technique that maximizes value-conditional state entropy, which takes the value estimates of states into account when computing the intrinsic bonus.

Guide Your Agents with Adaptive Multimodal Rewards
Changyeon Kim, Younggyo Seo, Hao Liu, Lisa Lee, Honglak Lee, Jinwoo Shin, Kimin Lee
Neural Information Processing Systems (NeurIPS), 2023.
ICML Workshop on New Frontiers in Learning, Control, and Dynamical Systems, 2023.
Paper / Website / Code

We present ARP (Adaptive Return-conditioned Policy), which utilizes an adaptive multimodal reward signal for behavior learning.

Multi-View Masked World Models for Visual Robotic Manipulation
Younggyo Seo*, Junsu Kim*, Stephen James, Kimin Lee, Jinwoo Shin, Pieter Abbeel
International Conference on Machine Learning (ICML), 2023.
RSS Workshop on Experiment-oriented Locomotion and Manipulation Research, 2023 (Spotlight presentation).
Paper / Website / Code

We introduce MV-MWM, which learns multi-view representations via masked view reconstruction and utilizes them for visual robotic manipulation.

Language Reward Modulation for Pretraining Reinforcement Learning
Ademi Adeniji, Amber Xie, Carmelo Sferrazza, Younggyo Seo, Stephen James, Pieter Abbeel
Preprint, 2023.
Paper / Code

We introduce LAMP, a method for pretraining RL agents using a multimodal reward signal from video-language models.

The Power of the Senses: Generalizable Manipulation from Vision and Touch through Masked Multimodal Learning
Carmelo Sferrazza, Younggyo Seo, Hao Liu, Youngwoon Lee, Pieter Abbeel
RSS Workshop on Interdisciplinary Exploration of Generalizable Manipulation Policy Learning: Paradigms and Debates, 2023
Paper will be ready soon

We introduce M3L, which learns joint representations of vision and touch with masked autoencoding.

Imitating Graph-Based Planning with Goal-Conditioned Policies
Junsu Kim, Younggyo Seo, Sungsoo Ahn, Kyunghwan Son, Jinwoo Shin
International Conference on Learning Representations (ICLR), 2023.
Paper / Code

We introduce PIG, a simple yet effective self-imitation scheme that distills a subgoal-conditioned policy into the target-goal-conditioned policy.

Masked World Models for Visual Control
Younggyo Seo, Danijar Hafner, Hao Liu, Fangchen Liu, Stephen James, Kimin Lee, Pieter Abbeel
Conference on Robot Learning (CoRL), 2022.
Paper / Website / Code

We introduce MWM, which learns a latent dynamics model on top of an autoencoder trained with convolutional feature masking and reward prediction.
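As a toy illustration of the feature-masking idea (zeroing positions stands in for the paper's masking scheme, and the surrounding autoencoder and reward head are omitted):

```python
import torch

def mask_conv_features(feat: torch.Tensor, mask_ratio: float = 0.75) -> torch.Tensor:
    """Randomly mask spatial positions of an early convolutional feature map
    (rather than raw pixels) before the autoencoder reconstructs the frame.

    feat: (B, C, H, W) feature map from the convolutional stem.
    """
    B, _, H, W = feat.shape
    keep = (torch.rand(B, 1, H, W, device=feat.device) > mask_ratio).float()
    return feat * keep  # masked positions carry no information downstream
```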

Dynamics-Augmented Decision Transformer for Offline Dynamics Generalization
Changyeon Kim*, Junsu Kim*, Younggyo Seo, Kimin Lee, Honglak Lee, Jinwoo Shin
NeurIPS Workshop on Offline Reinforcement Learning, 2022.
Paper

We introduce DADT, which improves the dynamics generalization of Decision Transformer with a next-state prediction objective.

Reinforcement Learning with Action-Free Pre-Training from Videos
Younggyo Seo, Kimin Lee, Stephen James, Pieter Abbeel
International Conference on Machine Learning (ICML), 2022.
Paper / Website / Code

We introduce APV, which leverages diverse videos from different domains for pre-training to improve sample efficiency.

HARP: Autoregressive Latent Video Prediction with High-Fidelity Image Generator
Younggyo Seo, Kimin Lee, Fangchen Liu, Stephen James, Pieter Abbeel
International Conference on Image Processing (ICIP), 2022.
Paper

We introduce a video prediction model that can generate 256x256 frames by training an autoregressive transformer on top of VQ-GAN.

SURF: Semi-supervised Reward Learning with Data Augmentation for Feedback-efficient Preference-based Reinforcement Learning
Jongjin Park, Younggyo Seo, Jinwoo Shin, Honglak Lee, Pieter Abbeel, Kimin Lee
International Conference on Learning Representations (ICLR), 2022.
Paper / Code

We introduce semi-supervised learning and temporal data augmentation for improving the feedback efficiency of preference-based RL.
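A minimal sketch of the temporal cropping augmentation, assuming equal-length segment pairs labeled with a preference (the function name and crop bounds are illustrative):

```python
import numpy as np

def temporal_crop(seg0: np.ndarray, seg1: np.ndarray,
                  min_len: int = 35, max_len: int = 45):
    """Crop a random sub-segment of the same length from each segment of a
    preference pair; the pair keeps its original preference label.

    seg0, seg1: (T, obs_dim) trajectory segments of equal length T.
    """
    T = len(seg0)
    crop_len = np.random.randint(min_len, min(max_len, T) + 1)
    s0 = np.random.randint(0, T - crop_len + 1)  # independent start indices
    s1 = np.random.randint(0, T - crop_len + 1)
    return seg0[s0:s0 + crop_len], seg1[s1:s1 + crop_len]
```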

Object-Aware Regularization for Addressing Causal Confusion in Imitation Learning
Jongjin Park*, Younggyo Seo*, Chang Liu, Li Zhao, Tao Qin, Jinwoo Shin, Tie-Yan Liu
Neural Information Processing Systems (NeurIPS), 2021.
Paper / Article / Code

We introduce OREO, a regularization technique for behavior cloning from pixels.

Landmark-Guided Subgoal Generation in Hierarchical Reinforcement Learning
Junsu Kim, Younggyo Seo, Jinwoo Shin
Neural Information Processing Systems (NeurIPS), 2021.
Paper / Code

We introduce HIGL, a goal-conditioned hierarchical RL method that samples landmarks and utilizes them for guiding the training of a high-level policy.

Offline-to-Online Reinforcement Learning via Balanced Replay and Pessimistic Q-Ensemble
Seunghyun Lee*, Younggyo Seo*, Kimin Lee, Pieter Abbeel, Jinwoo Shin
Conference on Robot Learning (CoRL), 2021.
NeurIPS Workshop on Offline Reinforcement Learning, 2020 (Oral presentation).
Paper / Code

We introduce an offline-to-online RL algorithm to address the distribution shift that arises during the transition between offline RL and online RL.

State Entropy Maximization with Random Encoders for Efficient Exploration
Younggyo Seo*, Lili Chen*, Jinwoo Shin, Honglak Lee, Pieter Abbeel, Kimin Lee
International Conference on Machine Learning (ICML), 2021.
Paper / Website / Code

We introduce RE3, an exploration technique for visual RL that utilizes a k-NN state entropy estimate in the representation space of a randomly initialized and fixed CNN encoder.
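As a rough illustration, here is a minimal sketch of the RE3-style k-NN entropy bonus, assuming a batch of embeddings already produced by the fixed random encoder (the batch source and the choice of k are illustrative):

```python
import torch

def knn_entropy_bonus(features: torch.Tensor, k: int = 3) -> torch.Tensor:
    """Intrinsic reward proportional to the distance to the k-th nearest
    neighbor in the representation space of the fixed random encoder.

    features: (N, D) batch of random-encoder embeddings (assumed N > k).
    Returns:  (N,) intrinsic rewards, log(dist_k + 1).
    """
    dists = torch.cdist(features, features)  # (N, N) pairwise L2 distances
    # The k smallest distances per row include the zero self-distance, so take k+1
    knn_dist = dists.topk(k + 1, largest=False).values[:, -1]
    return torch.log(knn_dist + 1.0)
```

Because the encoder is never trained, the bonus requires no extra representation-learning objective.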

Trajectory-wise Multiple Choice Learning for Dynamics Generalization in Reinforcement Learning
Younggyo Seo*, Kimin Lee*, Ignasi Clavera, Thanard Kurutach, Jinwoo Shin, Pieter Abbeel
Neural Information Processing Systems (NeurIPS), 2020.
Paper / Website / Code

We introduce T-MCL, which learns a multi-headed dynamics model in which each prediction head specializes in environments with similar dynamics, effectively clustering environments.
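The sketch below illustrates the winner-take-all flavor of multiple choice learning behind this specialization; it is simplified to pick the best head per batch, whereas the trajectory-wise variant in the paper assigns heads per trajectory:

```python
import torch

def mcl_loss(pred_heads: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Winner-take-all loss: only the most accurate prediction head receives
    gradients, so heads gradually specialize on different dynamics.

    pred_heads: (num_heads, B, state_dim) next-state predictions, one per head.
    target:     (B, state_dim) ground-truth next states.
    """
    errors = ((pred_heads - target.unsqueeze(0)) ** 2).mean(dim=(1, 2))  # per-head MSE
    return errors[errors.argmin()]  # backprop only through the winning head
```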

Context-aware Dynamics Model for Generalization in Model-Based Reinforcement Learning
Kimin Lee*, Younggyo Seo*, Seunghyun Lee, Honglak Lee, Jinwoo Shin
International Conference on Machine Learning (ICML), 2020.
Paper / Website / Code

We introduce CaDM, which learns a context encoder that extracts contextual information from recent history using self-supervised future and backward prediction losses.

Learning What to Defer for Maximum Independent Sets
Sungsoo Ahn, Younggyo Seo, Jinwoo Shin
International Conference on Machine Learning (ICML), 2020.
Paper / Code

We introduce LwD, a deep RL framework for the maximum independent set problem.


Research Scientist | Dyson Robot Learning Lab
Aug 2023 - Present

Dyson Robot Learning Lab

Visiting Student / Collaborator | UC Berkeley
June 2021 - Present

Berkeley Robot Learning Lab (working with Pieter Abbeel)

Research Intern | Microsoft Research Asia
Dec 2020 - May 2021

Deep and Reinforcement Learning Group (working with Chang Liu, Li Zhao, and Tao Qin)


Ph.D. in Artificial Intelligence | KAIST
Sep 2019 - Aug 2023

B.A. in Economics | Seoul National University
Mar 2012 - Feb 2019

  • Summa Cum Laude
  • Leave of absence for mandatory military service: Feb 2014 - Feb 2016
  • Big Data Fintech Expert Program (Jan 2019 - July 2019)


  • Top Reviewer Award (top 10%), ICML 2021, 2022
  • AI/CS/EE Rising Stars Award, Google Explore Computer Science Research, 2022
  • Best Paper Award, Korean Artificial Intelligence Association, 2021
  • Summa Cum Laude, Economics department, Seoul National University, 2019
  • National Humanities Scholarship, Korea Student Aid Foundation, 2012-2018
  • 1st Rank @ College Scholastic Ability Test with perfect score (500/500), 2011


Website templates from here and here.