Cambridge CB2 1PZ
My name is Stephen Chung. I am a PhD student at the University of Cambridge, supervised by David Krueger. My primary research interests include reinforcement learning (RL), biologically-inspired machine learning, and AI alignment.
My PhD focuses on building AI that can reason and plan like humans. When faced with an unfamiliar situation, we may think of several possible actions and simulate the future that follows each one (e.g., what will happen if I hit the tennis ball from the left or from the right?), which allows us to choose the action with the best result. In familiar situations like driving home, however, we may rely solely on our habits without overthinking. This distinction suggests that planning should be a flexible and learnable process instead of a fixed one. I am currently studying how to build AI that learns this planning process by interacting with the environment, and how such learnable self-interaction may yield more powerful cognitive capabilities such as reasoning, dreaming, and thinking. I also argue that explicitly teaching an AI to plan, instead of relying on it to learn to plan within a large neural network in a black-box manner (as is done in current large language models), is safer because we have more control over the planning process. You can find details about this research here.
Before coming to Cambridge, I graduated from the University of Massachusetts Amherst with a master’s degree in 2021. During my master’s studies, I was supervised by Andrew Barto and studied methods, based on coagent networks, for efficiently training a deep neural network without backpropagation.
As for my interests, I love reading Western and Chinese philosophy, such as the works of Zhuangzi and Nietzsche. I enjoy thinking about the world and philosophical questions. I also like playing tennis and hiking!
Thinker: Learning to Plan and Act. In Advances in Neural Information Processing Systems, 2023
Learning by competition of self-interested reinforcement learning agents. In Proceedings of the AAAI Conference on Artificial Intelligence, 2022
MAP Propagation Algorithm: Faster Learning with a Team of Reinforcement Learning Agents. In Advances in Neural Information Processing Systems, 2021