About Yao Fu
I am a research scientist at Google DeepMind.
I did my Ph.D. at the University of Edinburgh (2020-2024) with Professor Mirella Lapata. I finished my M.S. at Columbia University (2018-2020) with Professor John Cunningham and my B.S. at Peking University (2013-2018) with Professor Yansong Feng. Before my Ph.D., I spent a great time visiting Professor Alexander Rush at Cornell Tech (2019-2020).
During my Ph.D., I developed methods for complex reasoning, such as complexity-based prompting and question decomposition; made smaller models reason better through CoT specialization; and studied self-play multi-agent debate, as in GPT-Bargaining. In the early days, I wrote blog posts on the connection between code and reasoning. I also studied long-context continual pretraining and efficient deployment recipes, and identified retrieval heads that mechanistically explain long-context factuality.
I am interested in large-scale generative models for human intelligence. My research objective is to make large multimodal models the next generation of computational platforms and generally capable agents. I am broadly interested in scaling, long context, multimodality, reasoning, and efficiency.
Featured Research
Arxiv 2024 | Retrieval Head Mechanistically Explains Long-Context Factuality [code][paper][Twitter/X]
- Wenhao Wu, Yizhong Wang, Guangxuan Xiao, Hao Peng and Yao Fu
- A systematic investigation of a wide range of models reveals the existence of retrieval heads, a special type of attention head that accounts for long-context factuality.
ICML 2024 | Data Engineering for Scaling Language Models to 128K Context [code][paper]
- Yao Fu, Rameswar Panda, Xinyao Niu, Xiang Yue, Hannaneh Hajishirzi, Yoon Kim and Hao Peng
- An effective and affordable recipe for training language models to 128K context; the key is to continually pretrain the full-attention model on 5B tokens of per-source length-upsampled data.
- The first open-sourced model matching GPT-4 128K performance on Needle-in-a-Haystack.
Arxiv 2023 | Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback [code][paper]
- Yao Fu, Hao Peng, Tushar Khot, and Mirella Lapata
- Two language models negotiate with each other and continuously improve their negotiation strategies by multi-round game playing and iterative in-context learning from AI feedback.
ICML 2023 | Specializing Smaller Language Models towards Multi-Step Reasoning [code][paper]
- Yao Fu, Hao Peng, Litu Ou, Ashish Sabharwal, and Tushar Khot
- Trading a language model's generic abilities for specialized math chain-of-thought ability.
ICLR 2023 | Complexity-Based Prompting for Multi-Step Reasoning [code][paper]
- Yao Fu, Hao Peng, Ashish Sabharwal, Peter Clark and Tushar Khot
- State-of-the-art reasoning performance on math word problems by prompting GPT-3 with instances of complex reasoning chains.