Hi! I am Tianyi :)
I conduct machine learning research on AI safety and alignment, with a special focus on moral values in AI systems.
I am currently a junior at Peking University, where I am a member of the Turing Class in Computer Science.
My answer to the Hamming question (“What are the most important problems [that you should probably work on]?”)
How do we prevent premature lock-in of our current moral values into AI systems, and leave room for progress?
How do we combine theory and experimental validation to help resolve fundamental disagreements in the field of AI safety and alignment (e.g., those on misgeneralization and deceptive alignment), much as physicists resolve theirs?
How do we discover currently neglected challenges facing AI safety and alignment?
I strive to become a “full-stack researcher”, hoping to combine experimental, mathematical, and conceptual analysis to approach these problems. I have just started, and there is definitely a long, long way to go.
Project: Generalization analysis in alignment training via induced Bayesian network (IBN)
It is well known that classical generalization analysis fails on deep neural nets without prohibitively strong assumptions. This project develops an alternative: an empirically grounded model of generalization in RLHF that can derive formal generalization bounds while taking fine-grained information structures into account. We call this model the induced Bayesian network (IBN). From it, we derive and implement a simple algorithm for LLM reward modeling, which we experimentally demonstrate to be advantageous.
I served as project lead, lead theory author, and LLM experimentation contributor in this project.
Rethinking Information Structures in RLHF: Reward Generalization from a Graph Theory Perspective
Tianyi Qiu§*, Fanzhi Zeng*, Jiaming Ji*, Dong Yan*, Kaile Wang, Jiayi Zhou, Han Yang, Juntao Dai, Xuehai Pan, Yaodong Yang (preprint) (§Project lead and lead theory author, *Equal technical contribution)
Project: Surveying the AI safety & alignment field
Since early 2023, when the alignment field entered a period of rapid growth, no comprehensive review article had surveyed the field. We therefore conducted a review that aims to be as comprehensive as possible, while constructing a unified framework (the alignment cycle). We emphasize the alignment of both contemporary AI systems and more advanced systems that pose more serious risks. Since its publication, the survey has been cited by a NIST report (among many others) and featured in various high-profile venues in China and Singapore.
I co-led this project.
AI Alignment: A Comprehensive Survey
Jiaming Ji*, Tianyi Qiu*, Boyuan Chen*, Borong Zhang*, Hantao Lou, Kaile Wang, Yawen Duan, Zhonghao He, Jiayi Zhou, Zhaowei Zhang, Fanzhi Zeng, Kwan Yee Ng, Juntao Dai, Xuehai Pan, Aidan O'Gara, Yingshan Lei, Hua Xu, Brian Tse, Jie Fu, Stephen McAleer, Yaodong Yang, Yizhou Wang, Song-Chun Zhu, Yike Guo, Wen Gao (preprint) (*Equal contribution)
You can head to my Google Scholar profile to view the stats!
Aug 2020: Won a gold medal (ranked #26) in the Chinese National Olympiad in Informatics 2020
Mar 2021: Started as a visiting student at Peking University (that's why I'm already close to completing all my credits!)
Nov 2021: Started reading and thinking a lot about AI safety/alignment
Sept 2022: Officially started as an undergraduate student at Peking University, as a member of the Turing Class
June 2023: Started working with the PKU Alignment and Interaction Research Lab, advised by Prof. Yaodong Yang
June 2024 (est.): Start as a research intern at the Center for Human-Compatible AI, UC Berkeley, co-advised by Micah and Cam
Sept 2024 (est.): Start as an exchange student at the University of California, via the UCEAP reciprocity program with PKU
June 2026 (est.): Graduation, and hopefully starting my PhD :)
Please feel free to reach out! If you are on the fence about getting in touch, consider yourself encouraged to do so :)
I can be reached at qiutianyi.qty@gmail.com, or on Twitter via the handle @Tianyi_Alex_Qiu.