Bowen Yu / 郁博文

Bowen Yu is an Algorithm Expert at Qwen, Alibaba Group. He currently leads Qwen's post-training research and the development of the Qwen-Instruct models. He earned his Ph.D. from the Institute of Information Engineering, Chinese Academy of Sciences, in 2022, under the supervision of Professors Tingwen Liu and Bin Wang. His research focuses primarily on the automated alignment of large language models. He has published papers in top-tier conferences and journals, including ICML, WWW, SIGIR, ACL, EMNLP, AAAI, TACL, and TOIS.

Block C, Greenland Center, Chaoyang District, Beijing, China

 yubowen.ph[at]gmail.com

Google Scholar  /  DBLP


Professional Services


Honors and Awards


Selected Publications

— 2024 —

Wider and Deeper LLM Networks are Fairer LLM Evaluators
Xinghua Zhang, Bowen Yu*, Haiyang Yu, Yangyu Lv, Tingwen Liu*, Fei Huang, Hongbo Xu and Yongbin Li
Transactions of the Association for Computational Linguistics

A Unified View of Delta Parameter Editing in Post-Trained Large-Scale Models
Qiaoyu Tang, Le Yu, Bowen Yu*, Hongyu Lin, Keming Lu, Yaojie Lu, Xianpei Han, Le Sun

Rethinking Data Selection at Scale: Random Selection is Almost All You Need
Tingyu Xia, Bowen Yu*, Kai Dang, An Yang, Yuan Wu, Yuan Tian, Yi Chang and Junyang Lin

Qwen2.5-Coder Technical Report
Binyuan Hui, Jian Yang, Zeyu Cui, Jiaxi Yang, Dayiheng Liu, Lei Zhang, Tianyu Liu, Jiajun Zhang, Bowen Yu, Kai Dang, An Yang, Rui Men, Fei Huang, Xingzhang Ren, Xuancheng Ren, Jingren Zhou and Junyang Lin

Qwen2.5-Math Technical Report: Toward Mathematical Expert Model via Self-Improvement
An Yang, Beichen Zhang, Binyuan Hui, Bofei Gao, Bowen Yu*, Chengpeng Li, Dayiheng Liu, Jianhong Tu, Jingren Zhou, Junyang Lin, Keming Lu, Mingfeng Xue, Runji Lin, Tianyu Liu, Xingzhang Ren and Zhenru Zhang

Towards a Unified View of Preference Learning for Large Language Models: A Survey
Bofei Gao, Feifan Song, Yibo Miao, Zefan Cai, Zhe Yang, Liang Chen, Helan Hu, Runxin Xu, Qingxiu Dong, Ce Zheng, Wen Xiao, Ge Zhang, Daoguang Zan, Keming Lu, Bowen Yu, Dayiheng Liu, Zeyu Cui, Jian Yang, Lei Sha, Houfeng Wang, Zhifang Sui, Peiyi Wang, Tianyu Liu and Baobao Chang

Predicting Rewards Alongside Tokens: Non-disruptive Parameter Insertion for Efficient Inference Intervention in Large Language Model
Chenhan Yuan, Fei Huang, Ru Peng, Keming Lu, Bowen Yu, Chang Zhou, Jingren Zhou

Extend Model Merging from Fine-Tuned to Pre-Trained Large Language Models via Weight Disentanglement
Le Yu, Bowen Yu*, Haiyang Yu, Fei Huang and Yongbin Li

Qwen2 Technical Report
An Yang, Baosong Yang, Binyuan Hui, Bo Zheng, Bowen Yu, Chang Zhou, Chengpeng Li, Chengyuan Li, Dayiheng Liu, Fei Huang, Guanting Dong, Haoran Wei, Huan Lin, Jialong Tang, Jialin Wang, Jian Yang, Jianhong Tu, Jianwei Zhang, Jianxin Ma, Jin Xu, Jingren Zhou, Jinze Bai, Jinzheng He, Junyang Lin, Kai Dang, Keming Lu, Keqin Chen, Kexin Yang, Mei Li, Mingfeng Xue, Na Ni, Pei Zhang, Peng Wang, Ru Peng, Rui Men, Ruize Gao, Runji Lin, Shijie Wang, Shuai Bai, Sinan Tan, Tianhang Zhu, Tianhao Li, Tianyu Liu, Wenbin Ge, Xiaodong Deng, Xiaohuan Zhou, Xingzhang Ren, Xinyu Zhang, Xipin Wei, Xuancheng Ren, Yang Fan, Yang Yao, Yichang Zhang, Yu Wan, Yunfei Chu, Yuqiong Liu, Zeyu Cui, Zhenru Zhang and Zhihao Fan

Self-play with Execution Feedback: Improving Instruction-following Capabilities of Large Language Models
Guanting Dong, Keming Lu, Chengpeng Li, Tingyu Xia, Bowen Yu*, Chang Zhou and Jingren Zhou

Towards Scalable Automated Alignment of LLMs: A Survey
Boxi Cao, Keming Lu, Xinyu Lu, Jiawei Chen, Mengjie Ren, Hao Xiang, Peilin Liu, Yaojie Lu, Ben He, Xianpei Han, Le Sun, Hongyu Lin and Bowen Yu*

Online Merging Optimizers for Boosting Rewards and Mitigating Tax in Alignment
Keming Lu, Bowen Yu, Fei Huang, Yang Fan, Runji Lin and Chang Zhou

Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch
Le Yu, Bowen Yu*, Haiyang Yu, Fei Huang and Yongbin Li
The Forty-first International Conference on Machine Learning (ICML 2024), Vienna, Austria, July 21–27, 2024.

Language Models can Evaluate Themselves via Probability Discrepancy
Tingyu Xia, Bowen Yu*, Yuan Wu, Yi Chang and Chang Zhou
The 62nd Annual Meeting of the Association for Computational Linguistics (Findings of ACL 2024), Bangkok, Thailand, August 11–16, 2024.

SoFA: Shielded On-the-fly Alignment via Priority Rule Following
Xinyu Lu, Bowen Yu*, Yaojie Lu, Hongyu Lin, Haiyang Yu, Le Sun, Xianpei Han and Yongbin Li
The 62nd Annual Meeting of the Association for Computational Linguistics (Findings of ACL 2024), Bangkok, Thailand, August 11–16, 2024.

Large Language Models are Superpositions of All Characters: Attaining Arbitrary Role-play via Self-Alignment
Keming Lu, Bowen Yu, Chang Zhou and Jingren Zhou
The 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024), Bangkok, Thailand, August 11–16, 2024.

Cross-domain NER under a Divide-and-Transfer Paradigm
Xinghua Zhang, Bowen Yu, Xin Cong, Taoyu Su, Quangang Li, Tingwen Liu and Hongbo Xu
ACM Transactions on Information Systems

TRUE-UIE: Two Universal Relations Unify Information Extraction Tasks
Yucheng Wang, Bowen Yu, Yilin Liu and Shudong Lu
The 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2024), Mexico City, Mexico, June 16–21, 2024.

Scaling Data Diversity for Fine-Tuning Language Models in Human Alignment
Feifan Song, Bowen Yu*, Hao Lang, Haiyang Yu, Fei Huang, Houfeng Wang and Yongbin Li
The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (COLING 2024), Torino, Italia, May 20–25, 2024.

A Preliminary Study of the Intrinsic Relationship between Complexity and Alignment
Yingxiu Zhao, Bowen Yu*, Binyuan Hui, Haiyang Yu, Fei Huang, Yongbin Li* and Nevin L. Zhang
The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (COLING 2024), Torino, Italia, May 20–25, 2024.

Preference Ranking Optimization for Human Alignment
Feifan Song, Bowen Yu*, Minghao Li, Haiyang Yu, Fei Huang, Yongbin Li and Houfeng Wang*
The 38th Annual AAAI Conference on Artificial Intelligence (AAAI 2024), Vancouver, Canada, February 20–27, 2024.

— 2023 —

Qwen Technical Report
Jinze Bai, Shuai Bai, Yunfei Chu, Zeyu Cui, Kai Dang, Xiaodong Deng, Yang Fan, Wenbin Ge, Yu Han, Fei Huang, Binyuan Hui, Luo Ji, Mei Li, Junyang Lin, Runji Lin, Dayiheng Liu, Gao Liu, Chengqiang Lu, Keming Lu, Jianxin Ma, Rui Men, Xingzhang Ren, Xuancheng Ren, Chuanqi Tan, Sinan Tan, Jianhong Tu, Peng Wang, Shijie Wang, Wei Wang, Shengguang Wu, Benfeng Xu, Jin Xu, An Yang, Hao Yang, Jian Yang, Shusheng Yang, Yang Yao, Bowen Yu, Hongyi Yuan, Zheng Yuan, Jianwei Zhang, Xingxuan Zhang, Yichang Zhang, Zhenru Zhang, Chang Zhou, Jingren Zhou, Xiaohuan Zhou and Tianhang Zhu (Authors are ordered alphabetically by the last name.)

PolyLM: An Open Source Polyglot Large Language Model
Xiangpeng Wei, Haoran Wei, Huan Lin, Tianhao Li, Pei Zhang, Xingzhang Ren, Mei Li, Yu Wan, Zhiwei Cao, Binbin Xie, Tianxiang Hu, Shangjie Li, Binyuan Hui, Bowen Yu, Dayiheng Liu, Baosong Yang, Fei Huang and Jun Xie

API-Bank: A Benchmark for Tool-augmented LLMs
Minghao Li, Yingxiu Zhao, Bowen Yu*, Feifan Song, Hangyu Li, Haiyang Yu, Zhoujun Li, Fei Huang and Yongbin Li
The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023), Singapore, December 6–10, 2023.

Causal Document-Grounded Dialogue Pre-training
Yingxiu Zhao, Bowen Yu*, Haiyang Yu, Bowen Li, Chao Wang, Fei Huang, Yongbin Li, Nevin L Zhang
The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023), Singapore, December 6–10, 2023.

Diversify Question Generation with Retrieval-Augmented Style Transfer
Qi Gou, Zehua Xia, Bowen Yu, Haiyang Yu, Fei Huang, Yongbin Li and Nguyen Cam-Tu
The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023), Singapore, December 6–10, 2023.

Improving Question Generation with Multi-level Content Planning
Qi Gou, Zehua Xia, Bowen Yu, Haiyang Yu, Fei Huang, Yongbin Li and Nguyen Cam-Tu
The 2023 Conference on Empirical Methods in Natural Language Processing (Findings of EMNLP 2023), Singapore, December 6–10, 2023.

Unified Language Representation for Question Answering over Text, Tables, and Images
Bowen Yu, Cheng Fu, Haiyang Yu, Fei Huang and Yongbin Li
The 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023), Toronto, July 9-14, 2023.

Universal Information Extraction with Meta-Pretrained Self-Retrieval
Xin Cong, Bowen Yu*, Mengcheng Fang, Tingwen Liu, Haiyang Yu, Zhongkai Hu, Fei Huang, Yongbin Li and Bin Wang
The 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023), Toronto, July 9-14, 2023.

Domain Incremental Lifelong Learning in an Open World
Yi Dai, Hao Lang, Yinhe Zheng, Bowen Yu, Fei Huang and Yongbin Li
The 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023), Toronto, July 9-14, 2023.

Representation and Labeling Gap Bridging for Cross-lingual Named Entity Recognition
Xinghua Zhang, Bowen Yu, Jiangxia Cao, Quangang Li, Xuebin Wang, Tingwen Liu and Hongbo Xu
The 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2023), Taipei, July 23-27, 2023.

Learning Structural Co-occurrences for Structured Web Data Extraction in Low-Resource Settings
Zhenyu Zhang, Bowen Yu, Tingwen Liu, Tianyun Liu, Yubin Wang, and Li Guo
The Web Conference 2023 (WWW 2023), Austin, April 30 to May 4, 2023.

Label-Aware Chinese Event Detection with Heterogeneous Graph Attention Networks
Shiyao Cui, Bowen Yu, Xin Cong, Tingwen Liu, Qingfeng Tan and Jinqiao Shi
Journal of Computer Science and Technology

Towards Universal Cross-Domain Recommendation
Jiangxia Cao, Shaoshuai Li, Bowen Yu, Xiaobo Guo, Tingwen Liu and Bin Wang
The 16th ACM International WSDM Conference (WSDM 2023), Singapore, February 27 to March 3 2023.

— 2022 —

Towards Generalized Open Information Extraction
Bowen Yu, Zhenyu Zhang, Jingyang Li, Haiyang Yu, Tingwen Liu, Jian Sun, Yongbin Li and Bin Wang
The 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022), Abu Dhabi, December 7-11 2022.

Enhancing Joint Multiple Intent Detection and Slot Filling with Global Intent-Slot Co-occurrence
Mengxiao Song, Bowen Yu, Li Quangang, Wang Yubin, Tingwen Liu and Hongbo Xu
The 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022), Abu Dhabi, December 7-11 2022.

Semi-Supervised Lifelong Language Learning
Yingxiu Zhao, Yinhe Zheng, Bowen Yu, Zhiliang Tian, Dongkyu Lee, Jian Sun, Yongbin Li and Nevin L. Zhang
The 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022), Abu Dhabi, December 7-11 2022.

Prompt Conditioned VAE: Enhancing Generative Replay for Lifelong Learning in Task-Oriented Dialogue
Yingxiu Zhao, Yinhe Zheng, Zhiliang Tian, Chang Gao, Bowen Yu, Haiyang Yu, Yongbin Li, Jian Sun and Nevin L Zhang
The 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022), Abu Dhabi, December 7-11 2022.

Layout-Aware Information Extraction for Document-Grounded Dialogue: Dataset, Method and Demonstration
Zhenyu Zhang#, Bowen Yu#, Haiyang Yu, Tingwen Liu, Cheng Fu, Jingyang Li, Chengguang Tang and Jian Sun
The 30th ACM International Conference on Multimedia (MM 2022), Lisbon, October 10-14 2022.

A Survey on Neural Open Information Extraction: Current Status and Future Directions
Shaowen Zhou, Bowen Yu, Aixin Sun, Cheng Long, Jingyang Li, Haiyang Yu, Jian Sun and Yongbin Li
The 31st International Joint Conference on Artificial Intelligence (IJCAI 2022), Vienna, July 23-29 2022.
Acceptance rate: 38/209=18.2%

Exploring Modular Task Decomposition in Cross-domain Named Entity Recognition
Xinghua Zhang, Bowen Yu, Tingwen Liu, Yubin Wang, Taoyu Su and Hongbo Xu
The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2022), Madrid, July 11-15 2022.
Acceptance rate: 161/794=20.3%

Relation-Guided Few-Shot Relational Triple Extraction
Xin Cong, Jiawei Sheng, Shiyao Cui, Bowen Yu, Tingwen Liu and Bin Wang
The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2022), Madrid, July 11-15 2022.
Acceptance rate: 165/667=24.7%

Enhancing Chinese Pre-trained Language Model via Heterogeneous Linguistics Graph
Yanzeng Li#, Jiangxia Cao#, Xin Cong, Zhenyu Zhang, Bowen Yu, Hongsong Zhu, Tingwen Liu
The 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022), Dublin, May 22-27 2022.

Document-Level Event Extraction via Human-like Reading Process
Shiyao Cui, Xin Cong, Bowen Yu, Tingwen Liu, Yucheng Wang and Jinqiao Shi
2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2022), Virtual, May 7-13 2022.

— 2021 —

Maximal Clique Based Non-Autoregressive Open Information Extraction
Bowen Yu, Yucheng Wang, Tingwen Liu, Hongsong Zhu, Limin Sun and Bin Wang
The 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021), Online and in the Dominican Republic, November 7-11, 2021.

Improving Distantly-Supervised Named Entity Recognition with Self-Collaborative Denoising Learning
Xinghua Zhang, Bowen Yu, Tingwen Liu, Zhenyu Zhang, Jiawei Sheng, Xue Mengge and Hongbo Xu
The 2021 Conference on Empirical Methods in Natural Language Processing (Findings of EMNLP 2021), Online and in the Dominican Republic, November 7-11, 2021.

NA-Aware Machine Reading Comprehension for Document-Level Relation Extraction
Zhenyu Zhang, Bowen Yu, Xiaobo Shu, Tingwen Liu
The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2021), Online, September 13-17, 2021.

Discontinuous Named Entity Recognition as Maximal Clique Discovery
Yucheng Wang#, Bowen Yu#, Hongsong Zhu, Tingwen Liu, Nan Yu, Limin Sun
The 59th Annual Meeting of the Association for Computational Linguistics (ACL 2021), Bangkok, Thailand, August 1-6, 2021.

Few-Shot Event Detection with Prototypical Amortized Conditional Random Field
Xin Cong, Shiyao Cui, Bowen Yu, Tingwen Liu, Wang Yubin, Bin Wang
The 59th Annual Meeting of the Association for Computational Linguistics (Findings of ACL 2021), Bangkok, Thailand, August 1-6, 2021.

From What to Why: Improving Relation Extraction with Rationale Graph
Zhenyu Zhang, Bowen Yu, Xiaobo Shu, Xue Mengge, Tingwen Liu, Li Guo
The 59th Annual Meeting of the Association for Computational Linguistics (Findings of ACL 2021), Bangkok, Thailand, August 1-6, 2021.

CasEE: A Joint Learning Framework with Cascade Decoding for Overlapping Event Extraction
Jiawei Sheng, Shu Guo, Bowen Yu, Qian Li, Yiming Hei, Lihong Wang, Tingwen Liu, Hongbo Xu
The 59th Annual Meeting of the Association for Computational Linguistics (Findings of ACL 2021), Bangkok, Thailand, August 1-6, 2021.

FITAnnotator: A Flexible and Intelligent Text Annotation System (Demo Paper)
Yanzeng Li, Bowen Yu, Li Quangang, Tingwen Liu
2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2021), Online, June 6-11, 2021.

Semi-Open Information Extraction
Bowen Yu, Zhenyu Zhang, Jiawei Sheng, Tingwen Liu, Yubin Wang, Yucheng Wang and Bin Wang
The Web Conference 2021 (WWW 2021), Ljubljana, April 19-23, 2021.
Acceptance rate: 357/1736=20.6%

— 2020 —

Porous Lattice Transformer Encoder for Chinese NER
Xue Mengge, Bowen Yu, Tingwen Liu, Yue Zhang, Erli Meng and Bin Wang
The 28th International Conference on Computational Linguistics (COLING 2020), Barcelona, Spain, December 8-13, 2020.

Learning to Prune Dependency Trees with Rethinking for Neural Relation Extraction
Bowen Yu, Xue Mengge, Zhenyu Zhang, Tingwen Liu, Yubin Wang and Bin Wang
The 28th International Conference on Computational Linguistics (COLING 2020), Barcelona, Spain, December 8-13, 2020.

Document-level Relation Extraction with Dual-tier Heterogeneous Graph
Zhenyu Zhang, Bowen Yu, Xiaobo Shu, Tingwen Liu, Hengzhu Tang, Yubin Wang and Li Guo
The 28th International Conference on Computational Linguistics (COLING 2020), Barcelona, Spain, December 8-13, 2020.

TPLinker: Single-stage Joint Extraction of Entities and Relations Through Token Pair Linking
Yucheng Wang, Bowen Yu*, Yueyang Zhang, Tingwen Liu, Hongsong Zhu and Limin Sun
The 28th International Conference on Computational Linguistics (COLING 2020), Barcelona, Spain, December 8-13, 2020.

Coarse-to-Fine Pre-training for Named Entity Recognition
Xue Mengge, Bowen Yu, Zhenyu Zhang, Tingwen Liu, Yue Zhang and Bin Wang
The 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020), Online, November 16-20, 2020.
Acceptance rate: 754/3114=24.2%

Edge-Enhanced Graph Convolution Networks for Event Detection with Syntactic Relation
Shiyao Cui, Bowen Yu, Tingwen Liu, Zhenyu Zhang, Xuebin Wang and Jinqiao Shi
Findings of the 2020 Conference on Empirical Methods in Natural Language Processing (Findings of EMNLP 2020), Online, November 16-20, 2020.

Inductive Unsupervised Domain Adaptation for Few-Shot Classification via Clustering
Xin Cong, Bowen Yu, Tingwen Liu, Shiyao Cui, Hengzhu Tang, Bin Wang
The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2020), Ghent, Belgium, September 14-18, 2020.
Acceptance rate: 131/687=19.1%

Enhancing Pre-trained Chinese Character Representation with Word-aligned Attention (Short Paper)
Yanzeng Li, Bowen Yu, Mengge Xue, Tingwen Liu
The 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020), July 5-10, 2020.
Acceptance rate: 208/1185=17.6%

DRG2vec: Learning Word Representations from Definition Relational Graph
Xiaobo Shu, Bowen Yu, Zhenyu Zhang, Tingwen Liu
The 2020 International Joint Conference on Neural Networks (IJCNN 2020), Glasgow, UK, July 19–24, 2020.

Strong Baselines for Author Name Disambiguation with and without Neural Networks
Zhenyu Zhang, Bowen Yu, Tingwen Liu, Dong Wang
The 24th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2020), pages 369-381, Singapore, May 11-14, 2020.
Acceptance rate: 135/628=21.5%

Joint Extraction of Entities and Relations Based on a Novel Decomposition Strategy
Bowen Yu, Zhenyu Zhang, Xiaobo Shu, Tingwen Liu, Yubin Wang, Bin Wang and Sujian Li
The 24th European Conference on Artificial Intelligence (ECAI 2020), Santiago de Compostela, Spain, August 29-September 5, 2020.
Acceptance rate: 365/1363=26.8%

Distilling Knowledge from Well-informed Soft Labels for Neural Relation Extraction
Zhenyu Zhang, Xiaobo Shu, Bowen Yu, Tingwen Liu, Jiapeng Zhao, Quangang Li, Li Guo
The 34th AAAI Conference on Artificial Intelligence (AAAI 2020), pages 9620-9627, New York, USA, February 7-12, 2020.
Acceptance rate: 1591/7737=20.6%

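Aspect-Level Sentiment Classification Based on Graph Convolutional Memory Networks (基于图卷积记忆网络的方面级情感分类, in Chinese)
Guang Wang, Hongyu Li, Yunfei Qiu, Bowen Yu, Tingwen Liu
Journal of Chinese Information Processing (中文信息学报)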

— 2019 —

Beyond Word Attention: Using Segment Attention in Neural Relation Extraction
Bowen Yu, Zhenyu Zhang, Tingwen Liu, Bin Wang, Sujian Li and Quangang Li
The 28th International Joint Conference on Artificial Intelligence (IJCAI 2019), pages 5401-5407, Macao, China, August 10-16, 2019.
Acceptance rate: 850/4752=17.9%
