Academic Talk by Prof. Tsung-Hui Chang, The Chinese University of Hong Kong, Shenzhen
Published: 2024-06-21

Title: Aligning Model with Human Feedback: A Ranking based Zeroth-order Optimization Method

Speaker: Prof. Tsung-Hui Chang, The Chinese University of Hong Kong, Shenzhen

Time: Monday, June 24, 2024, 10:00 AM

Venue: Conference Room 2-3, Mathematics Building, Xingqing Campus

 

Abstract: In this study, we examine an emerging optimization challenge involving a black-box objective function that can only be assessed through a ranking oracle, a situation frequently encountered in real-world scenarios, especially when the function is evaluated by human judges. A prominent instance is Reinforcement Learning with Human Feedback (RLHF), an approach recently employed to enhance the performance of Large Language Models (LLMs) using human feedback. We introduce ZO-RankSGD, a zeroth-order optimization algorithm designed to tackle this problem, accompanied by theoretical guarantees. The algorithm uses a novel rank-based random estimator to determine the descent direction and is guaranteed to converge to a stationary point. We also demonstrate the effectiveness of ZO-RankSGD in a novel application: improving the quality of images generated by a diffusion generative model with human ranking feedback. Across our experiments, we found that ZO-RankSGD can significantly enhance the detail of generated images with only a few rounds of human feedback. Overall, our work advances the field of zeroth-order optimization by addressing the problem of optimizing functions with only ranking feedback, and offers a new and effective approach for aligning Artificial Intelligence (AI) with human intentions.
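For readers unfamiliar with the setting, the idea of descending on a function that is visible only through rankings can be illustrated with a small sketch. This is not the ZO-RankSGD algorithm from the talk: it is a simplified pairwise-comparison estimator written for illustration, and every name in it (`ranking_oracle`, `rank_based_direction`, `zo_rank_descent`) as well as the parameters `m`, `mu`, and `eta` are assumptions introduced here. The oracle is simulated with a known test function in place of a human judge.

```python
import random


def ranking_oracle(points, f):
    """Return indices of `points` ordered from best (lowest f) to worst.

    Stands in for a human judge: the optimizer sees only the order,
    never the values of f. (Hypothetical helper for this sketch.)
    """
    return sorted(range(len(points)), key=lambda i: f(points[i]))


def rank_based_direction(x, oracle, m, mu, rng):
    """Estimate an uphill direction from one ranking of m perturbed points."""
    d = len(x)
    # Draw m Gaussian perturbation directions and the matching query points.
    dirs = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(m)]
    points = [[xi + mu * di for xi, di in zip(x, dvec)] for dvec in dirs]
    order = oracle(points)  # best first
    g = [0.0] * d
    # For every (better, worse) pair, accumulate worse minus better.
    # On average this points toward increasing f.
    for a in range(m):
        for b in range(a + 1, m):
            better, worse = dirs[order[a]], dirs[order[b]]
            for k in range(d):
                g[k] += worse[k] - better[k]
    n_pairs = m * (m - 1) // 2
    return [gk / n_pairs for gk in g]


def zo_rank_descent(x0, oracle, steps=300, m=8, mu=0.1, eta=0.05, seed=0):
    """Repeatedly step against the rank-based direction estimate."""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(steps):
        g = rank_based_direction(x, oracle, m, mu, rng)
        x = [xi - eta * gi for xi, gi in zip(x, g)]
    return x


if __name__ == "__main__":
    # Simulated judge: prefers points closer to the origin.
    def objective(x):
        return sum(v * v for v in x)

    def judge(points):
        return ranking_oracle(points, objective)

    start = [3.0, -2.0]
    end = zo_rank_descent(start, judge)
    print("objective at start:", objective(start))
    print("objective at end:  ", objective(end))
```

Because the estimate depends only on the order of the queried points, its length does not shrink near a minimum, so with a fixed step size `eta` the iterate hovers around the optimum rather than settling exactly on it. A decaying step size would be one way to tighten this in the sketch.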

Speaker bio: Tsung-Hui Chang is a Full Professor and Associate Dean (Education) at the School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, China, and is also affiliated with the Shenzhen Research Institute of Big Data. His research interests lie in optimization problems in data communications and machine learning. He is an Elected Member of IEEE SPS SPCOM TC and the Founding Chair of IEEE SPS ISAC TWG. He received the IEEE ComSoc Asia-Pacific Outstanding Young Researcher Award in 2015 and the IEEE SPS Best Paper Award twice, in 2018 and 2021. He is currently a Senior Area Editor of IEEE TSP and an Associate Editor of IEEE OJSP. He is a Fellow of IEEE.
