Speaker: Yaoxin Wu, Assistant Professor, Eindhoven University of Technology, The Netherlands
Date: 8 July 2024
Time: 9:00 - 10:00am (Central European Summer Time, UTC+2)
Dr. Yaoxin Wu is an assistant professor in the Information Systems group at Eindhoven University of Technology. He received his Ph.D. degree in computer science from Nanyang Technological University, Singapore, in 2023. He was a Research Associate with the Singtel Cognitive and Artificial Intelligence Lab for Enterprises (SCALE@NTU). He has published research papers in leading journals (IEEE TNNLS, IEEE TKDE, IEEE TITS, IEEE TVT, etc.) and top-tier AI conferences (NeurIPS, ICML, ICLR, AAAI, UAI, AAMAS, etc.). In addition, he has served as a PC member for NeurIPS, ICML, ICLR, AAAI, IJCAI, ECAI, etc., and as an Area Chair for the IEEE Conference on Artificial Intelligence. He has also served as a reviewer for prestigious journals in the AI/OR fields, such as Transportation Research Part E, IEEE Transactions on Cybernetics, Annals of Operations Research, and IEEE Transactions on Neural Networks and Learning Systems. His research interests mainly include deep learning, combinatorial optimization, multi-objective optimization, and integer programming.
Existing deep reinforcement learning (DRL) methods for multi-objective vehicle routing problems (MOVRPs) typically decompose an MOVRP into subproblems with respective preferences and then train policies to solve corresponding subproblems. However, such a paradigm is still less effective in tackling the intricate interactions among subproblems, thus holding back the quality of the Pareto solutions. To counteract this limitation, we introduce a collaborative deep reinforcement learning method. We first propose a preference-based attention network (PAN) that allows the DRL agents to reason out solutions to subproblems in parallel, where a shared encoder learns the instance embedding and a decoder is tailored for each agent by preference intervention to construct respective solutions. Then, we design a collaborative active search (CAS) to further improve the solution quality, which updates only a part of the decoder parameters per instance during inference. In the CAS process, we also explicitly foster the interactions of neighboring DRL agents by imitation learning, empowering them to exchange insights of elite solutions to similar subproblems. Extensive results on random and benchmark instances verified the efficacy of PAN and CAS, which is particularly pronounced on the configurations (i.e., problem sizes or node distributions) beyond the training ones.
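To make the decomposition paradigm described above concrete, the following is a minimal, hypothetical sketch (not code from the talk or the paper): a bi-objective routing instance is split into subproblems by evenly spaced preference vectors, each subproblem scores candidate solutions by weighted-sum scalarization, and the per-preference winners approximate a Pareto set. All function names and the toy objective values are illustrative assumptions.

```python
# Illustrative sketch of preference-based decomposition for a
# bi-objective problem; names and values are hypothetical.

def make_preferences(n):
    """Evenly spaced preference vectors (w, 1 - w) over two objectives."""
    return [(i / (n - 1), 1 - i / (n - 1)) for i in range(n)]

def scalarize(objectives, pref):
    """Weighted-sum scalarization of an objective vector under one preference."""
    return sum(w * f for w, f in zip(pref, objectives))

def best_per_preference(candidates, prefs):
    """For each preference (i.e., subproblem), pick the candidate that
    minimizes the scalarized cost; the union approximates a Pareto set."""
    return [min(candidates, key=lambda c: scalarize(c, p)) for p in prefs]

# Toy candidate solutions given as (total distance, makespan) objective vectors.
candidates = [(10.0, 4.0), (8.0, 6.0), (6.0, 9.0)]
prefs = make_preferences(3)  # [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
print(best_per_preference(candidates, prefs))
```

In the talk's setting, each preference would instead condition a DRL decoder that constructs a route, but the decomposition logic is the same: one subproblem per preference, solved in parallel.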
The webinar was very successful, with more than 50 participants.
The video recording of the Webinar can be found here.
Back to the Webinar Series