I am currently an intern at the M-A-P community, working on multi-agent systems and AI alignment research. Multimodal Art Projection (M-A-P) is an open-source AI research community. I completed my M.Sc. in Big Data Technology at The Hong Kong University of Science and Technology (HKUST) in October 2024, and received my B.Eng. in Computer Science and Technology from Zhejiang University in September 2023.

My research interests focus primarily on LLM agents and LLM reasoning. I am particularly passionate about developing intelligent systems that can collaborate effectively and understand complex programming tasks. Previously, I worked as a research intern at HKGAI, HKUST (December 2023 - May 2024) under the supervision of Professor Jie Fu.

I am currently seeking PhD positions for Fall 2025. If you are interested in my research or have any questions about potential collaboration, please feel free to reach out at 15071186776@163.com.

🔥 News

  • 2024.10:  🎉🎉 Released AutoKaggle, a multi-agent framework for automated data science tasks.

📝 Publications

AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions

Ziming Li, Qianbo Zang, David Ma, Jiawei Guo, Tuney Zheng, Minghao Liu, Xinyao Niu, Yue Wang, Jian Yang, Jiaheng Liu, Wanjun Zhong, Wangchunshu Zhou, Wenhao Huang, Ge Zhang

Project | GitHub stars

  • Existing automated data science systems are oversimplified, inflexible, and lack transparency, making them inadequate for complex real-world data science tasks.
  • We develop AutoKaggle, an end-to-end multi-agent framework that integrates phase-based workflows, iterative debugging, and comprehensive reporting systems to provide data scientists with efficient, reliable, and transparent automated solutions.

CodeEditorBench: Evaluating Code Editing Capability of Large Language Models

Jiawei Guo, Ziming Li, Xueling Liu, Kaijing Ma, Tianyu Zheng, Zhouliang Yu, Ding Pan, Yizhi LI, Ruibo Liu, Yue Wang, Shuyue Guo, Xingwei Qu, Xiang Yue, Ge Zhang, Wenhu Chen, Jie Fu

Project | GitHub stars

  • Existing frameworks focus heavily on code generation, lacking systematic evaluation of code editing abilities. Code editing is crucial in software development, requiring dedicated evaluation benchmarks.
  • We establish CodeEditorBench as the first comprehensive framework for evaluating LLMs’ code editing capabilities and advance code editing technology through open-source datasets and evaluation tools.

ACL 2024

MedAgents: Large Language Models as Collaborators for Zero-shot Medical Reasoning

Xiangru Tang, Anni Zou, Zhuosheng Zhang, Ziming Li, Yilun Zhao, Xingyao Zhang, Arman Cohan, Mark Gerstein

Project | GitHub stars

  • Current LLMs face challenges in medical domains due to limited domain knowledge and reasoning capabilities, necessitating a solution that doesn’t require additional training.
  • We design a framework that simulates multi-disciplinary expert collaboration through role-playing and multi-round discussions to activate LLMs’ latent medical knowledge and enhance their reasoning abilities.

📖 Education

  • 2023.09 - 2024.10, M.Sc. in Big Data Technology, The Hong Kong University of Science and Technology (HKUST)
  • 2019.09 - 2023.06, B.Eng. in Computer Science and Technology, College of Computer Science and Technology, Zhejiang University

💻 Internships

  • 2024.06 - present, M-A-P
  • 2023.12 - 2024.05, HKGAI, HKUST