Rongsheng Wang
Ph.D. Student, The Chinese University of Hong Kong, Shenzhen (2025)
I am Rongsheng Wang (王荣胜), currently pursuing my Ph.D. at The Chinese University of Hong Kong, Shenzhen. My primary research interests lie in Large Language Models (LLMs) and Multimodal LLMs (MLLMs). I love open source and sharing useful knowledge with everyone. I have been featured in China-Ranking and recognized as an outstanding individual developer on GitHub in China. I actively contribute to the open-source community on 👾GitHub, where I've led or contributed to several notable projects, including ChatPaper, XrayGLM, Awesome-LLM-Resources, and TinyDeepSeek. I also share datasets and model weights from these projects on 🤗HuggingFace.
The Chinese University of Hong Kong, Shenzhen
Ph.D. in Computational Biology and Health Informatics Sep. 2025 - Now
Macao Polytechnic University
M.S. in Big Data and Internet of Things Sep. 2022 - Jul. 2024
Henan Polytechnic University
B.S. in Computer Science (AI) Sep. 2018 - Jul. 2022
CUHK (SZ)
Research Assistant (Supervisor: Benyou Wang) Sep. 2024 - Sep. 2025
Qiyuan.Tech
CTO Oct. 2023 - Now
Ziyi Zeng, Zhenyang Cai, Yixi Cai, Xidong Wang, Junying Chen, Rongsheng Wang, Yipeng Liu, Siqi Cai, Benyou Wang†, Zhiguo Zhang, Haizhou Li († corresponding author)
arXiv 2025 Conference
We introduce WaveMind, a multimodal large language model that unifies EEG and paired modalities in a shared semantic space for generalized, conversational brain-signal interpretation.
Junying Chen, Zhenyang Cai, Zhiheng Liu, Yunjin Yang, Rongsheng Wang, Qingying Xiao, Xiangyi Feng, Zhan Su, Jing Guo, Xiang Wan, Guangjun Yu, Haizhou Li, Benyou Wang† († corresponding author)
arXiv 2025 Conference
We introduce ShizhenGPT, the first multimodal LLM tailored for Traditional Chinese Medicine, designed to overcome data scarcity and enable holistic perception across text, images, audio, and physiological signals for advanced TCM diagnosis and reasoning.
Shunian Chen, Hejin Huang, Yexin Liu, Zihan Ye, Pengcheng Chen, Chenghao Zhu, Michael Guan, Rongsheng Wang, Junying Chen, Guanbin Li, Ser-Nam Lim, Harry Yang, Benyou Wang† († corresponding author)
arXiv 2025 Conference
We introduce TalkVid, a large-scale, high-quality, and demographically diverse video dataset with an accompanying benchmark that enables more robust, fair, and generalizable audio-driven talking head synthesis.
Qimin Yang, Huan Zuo, Runqi Su, Hanyinghong Su, Tangyi Zeng, Huimei Zhou, Rongsheng Wang, Jiexin Chen, Yijun Lin, Zhiyi Chen, Tao Tan† († corresponding author)
Scientific Reports 2025 Journal
We propose a two-step retrieval-augmented generation framework combining embedding search and Elasticsearch with ColBERTv2 ranking, achieving a 10% accuracy boost on complex medical queries while addressing real-time deployment challenges. (A toy sketch of the two-step retrieval idea appears at the end of this page.)
Rongsheng Wang, Junying Chen, Ke Ji, Zhenyang Cai, Shunian Chen, Yunjin Yang, Benyou Wang† († corresponding author)
arXiv 2025 Conference
We introduce MedVideoCap-55K, the first large-scale, diverse, and caption-rich dataset designed for medical video generation. Comprising over 55,000 curated clips from real-world clinical scenarios, it addresses the critical need for both visual fidelity and medical accuracy in applications such as training, education, and simulation.
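As a rough illustration of the two-step retrieval-augmented generation idea from the Scientific Reports paper above (a minimal sketch, not the paper's implementation): step 1 retrieves candidate documents with a coarse vector search, standing in for the embedding-search and Elasticsearch stage, and step 2 reranks the candidates with a ColBERTv2-style late-interaction (MaxSim) score. The corpus, the random vectors, and every name below are hypothetical stand-ins for a real embedding model and index.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy corpus standing in for a medical document index (hypothetical data).
    docs = [
        "aspirin dosing in elderly patients",
        "contraindications of beta blockers",
        "management of type 2 diabetes",
    ]

    # Step 1: coarse candidate retrieval. One unit vector per document,
    # standing in for the embedding-search / Elasticsearch stage.
    dim = 64
    doc_vecs = rng.normal(size=(len(docs), dim))
    doc_vecs /= np.linalg.norm(doc_vecs, axis=1, keepdims=True)

    def retrieve(query_vec, k=2):
        scores = doc_vecs @ query_vec        # cosine similarity (unit vectors)
        return np.argsort(scores)[::-1][:k]  # indices of the top-k candidates

    # Step 2: ColBERTv2-style rerank. Each document (and the query) is a
    # matrix of per-token vectors; MaxSim sums, over query tokens, the
    # similarity of the best-matching document token (late interaction).
    doc_token_vecs = [rng.normal(size=(len(d.split()), dim)) for d in docs]

    def maxsim(query_tokens, doc_tokens):
        sim = query_tokens @ doc_tokens.T    # (query tokens) x (doc tokens)
        return sim.max(axis=1).sum()

    # A toy query: a noisy copy of document 0's vector, plus 4 random token vectors.
    query_vec = doc_vecs[0] + 0.1 * rng.normal(size=dim)
    query_vec /= np.linalg.norm(query_vec)
    query_tokens = rng.normal(size=(4, dim))

    candidates = retrieve(query_vec)
    reranked = sorted(candidates,
                      key=lambda i: maxsim(query_tokens, doc_token_vecs[i]),
                      reverse=True)
    print([docs[i] for i in reranked])

Splitting retrieval into a cheap coarse pass and a more expensive token-level rerank is the general design pattern that keeps such a pipeline responsive at deployment time.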