News

Our laboratory recently proposed IvyGPT, a new large language model (LLM) aimed at medical applications. Although general-purpose LLMs such as ChatGPT have achieved remarkable success in many fields, they are not widely used in medicine because of their low accuracy and inability to provide medical advice. IvyGPT is an LLM based on LLaMA that our laboratory trained and fine-tuned using high-quality medical question-answer (QA) instances and reinforcement learning from human feedback (RLHF). After supervised fine-tuning, IvyGPT exhibits good multi-turn dialogue ability, but it still cannot perform like a doctor in other respects, such as comprehensive diagnosis. With RLHF, IvyGPT can output rich diagnosis and treatment answers that are closer to human level. Our laboratory used the QLoRA method to train 3.3 billion parameters on a small number of NVIDIA A100 (80GB) GPUs. Experiments show that IvyGPT performs better in the medical domain than other medical GPT models. These results open new possibilities for introducing large language models into medicine, where they could provide more accurate and comprehensive medical services for doctors and patients.

CMB is a comprehensive, multi-level medical benchmark in Chinese. It comprises 280,839 multiple-choice questions and 74 complex case consultation questions, covering all clinical medical specialties and various professional levels. The benchmark aims to holistically evaluate a model's medical knowledge and clinical consultation capabilities. IvyGPT achieved a score of 38.54 on CMB, surpassing ChatGPT.

CMB: https://cmedbenchmark.llmzoo.com/

IvyGPT won the Demo Paper demonstration at CICAI 2023 and achieved excellent performance on the CMB medical evaluation leaderboard.