Image source: OralGPT-Omni: A Versatile Dental Multimodal Large Language Model
Good Dental News | Recently, a research team from the Faculty of Dentistry at the University of Hong Kong, in collaboration with Shenzhen University, the Hong Kong University of Science and Technology (Guangzhou), and other institutions, published a paper introducing OralGPT-Omni, the first dental-specific multimodal large language model.
The model can process eight types of dental imaging data, including intraoral photographs, periapical radiographs, lateral cephalometric radiographs, pathological sections, and intraoral videos. It covers five tasks: lesion diagnosis, CVM (cervical vertebral maturation) staging prediction, treatment plan recommendation, video understanding, and tooth localization and counting.
The research team designed the TRACE-CoT (Transparent Radiology Analysis and Clinical Evidence) reasoning framework, which enables the AI to mimic the diagnostic thinking process of radiologists. The framework comprises five steps: imaging examination, hypothesis generation, medical expertise reference, feature-based validation, and evidence-based conclusion. Two dentists conducted a quality assessment of 300 TRACE-CoT data samples, confirming their quality and reliability.
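The five TRACE-CoT steps can be pictured as a structured reasoning record that is serialized into a chain-of-thought string for training. The sketch below is purely illustrative: the field names, the serialization format, and the clinical example are assumptions, not the paper's released data schema.

```python
from dataclasses import dataclass, fields

# Hypothetical representation of one TRACE-CoT sample.
# Field names mirror the five steps described in the paper,
# but the actual data format may differ.
@dataclass
class TraceCoT:
    imaging_examination: str        # systematic description of the image
    hypothesis_generation: str      # candidate diagnoses
    expertise_reference: str        # relevant dental/medical knowledge
    feature_validation: str         # match image features against hypotheses
    evidence_based_conclusion: str  # final, evidence-grounded answer

def to_prompt(trace: TraceCoT) -> str:
    """Serialize the five reasoning steps into one chain-of-thought string."""
    return "\n".join(
        f"[{f.name.replace('_', ' ').title()}] {getattr(trace, f.name)}"
        for f in fields(trace)
    )

# Illustrative example (not taken from the paper's dataset)
example = TraceCoT(
    imaging_examination="Periapical radiograph shows a radiolucency at the apex of tooth 46.",
    hypothesis_generation="Periapical cyst vs. periapical granuloma vs. abscess.",
    expertise_reference="Well-defined corticated borders favor a cystic lesion.",
    feature_validation="The border is well defined and corticated; no diffuse bone loss.",
    evidence_based_conclusion="Findings are most consistent with a periapical cyst.",
)
print(to_prompt(example))
```

Keeping the steps as separate fields (rather than one free-text blob) makes it easy to audit each stage independently, which is how the two dentists' quality review of the 300 samples could plausibly be organized.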
Image source: OralGPT-Omni: A Versatile Dental Multimodal Large Language Model
Notably, the research team collected data from 31 publicly available datasets and one dental hospital in Hong Kong, constructing a large dataset containing approximately 3.21 million text words, nearly 60,000 images, and 90 videos. A four-stage training strategy was employed: dental knowledge injection, dental concept alignment, supervised fine-tuning, and reinforcement learning optimization. In the reinforcement learning stage, the team also introduced a difficulty-aware data selection strategy and a TRACE reward mechanism.
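The paper does not spell out how its difficulty-aware data selection works, but a common recipe in RL fine-tuning is to estimate a question's difficulty from the model's empirical pass rate over several sampled answers, then keep only items the model sometimes solves. The sketch below assumes that generic recipe; the function names, thresholds, and checker are hypothetical, not the authors' implementation.

```python
import random

def estimate_pass_rate(question, answer_checker, sample_answer, k=8):
    """Fraction of k sampled model answers that the checker accepts.
    Hypothetical difficulty proxy: low pass rate = hard question."""
    return sum(bool(answer_checker(question, sample_answer(question)))
               for _ in range(k)) / k

def select_training_items(items, pass_rates, low=0.125, high=0.875):
    """Keep items the model sometimes solves. Always-solved and
    never-solved prompts yield near-uniform rewards and thus
    contribute little learning signal during RL."""
    return [q for q, p in zip(items, pass_rates) if low <= p <= high]

# Toy usage with synthetic pass rates
random.seed(0)
questions = [f"case_{i}" for i in range(10)]
rates = [random.random() for _ in questions]
kept = select_training_items(questions, rates)
print(f"kept {len(kept)}/{len(questions)} cases for the RL stage")
```

Filtering this way concentrates the RL compute budget on cases at the frontier of the model's ability, which is one plausible reading of "difficulty-aware" here.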
Furthermore, the research team constructed the MMOral-Uni benchmark, which contains 2,809 open-ended question-and-answer pairs validated by professional dentists, covering five imaging modalities and five task categories. OralGPT-Omni scored 51.84 points on this benchmark and 45.31 points on the MMOral-OPG panoramic radiograph benchmark. For comparison, GPT-5 scored only 15.42 points on MMOral-Uni.
The research team also invited a radiologist with over 10 years of experience to evaluate the clinical usefulness of outputs from three models: GPT-5, Lingshu-7B, and OralGPT-Omni. In this evaluation, OralGPT-Omni received high ratings for both accuracy and clinical usability.
The research team acknowledged that OralGPT-Omni still has limitations in developing complex treatment plans: such plans must account for individualized factors like the patient's overall health status, financial capacity, and treatment preferences, yet treatment plan data made up only 0.006% of the training data. The team also stated that they will publicly release all code, data, and models.