Can artificial intelligence chatbots think like dentists? A comparative analysis based on dental specialty examination questions in restorative dentistry.
Background: The integration of artificial intelligence (AI) in healthcare and medical education has advanced rapidly, with conversational AI systems gaining attention for their potential in academic assessment and clinical reasoning. This study aimed to evaluate AI chatbots' performance on restorative dentistry questions from the Turkish Dental Specialty Examination (DUS), a high-stakes national exam assessing theoretical and clinical knowledge.
Methods: An in silico, cross-sectional, comparative design was employed. A total of 190 multiple-choice questions (MCQs) from 19 DUS sessions between 2012 and 2025 were obtained from the Assessment, Selection, and Placement Center (ÖSYM) website. After excluding annulled items, 188 questions were analyzed. Eight AI chatbots (ChatGPT-3.5, ChatGPT-4o Free, ChatGPT-4o Plus, Claude Sonnet 4, Microsoft Copilot, DeepSeek, Gemini 1.5, and Gemini Advanced) were tested using a standardized single-attempt protocol in Turkish. Performance measures included accuracy, response length, and response time. Questions were categorized by year, content domain, and length for subgroup analyses. Statistical analyses were conducted in Python using standard libraries. Descriptive statistics and Pearson's correlation coefficients were calculated; normality and homogeneity of variance were assessed with the Shapiro-Wilk and Levene's tests, and group comparisons used the Kruskal-Wallis test followed by Dunn's post hoc test, with significance set at p < 0.05.
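For illustration only, a minimal Python sketch of such an analysis pipeline is given below. It uses invented toy data and assumes SciPy and scikit-posthocs as the unspecified "standard libraries"; the actual dataset, variable pairings, and library choices are not reported in the abstract.

# Minimal sketch of the described statistical workflow (toy data, assumed libraries).
import pandas as pd
from scipy import stats
import scikit_posthocs as sp

# Hypothetical long-format data: one row per chatbot-question pair.
df = pd.DataFrame({
    "chatbot": ["Gemini Advanced", "ChatGPT-4o Plus", "DeepSeek"] * 3,
    "correct": [1, 1, 0, 1, 0, 0, 1, 1, 1],            # 1 = correct answer
    "response_time_s": [4.2, 6.1, 11.3, 3.8, 5.9, 12.0, 4.5, 6.4, 10.7],
    "word_count": [180, 120, 95, 210, 130, 88, 175, 140, 102],
})

# Descriptive statistics, e.g. accuracy per chatbot.
print(df.groupby("chatbot")["correct"].mean())

# Pearson's correlation (illustrative pairing; the abstract does not specify
# which variables were correlated).
r, p = stats.pearsonr(df["response_time_s"], df["word_count"])

# Assumption checks: normality (Shapiro-Wilk) and variance homogeneity (Levene).
groups = [g["response_time_s"].values for _, g in df.groupby("chatbot")]
print(stats.shapiro(df["response_time_s"]), stats.levene(*groups))

# Non-parametric comparison across chatbots, followed by Dunn's post hoc test.
print(stats.kruskal(*groups))
print(sp.posthoc_dunn(df, val_col="response_time_s", group_col="chatbot",
                      p_adjust="bonferroni"))

The same Kruskal-Wallis and Dunn steps would apply to word counts and, presumably, to per-question correctness when comparing accuracy across chatbots.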
Results: No significant difference in overall accuracy was found among the chatbots (p = 0.18). However, response time and word count differed significantly (p < 0.001). Gemini Advanced showed the highest accuracy (96.28%), followed by ChatGPT-4o Plus (93.62%). Gemini 1.5 produced the longest yet fastest responses, while DeepSeek had the lowest accuracy and slowest responses. Accuracy remained stable across years but varied by topic, with lower performance in complex areas such as cavity preparation. In case-based questions, Gemini Advanced, Gemini 1.5, and ChatGPT-4o Plus achieved 100% accuracy. Performance in image-based questions was inconsistent, underscoring limitations in visual reasoning.
Conclusions: AI chatbots demonstrated high accuracy in answering restorative dentistry exam questions, with Gemini Advanced, ChatGPT-4o Plus, and Gemini 1.5 showing superior performance. Despite differences in response time and content length, their potential as supplementary tools in dental education is evident, warranting further validation across specialties and contexts.
Trial Registration: Not applicable.
(© 2026. The Author(s).)
Declarations.
Ethics approval and consent to participate: Not applicable; the study used open-source, publicly available data.
Consent for publication: Not applicable, as there was no direct human involvement.
Competing interests: The authors declare no competing interests.