Anatomy exam model for the circulatory and respiratory systems using GPT-4: a medical school study

dc.contributor.authorTekin, Ayla
dc.contributor.authorKaramus, Nizameddin Fatih
dc.contributor.authorÇolak, Tuncay
dc.date.accessioned2025-08-14T17:36:34Z
dc.date.available2025-08-14T17:36:34Z
dc.date.issued2025
dc.departmentFaculties, Faculty of Medicine, Basic Medical Sciences, Department of Anatomy
dc.descriptionArticle Number: 158
dc.description.abstractPurpose: The study aimed to evaluate the effectiveness of anatomy multiple-choice questions (MCQs) generated by GPT-4, focusing on their methodological appropriateness and their alignment with the cognitive levels defined by Bloom's revised taxonomy in order to enhance assessment. Methods: The assessment questions for medical students were created using GPT-4 and comprised 240 MCQs organized into subcategories consistent with Bloom's revised taxonomy. When designing prompts to generate the MCQs, details about the lesson's purpose, the learning objectives, and the students' prior experience were included to ensure the questions were contextually appropriate. A set of 30 MCQs was randomly selected from the generated questions for testing. A total of 280 students took the resulting examination, from which the difficulty index of each MCQ, the item discrimination index, and the overall test difficulty were calculated. Expert anatomists examined the taxonomic accuracy of GPT-4's questions. Results: Students achieved a median score of 50 (range, 36.67-60) points on the test. The test's internal consistency, assessed by KR-20, was 0.737, and its average difficulty was 0.5012. Difficulty and discrimination indices are reported for each AI-generated question. The expert anatomists' taxonomy-based classifications matched GPT-4's in 26.6% of the questions. Meanwhile, 80.9% of students found the questions clear, and 85.8% expressed interest in retaking the assessment. Conclusion: This study demonstrates GPT-4's significant potential for generating medical-education exam questions. While it effectively assesses basic knowledge recall, it does not sufficiently evaluate the higher-order cognitive processes outlined in Bloom's revised taxonomy. Future research should consider alternative methods that combine AI with expert evaluation and specialized multimodal models.
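For reference, the item-analysis statistics named in the abstract have standard textbook definitions; the exact computational conventions used in the study (for example, the upper/lower group split used for the discrimination index) are not given in this record, so the sketch below relies on common assumptions rather than the authors' stated method. A minimal sketch in LaTeX:

\[
p_i = \frac{n_{i,\text{correct}}}{N}, \qquad
D_i = \frac{U_i - L_i}{n_g}, \qquad
\mathrm{KR\text{-}20} = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k} p_i q_i}{\sigma_X^2}\right),
\]

where \(p_i\) is the difficulty index of item \(i\) (the proportion of the \(N\) examinees answering it correctly), \(q_i = 1 - p_i\), \(U_i\) and \(L_i\) are the numbers of correct responses to item \(i\) in the upper- and lower-scoring groups of size \(n_g\) (commonly the top and bottom 27% of examinees), \(k\) is the number of items (30 in the administered test), and \(\sigma_X^2\) is the variance of the total test scores.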
dc.identifier.citationTekin, A., Karamus, N. F., & Çolak, T. (2025). Anatomy exam model for the circulatory and respiratory systems using GPT-4: a medical school study. Surgical and Radiologic Anatomy, 47(1), 158. https://doi.org/10.1007/s00276-025-03667-z
dc.identifier.doi10.1007/s00276-025-03667-z
dc.identifier.issn0930-1038
dc.identifier.issn1279-8517
dc.identifier.issue1
dc.identifier.pmid40495075
dc.identifier.scopus2-s2.0-105007648593
dc.identifier.scopusqualityQ2
dc.identifier.urihttps://hdl.handle.net/20.500.12939/5896
dc.identifier.volume47
dc.identifier.wosWOS:001506184300001
dc.identifier.wosqualityQ3
dc.indekslendigikaynakPubMed
dc.indekslendigikaynakWeb of Science
dc.indekslendigikaynakScopus
dc.institutionauthorKaramus, Nizameddin Fatih
dc.language.isoen
dc.publisherSpringer International
dc.relation.ispartofSurgical and Radiologic Anatomy
dc.relation.publicationcategoryArticle - International Peer-Reviewed Journal - Institutional Faculty Member
dc.rightsinfo:eu-repo/semantics/closedAccess
dc.subjectAnatomy assessment
dc.subjectBloom's revised taxonomy
dc.subjectGPT-4
dc.subjectMultiple-choice questions
dc.titleAnatomy exam model for the circulatory and respiratory systems using GPT-4: a medical school study
dc.typeArticle

Files

License bundle
Showing 1 - 1 of 1
Name: license.txt
Size: 1.17 KB
Format: Item-specific license agreed upon to submission
Description: