ChatGPT Generated Otorhinolaryngology Multiple-Choice Questions: Quality, Psychometric Properties, and Suitability for Assessments.

Lotto, C.; Sheppard, S.C.; Anschuetz, W.; Stricker, D.; Molinari, G.; Huwendiek, S.; Anschuetz, L.

doi:10.1002/oto2.70018

ChatGPT Generated Otorhinolaryngology Multiple-Choice Questions: Quality, Psychometric Properties, and Suitability for Assessments.

Détails

Demande d'une copie

ID Serval

serval:BIB_57D87EE1F6C3

Type

Article: article d'un périodique ou d'un magazine.

Collection

Publications

Institution

UNIL/CHUV

Titre

ChatGPT Generated Otorhinolaryngology Multiple-Choice Questions: Quality, Psychometric Properties, and Suitability for Assessments.

Périodique

OTO open

Auteur⸱e⸱s

Lotto C., Sheppard S.C., Anschuetz W., Stricker D., Molinari G., Huwendiek S., Anschuetz L.

ISSN

2473-974X (Electronic)

ISSN-L

2473-974X

Statut éditorial

Publié

Date de publication

2024

Peer-reviewed

Oui

Volume

Numéro

Pages

e70018

Langue

anglais

Notes

Publication types: Journal Article
Publication Status: epublish

Résumé

To explore Chat Generative Pretrained Transformer's (ChatGPT's) capability to create multiple-choice questions about otorhinolaryngology (ORL).
Experimental question generation and exam simulation.
Tertiary academic center.
ChatGPT 3.5 was prompted: "Can you please create a challenging 20-question multiple-choice questionnaire about clinical cases in otolaryngology, offering five answer options?." The generated questionnaire was sent to medical students, residents, and consultants. Questions were investigated regarding quality criteria. Answers were anonymized and the resulting data was analyzed in terms of difficulty and internal consistency.
ChatGPT 3.5 generated 20 exam questions of which 1 question was considered off-topic, 3 questions had a false answer, and 3 questions had multiple correct answers. Subspecialty theme repartition was as follows: 5 questions were on otology, 5 about rhinology, and 10 questions addressed head and neck. The qualities of focus and relevance were good while the vignette and distractor qualities were low. The level of difficulty was suitable for undergraduate medical students (n = 24), but too easy for residents (n = 30) or consultants (n = 10) in ORL. Cronbach's α was highest (.69) with 15 selected questions using students' results.
ChatGPT 3.5 is able to generate grammatically correct simple ORL multiple choice questions for a medical student level. However, the overall quality of the questions was average, needing thorough review and revision by a medical expert to ensure suitability in future exams.

Mots-clé

ChatGPT, artificial intelligence, exam, large language model, multiple choice question, otolaryngology

OAI-PMH

oai:serval.unil.ch:BIB_57D87EE1F6C3

DOI

10.1002/oto2.70018

Pubmed

39328276

Web of science

001320068300001

Open Access

Oui