Accelerating Clinical Text Annotation in Underrepresented Languages: A Case Study on Text De-Identification
Details
Download: 39176927.pdf (189.18 [KB])
Status: Public
Version: Final published version
Licence: CC BY-NC 4.0
Serval ID
serval:BIB_9D47A8BCBA47
Type
Book part
Subtype
Chapter: chapter or section
Collection
Publications
Institution
Title
Accelerating Clinical Text Annotation in Underrepresented Languages: A Case Study on Text De-Identification
Book title
Digital Health and Informatics Innovations for Sustainable Health Care Systems
Publisher
IOS Press
ISBN
9781643685335
ISSN
0926-9630
1879-8365
ISSN-L
0926-9630
Editorial status
Published
Publication date
22/08/2024
Peer-reviewed
Yes
Volume
316
Series
Studies in Health Technology and Informatics
Pages
853-857
Language
English
Abstract
Clinical notes contain valuable information for research and for monitoring quality of care. Named Entity Recognition (NER) is the process of identifying relevant pieces of information, such as diagnoses, treatments, and side effects, and bringing them into a more structured form. Although recent advances in deep learning have facilitated automated recognition, particularly in English, NER can still be challenging due to limited specialized training data. This is exacerbated in hospital settings, where annotations are costly to obtain without appropriate incentives and often depend on local specificities. In this work, we study whether this annotation process can be effectively accelerated by combining two practical strategies. First, we convert usually passive annotation tasks into a proactive contest to motivate human annotators to perform a task often considered tedious and time-consuming. Second, we provide pre-annotations to the participants to evaluate how the recall and precision of the pre-annotations can boost or degrade annotation performance. We applied both strategies to a text de-identification task on French clinical notes and discharge summaries at a large Swiss university hospital. Our results show that a proactive contest and average-quality pre-annotations can significantly reduce annotation time and increase annotation quality, enabling us to develop a text de-identification model for French clinical notes with high performance (F1 score 0.94).
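For readers unfamiliar with the metrics named in the abstract, the following is a minimal sketch of how precision, recall, and F1 can be computed for a set of pre-annotations against reference annotations. It assumes exact-match scoring on (start, end, label) spans; the function name, the span representation, and the example data are hypothetical illustrations and are not taken from the chapter, whose evaluation protocol may differ.

```python
from typing import Iterable, Tuple

# Hypothetical span representation: (start offset, end offset, entity label).
Span = Tuple[int, int, str]


def precision_recall_f1(
    gold: Iterable[Span], predicted: Iterable[Span]
) -> Tuple[float, float, float]:
    """Exact-match precision, recall, and F1 over annotated spans."""
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Invented example: pre-annotations that are all correct (high precision)
# but cover only half of the reference spans (low recall).
gold_spans = [(0, 11, "NAME"), (20, 30, "DATE"), (45, 53, "LOCATION"), (60, 66, "NAME")]
pre_annotations = [(0, 11, "NAME"), (20, 30, "DATE")]

p, r, f1 = precision_recall_f1(gold_spans, pre_annotations)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

In this invented case every pre-annotated span is correct, yet half of the reference spans are missing, so an annotator would still need to find the remaining entities by hand.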
Pubmed
Open Access
Yes
Record created
30/08/2024 9:45
Record last modified
05/09/2024 9:10