Diachronic Evaluation of NER Systems on Old Newspapers

Details

Resource 1 Download: 13_konvensproc.pdf (819.64 KB)
State: Public
Version: Author's accepted manuscript
License: Not specified
Serval ID
serval:BIB_D9C0F97FA619
Type
Inproceedings: an article in a conference proceedings.
Collection
Publications
Title
Diachronic Evaluation of NER Systems on Old Newspapers
Title of the conference
13th Conference on Natural Language Processing (KONVENS 2016), Bochum, Germany, September 19-21, 2016
Author(s)
Ehrmann Maud, Colavizza Giovanni, Rochat Yannick
Publication state
Published
Issued date
2016
Peer-reviewed
Yes
Language
English
Abstract
In recent years, many cultural institutions have engaged in large-scale newspaper digitization projects and large amounts of historical texts are being acquired (via transcription or OCRization). Beyond document preservation, the next step consists in providing an enhanced access to the content of these digital resources. In this regard, the processing of units which act as referential anchors, namely named entities (NE), is of particular importance. Yet, the application of standard NE tools to historical texts faces several challenges and performances are often not as good as on contemporary documents. This paper investigates the performances of different NE recognition tools applied on old newspapers by conducting a diachronic evaluation over 7 time-series taken from the archives of Swiss newspaper Le Temps.
Create date
18/10/2019 11:06
Last modification date
19/07/2022 8:16