On the robust measurement of inflectional diversity

Details

Resource 1: qualico_14_proc_ax_gg_preprint.pdf (224.05 KB)
State: Public
Version: author
License: Not specified
Serval ID
serval:BIB_EF5F6419C909
Type
A part of a book
Publication sub-type
Chapter: chapter or part
Collection
Publications
Title
On the robust measurement of inflectional diversity
Title of the book
Recent Contributions to Quantitative Linguistics
Author(s)
Xanthos A., Guex G.
Publisher
De Gruyter
Address of publication
Berlin
ISBN
9783110420296 (Online)
9783110419870 (Print)
Publication state
Published
Issued date
2015
Editor
Tuzzi A., Benesova M., Macutek J.
Volume
70
Series
Quantitative Linguistics
Pages
241-254
Language
English
Abstract
Lexical diversity measures are notoriously sensitive to variations of sample size, and recent approaches to this issue typically involve the computation of the average variety of lexical units in random subsamples of fixed size. This methodology has been further extended to measures of inflectional diversity such as the average number of wordforms per lexeme, also known as the mean size of paradigm (MSP) index.
In this contribution we argue that, while random sampling can indeed be used to increase the robustness of inflectional diversity measures, using a fixed subsample size is only justified under the hypothesis that the corpora that we compare have the same degree of lexematic diversity. In the more general case where they may have differing degrees of lexematic diversity, a more sophisticated strategy can and should be adopted.
A novel approach to the measurement of inflectional diversity is proposed, aiming to cope not only with variations of sample size, but also with variations of lexematic diversity. The robustness of this new method is empirically assessed and the results show that while there is still room for improvement, the proposed methodology considerably attenuates the impact of lexematic diversity discrepancies on the measurement of inflectional diversity.
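The fixed-size subsampling baseline that the abstract starts from can be illustrated with a minimal Python sketch. It covers only the MSP averaged over random subsamples of fixed size, not the new method proposed in the chapter, whose details are not given in this record. The function names (`msp`, `subsampled_msp`), the representation of a corpus as a list of (wordform, lexeme) pairs, sampling without replacement, and the default number of subsamples are all assumptions made for illustration.

```python
import random
from statistics import mean


def msp(tokens):
    """Mean size of paradigm: distinct wordforms per distinct lexeme.

    Assumption: `tokens` is a list of (wordform, lexeme) pairs, so a
    lemmatized corpus is required.
    """
    pairs = set(tokens)                        # distinct (wordform, lexeme) pairs
    lexemes = {lexeme for _, lexeme in pairs}  # distinct lexemes
    return len(pairs) / len(lexemes)


def subsampled_msp(tokens, sample_size, n_samples=100, seed=0):
    """Average MSP over random subsamples of a fixed size.

    Assumption: subsamples are drawn without replacement, so
    `sample_size` must be between 1 and len(tokens).
    """
    rng = random.Random(seed)
    return mean(
        msp(rng.sample(tokens, sample_size)) for _ in range(n_samples)
    )


# Hypothetical toy corpus: six tokens, two lexemes.
corpus = [
    ("walk", "walk"), ("walks", "walk"), ("walked", "walk"),
    ("cat", "cat"), ("cats", "cat"), ("walk", "walk"),
]
full_msp = msp(corpus)                               # 5 distinct pairs / 2 lexemes
sub_msp = subsampled_msp(corpus, sample_size=3, n_samples=50)
```

Comparing two corpora with the same `sample_size` is exactly the fixed-size strategy that the abstract argues is justified only when the corpora have the same degree of lexematic diversity.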
Keywords
inflectional diversity, mean size of paradigm, MSP, RMSP, lexical diversity, robustness, random sampling
Create date
25/11/2015 15:53
Last modification date
25/07/2020 6:10