Imputation in data fusion of heterogeneous data sets a model-based numerical experiment
Details
Serval ID
serval:BIB_42AD4B036A5C
Type
Article: article in a journal or magazine.
Collection
Publications
Institution
Title
Imputation in data fusion of heterogeneous data sets a model-based numerical experiment
Journal
Communications In Statistics-Simulation and Computation
ISSN
0361-0918
Publication status
Published
Publication date
2008
Peer-reviewed
Yes
Volume
37
Issue
7
Pages
1316-1328
Language
English
Abstract
Given the very large amount of data collected every day through population surveys, much new research could reuse this information instead of collecting new samples. Unfortunately, relevant data are often scattered across different files obtained through different sampling designs. Data fusion is a set of methods used to combine information from different sources into a single dataset. In this article, we are interested in a specific problem: the fusion of two data files, one of which is quite small. We propose a model-based procedure combining a logistic regression with an Expectation-Maximization algorithm. Results show that, despite the lack of data, this procedure can perform better than standard matching procedures.
Keywords
binary variable, data fusion, data structure, Expectation-Maximization algorithm, logistic regression, matching, multiple imputation
Record created
29/09/2009 15:51
Record last modified
20/08/2019 13:45