Flexible mixture modeling via the multivariate t distribution with the Box-Cox transformation: an alternative to the skew-t distribution.

Details

Serval ID
serval:BIB_1DA76AB4F142
Type
Article: article from a journal or magazine.
Collection
Publications
Title
Flexible mixture modeling via the multivariate t distribution with the Box-Cox transformation: an alternative to the skew-t distribution.
Journal
Statistics and computing
Author(s)
Lo K., Gottardo R.
ISSN
0960-3174 (Print)
ISSN-L
0960-3174
Editorial status
Published
Publication date
01/01/2012
Peer-reviewed
Yes
Volume
22
Issue
1
Pages
33-52
Language
English
Notes
Publication types: Journal Article
Publication Status: ppublish
Abstract
Cluster analysis is the automated search for groups of homogeneous observations in a data set. A popular modeling approach for clustering is based on finite normal mixture models, which assume that each cluster is modeled as a multivariate normal distribution. However, the normality assumption that each component is symmetric is often unrealistic. Furthermore, normal mixture models are not robust against outliers; they often require extra components for modeling outliers and/or give a poor representation of the data. To address these issues, we propose a new class of distributions, multivariate t distributions with the Box-Cox transformation, for mixture modeling. This class of distributions generalizes the normal distribution with the more heavy-tailed t distribution, and introduces skewness via the Box-Cox transformation. As a result, this provides a unified framework to simultaneously handle outlier identification and data transformation, two interrelated issues. We describe an Expectation-Maximization algorithm for parameter estimation along with transformation selection. We demonstrate the proposed methodology with three real data sets and simulation studies. Compared with a wealth of approaches including the skew-t mixture model, the proposed t mixture model with the Box-Cox transformation performs favorably in terms of accuracy in the assignment of observations, robustness against model misspecification, and selection of the number of components.
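The core idea in the abstract, a t distribution applied to Box-Cox-transformed data inside a mixture, can be illustrated with a minimal sketch. This is not the authors' implementation: it is a univariate simplification of the multivariate model, it uses the standard Box-Cox transform for positive data (the paper's exact variant is not stated in this record), and all function names and the per-component parameter tuple `(mu, sigma, nu, lam)` are hypothetical. It shows the transformed density with its Jacobian term and the posterior membership probabilities that an EM E-step would compute.

```python
import math

def box_cox(y, lam):
    """Standard Box-Cox transform for y > 0 (assumed variant, not from the paper)."""
    if y <= 0:
        raise ValueError("standard Box-Cox requires y > 0")
    if abs(lam) < 1e-12:
        return math.log(y)  # limiting case lam -> 0
    return (y ** lam - 1.0) / lam

def t_logpdf(x, mu, sigma, nu):
    """Log-density of a univariate location-scale Student t distribution."""
    z = (x - mu) / sigma
    return (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
            - 0.5 * math.log(nu * math.pi) - math.log(sigma)
            - (nu + 1) / 2 * math.log1p(z * z / nu))

def boxcox_t_logpdf(y, mu, sigma, nu, lam):
    """Log-density of y when box_cox(y, lam) follows a t distribution."""
    x = box_cox(y, lam)
    # Change of variables: d/dy box_cox(y, lam) = y**(lam - 1)
    log_jacobian = (lam - 1.0) * math.log(y)
    return t_logpdf(x, mu, sigma, nu) + log_jacobian

def responsibilities(y, weights, params):
    """Posterior component probabilities for one observation (E-step quantity).

    weights: mixing proportions; params: list of (mu, sigma, nu, lam) tuples.
    """
    logs = [math.log(w) + boxcox_t_logpdf(y, *p) for w, p in zip(weights, params)]
    m = max(logs)  # subtract the max before exponentiating for stability
    exps = [math.exp(v - m) for v in logs]
    total = sum(exps)
    return [e / total for e in exps]

# Example: two hypothetical components, one observation.
probs = responsibilities(2.0, [0.6, 0.4], [(0.5, 1.0, 4.0, 0.5), (3.0, 1.0, 4.0, 0.5)])
```

A small `nu` gives heavy tails, so an outlying observation lowers a component's density less than it would under a normal model, while `lam` different from 1 makes the density asymmetric on the original scale.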
Pubmed
Web of science
Record created
28/02/2022 11:45
Record last modified
23/03/2024 7:24