Too many candidates: Embedded covariate selection procedure for species distribution modelling with the covsel R package
Details
Serval ID
serval:BIB_0C77A008BDD2
Type
Article: article from journal or magazine.
Collection
Publications
Title
Too many candidates: Embedded covariate selection procedure for species distribution modelling with the covsel R package
Journal
Ecological Informatics
ISSN
1574-9541
Publication state
Published
Issued date
2023
Peer-reviewed
Yes
Volume
75
Pages
102080
Language
English
Abstract
1. Selecting the best subset of covariates out of a panel of many candidates is a key and highly influential stage of the species distribution modelling process. Yet, there is currently no commonly accepted and widely adopted standard approach by which to perform this selection.
2. We introduce a two-step “embedded” covariate selection procedure aimed at optimizing the predictive ability and parsimony of species distribution models fitted in a context of high-dimensional candidate covariate space. The procedure combines a collinearity-filtering algorithm (Step A) with three model-specific embedded regularization techniques (Step B), including generalized linear model with elastic net regularization, generalized additive model with null-space penalization, and guided regularized random forest.
3. We evaluated the embedded covariate selection procedure through an example application aimed at modelling the habitat suitability of 50 species in Switzerland from a suite of 123 candidate covariates. We demonstrated the ability of the embedded covariate selection procedure to provide significantly more accurate species distribution models as compared to models obtained with alternative procedures. Model performance was independent of the characteristics of the species data, such as the number of occurrence records or their spatial distribution across the study area.
4. We implemented and streamlined our embedded covariate selection procedure in the covsel R package, paving the way for a ready-to-use, automated, covariate selection tool that was missing in the field of species distribution modelling. All the information required for installing and running the covsel R package is openly available on the GitHub repository https://github.com/N-SDM/covsel.
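The collinearity-filtering idea behind Step A can be illustrated with a minimal sketch. This is not the covsel implementation (covsel is an R package, and the abstract does not give its ranking criterion or threshold). The function names, the 0.7 correlation threshold, the use of absolute Pearson correlation with the response as the ranking score, and the toy covariates are all assumptions made for illustration.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def collinearity_filter(covariates, response, threshold=0.7):
    """Greedy filter (hypothetical): rank covariates by |r| with the response,
    then keep each one only if its |r| with every covariate already kept
    is below `threshold`."""
    ranked = sorted(
        covariates,
        key=lambda name: -abs(pearson(covariates[name], response)),
    )
    kept = []
    for name in ranked:
        if all(abs(pearson(covariates[name], covariates[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Toy data (assumed): two nearly identical covariates and one independent covariate.
random.seed(0)
n = 300
temp = [random.gauss(0, 1) for _ in range(n)]
temp_copy = [t + random.gauss(0, 0.05) for t in temp]  # near-duplicate of temp
precip = [random.gauss(0, 1) for _ in range(n)]
presence = [
    1.0 if t + 0.5 * p + random.gauss(0, 0.5) > 0 else 0.0
    for t, p in zip(temp, precip)
]

selected = collinearity_filter(
    {"temp": temp, "temp_copy": temp_copy, "precip": precip}, presence
)
print(selected)
```

In this sketch only one of the two near-duplicate covariates survives, while the independent covariate is retained. In the procedure described above, the retained set would then be passed to the embedded regularization techniques of Step B.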
Open Access
Yes
Create date
25/04/2023 8:25
Last modification date
27/04/2023 5:55