Effects of simulated observation errors on the performance of species distribution models

Details

Resource 1: Fernandes_et_al-2019-Diversity_and_Distributions.pdf (1,281.19 KB)
State: Serval
Version: Final published version
ID Serval
serval:BIB_B4715DCA6623
Type
Article: journal or magazine article.
Collection
Publications
Title
Effects of simulated observation errors on the performance of species distribution models
Journal
Diversity and Distributions
Author(s)
Fernandes R.F., Scherrer D., Guisan A.
ISSN
1472-4642
ISSN-L
1366-9516
Editorial status
Published
Publication date
2019
Peer-reviewed
Yes
Volume
25
Numéro
3
Pages
400–413
Language
English
Abstract
Aim: Species distribution information is essential under increasing global changes, and models can be used to acquire such information, but they can be affected by different errors/biases. Here, we evaluated the degree to which errors in species data (false presences–absences) affect model predictions and how this is reflected in commonly used evaluation metrics.
Location: Western Swiss Alps.
Methods: Using 100 virtual species and different sampling methods, we created observation datasets of different sizes (100–400–1,600) and added increasing levels of errors (creating false positives or negatives; from 0% to 50%). These degraded datasets were used to fit models using generalized linear models (GLM), random forests (RF) and boosted regression trees (BRT). Model fit (ability to reproduce calibration data) and predictive success (ability to predict the true distribution) were measured on probabilistic/binary outcomes using Kappa, TSS, MaxKappa, MaxTSS and Somers' D (rescaled AUC).
Results: The interpretation of model performance depended on the data and metrics used to evaluate it, with conclusions differing depending on whether model fit or predictive success was measured. Added errors reduced model performance, with effects expectedly decreasing as sample size increased. Model performance was more affected by false positives than by false negatives. Models built with different techniques were differently affected by errors: models with high fit presented lower predictive success (RF), and vice versa (GLM). High evaluation metrics could still be obtained with 30% error added, indicating that some metrics (Somers' D) might not be sensitive enough to detect data degradation.
Main conclusions: Our findings highlight the need to reconsider the interpretation scale of some commonly used evaluation metrics: Kappa seems more realistic than Somers' D/AUC or TSS. High fits were obtained with high levels of error added, showing that RF overfits the data. When collecting occurrence databases, it is advisable to reduce the rate of false positives (or increase sample sizes) rather than false negatives.
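For reference, the threshold-based metrics named in the abstract (Kappa, TSS) are standard functions of a binary confusion matrix; the sketch below is a minimal illustration of those definitions, not code from the paper, and all names and the toy data are illustrative:

```python
def confusion(obs, pred):
    """Return (tp, fp, fn, tn) for binary observed/predicted vectors (1 = presence)."""
    tp = sum(1 for o, p in zip(obs, pred) if o == 1 and p == 1)
    fp = sum(1 for o, p in zip(obs, pred) if o == 0 and p == 1)
    fn = sum(1 for o, p in zip(obs, pred) if o == 1 and p == 0)
    tn = sum(1 for o, p in zip(obs, pred) if o == 0 and p == 0)
    return tp, fp, fn, tn

def tss(tp, fp, fn, tn):
    """True Skill Statistic = sensitivity + specificity - 1."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1

def kappa(tp, fp, fn, tn):
    """Cohen's Kappa: observed agreement corrected for chance agreement."""
    n = tp + fp + fn + tn
    p_obs = (tp + tn) / n
    p_chance = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    return (p_obs - p_chance) / (1 - p_chance)

# Toy example: one false negative and one false positive in eight sites.
obs  = [1, 1, 1, 0, 0, 0, 1, 0]
pred = [1, 1, 0, 0, 0, 1, 1, 0]
tp, fp, fn, tn = confusion(obs, pred)
print(tss(tp, fp, fn, tn), kappa(tp, fp, fn, tn))  # both 0.5 here
```

Somers' D, as used in the abstract, is simply AUC rescaled to the [-1, 1] range (D = 2·AUC − 1), which is why the two are interchangeable as rank-based measures.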
Keywords
artificial data, AUC, ecological niche models, evaluation metric, habitat suitability models, Kappa, model fit, predictive accuracy, TSS, uncertainty
Web of Science
Open Access
Yes
Record created
28/09/2018 23:31
Record last modified
09/05/2019 0:03
Usage data