Effects of simulated observation errors on the performance of species distribution models
Details
Download: Fernandes_et_al-2019-Diversity_and_Distributions.pdf (1,281.19 KB)
Status: Public
Version: Final published version
Serval ID
serval:BIB_B4715DCA6623
Type
Article: article in a periodical or magazine.
Collection
Publications
Institution
Title
Effects of simulated observation errors on the performance of species distribution models
Journal
Diversity and Distributions
ISSN
1472-4642
ISSN-L
1366-9516
Editorial status
Published
Publication date
2019
Peer-reviewed
Yes
Volume
25
Issue
3
Pages
400–413
Language
English
Abstract
Aim: Species distribution information is essential under increasing global changes, and models can be used to acquire such information, but they can be affected by different errors/biases. Here, we evaluated the degree to which errors in species data (false presences–absences) affect model predictions and how this is reflected in commonly used evaluation metrics.
Location: Western Swiss Alps.
Methods: Using 100 virtual species and different sampling methods, we created observation datasets of different sizes (100–400–1,600) and added increasing levels of errors (creating false positives or negatives; from 0% to 50%). These degraded datasets were used to fit models using generalized linear models (GLM), random forests (RF) and boosted regression trees. Model fit (ability to reproduce calibration data) and predictive success (ability to predict the true distribution) were measured on probabilistic/binary outcomes using Kappa, TSS, MaxKappa, MaxTSS and Somers' D (rescaled AUC).
Results: The interpretation of models' performance depended on the data and metrics used to evaluate them, with conclusions differing depending on whether model fit or predictive success was measured. Added errors reduced model performance, with effects expectedly decreasing as sample size increased. Model performance was more affected by false positives than by false negatives. Models built with different techniques were differently affected by errors: models with high fit presented lower predictive success (RF), and vice versa (GLM). High evaluation metrics could still be obtained with 30% error added, indicating that some metrics (Somers' D) might not be sensitive enough to detect data degradation.
Main conclusions: Our findings highlight the need to reconsider the interpretation scale of some commonly used evaluation metrics: Kappa seems more realistic than Somers' D/AUC or TSS. High fits were obtained with high levels of error added, showing that RF overfits the data. When collecting occurrence databases, it is advisable to reduce the rate of false positives (or increase sample sizes) rather than false negatives.
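The abstract's core manipulation (injecting false positives into binary occurrence data) and its evaluation metrics (TSS, Kappa) can be illustrated with a minimal sketch. This is not the authors' code; the function names and the 30% degradation scenario are illustrative assumptions, using only the standard confusion-matrix definitions of the two metrics.

```python
import random

def add_false_positives(obs, rate, rng):
    """Flip a fraction `rate` of true absences (0) into false presences (1)."""
    out = list(obs)
    absences = [i for i, v in enumerate(out) if v == 0]
    for i in rng.sample(absences, int(round(rate * len(absences)))):
        out[i] = 1
    return out

def tss_and_kappa(truth, pred):
    """TSS and Cohen's Kappa from binary truth/prediction vectors."""
    tp = sum(t == 1 and p == 1 for t, p in zip(truth, pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(truth, pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(truth, pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(truth, pred))
    n = tp + tn + fp + fn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    tss = sensitivity + specificity - 1
    observed = (tp + tn) / n                     # observed agreement
    expected = ((tp + fp) * (tp + fn)            # chance agreement
                + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (observed - expected) / (1 - expected)
    return tss, kappa

# Hypothetical dataset: 50 true presences, 50 true absences,
# degraded with 30% false positives (as in the abstract's error levels).
rng = random.Random(42)
truth = [1] * 50 + [0] * 50
degraded = add_false_positives(truth, 0.3, rng)
tss, kappa = tss_and_kappa(truth, degraded)
```

Comparing the degraded dataset against the truth this way mimics the paper's "predictive success" evaluation: both metrics drop as the false-positive rate rises, which is the sensitivity the study probes.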
Keywords
artificial data, AUC, ecological niche models, evaluation metric, habitat suitability models, Kappa, model fit, predictive accuracy, TSS, uncertainty
Web of Science
Open Access
Yes
Record created
28/09/2018 22:31
Record last modified
20/08/2019 15:22