Effects of simulated observation errors on the performance of species distribution models

Details

Resource 1 Download: Fernandes_et_al-2019-Diversity_and_Distributions.pdf (1281.19 KB)
State: Public
Version: Final published version
Serval ID
serval:BIB_B4715DCA6623
Type
Article: article from journal or magazine.
Collection
Publications
Title
Effects of simulated observation errors on the performance of species distribution models
Journal
Diversity and Distributions
Author(s)
Fernandes R.F., Scherrer D., Guisan A.
ISSN
1472-4642
ISSN-L
1366-9516
Publication state
Published
Issued date
2019
Peer-reviewed
Yes
Volume
25
Number
3
Pages
400–413
Language
English
Abstract
Aim: Species distribution information is essential under increasing global changes, and models can be used to acquire such information, but they can be affected by different errors/biases. Here, we evaluated the degree to which errors in species data (false presences/absences) affect model predictions and how this is reflected in commonly used evaluation metrics.

Location: Western Swiss Alps.

Methods: Using 100 virtual species and different sampling methods, we created observation datasets of different sizes (100, 400, 1,600) and added increasing levels of errors (creating false positives or negatives; from 0% to 50%). These degraded datasets were used to fit models using generalized linear models (GLM), random forests (RF) and boosted regression trees. Model fit (ability to reproduce calibration data) and predictive success (ability to predict the true distribution) were measured on probabilistic/binary outcomes using Kappa, TSS, MaxKappa, MaxTSS and Somers' D (rescaled AUC).

Results: The interpretation of models' performance depended on the data and metrics used to evaluate them, with conclusions differing depending on whether model fit or predictive success was measured. Added errors reduced model performance, with effects expectedly decreasing as sample size increased. Model performance was more affected by false positives than by false negatives. Models built with different techniques were differently affected by errors: models with high fit presented lower predictive success (RF), and vice versa (GLM). High evaluation metrics could still be obtained with 30% error added, indicating that some metrics (Somers' D) might not be sensitive enough to detect data degradation.

Main conclusions: Our findings highlight the need to reconsider the interpretation scale of some commonly used evaluation metrics: Kappa seems more realistic than Somers' D/AUC or TSS. High fits were obtained with high levels of error added, showing that RF overfits the data. When collecting occurrence databases, it is advisable to reduce the rate of false positives (or increase sample sizes) rather than false negatives.
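The abstract evaluates models with threshold-based metrics (Kappa, TSS) and with Somers' D as a rescaled AUC. A minimal sketch of these standard formulas, assuming a 2×2 confusion matrix against the known true distribution (the function names and the example counts are illustrative, not taken from the paper):

```python
# Hedged sketch of the binary evaluation metrics named in the abstract,
# computed from a confusion matrix: tp/fp/fn/tn are true/false
# positives/negatives relative to the (here, virtual) true distribution.

def tss(tp, fp, fn, tn):
    """True Skill Statistic = sensitivity + specificity - 1."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1

def kappa(tp, fp, fn, tn):
    """Cohen's Kappa: observed agreement corrected for chance agreement."""
    n = tp + fp + fn + tn
    p_observed = (tp + tn) / n
    p_chance = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return (p_observed - p_chance) / (1 - p_chance)

def somers_d(auc):
    """Somers' D as the rescaled AUC used in the abstract: 2 * AUC - 1."""
    return 2 * auc - 1

# A balanced example: 80% sensitivity and 80% specificity on a 100-cell
# grid give TSS = Kappa = 0.6, and an AUC of 0.8 gives Somers' D = 0.6.
print(tss(40, 10, 10, 40), kappa(40, 10, 10, 40), somers_d(0.8))
```

Note that Kappa depends on prevalence while TSS and Somers' D do not, which is one reason the abstract finds the metrics disagree on how much added error degrades a model.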
Keywords
artificial data, AUC, ecological niche models, evaluation metric, habitat suitability models, Kappa, model fit, predictive accuracy, TSS, uncertainty
Web of Science
Open Access
Yes
Create date
28/09/2018 22:31
Last modification date
20/08/2019 15:22