How much should one sample to accurately predict the distribution of species assemblages? A virtual community approach

Fernandes, R.F.; Scherrer, D.; Guisan, A.

doi:10.1016/j.ecoinf.2018.09.002

Details

Download: Fernandes_etal2018_EcolInform.pdf (1478.04 KB)
State: Public
Version: author
License: Not specified

Serval ID

serval:BIB_4D83F2D7A03D

Type

Article: article from journal or magazine.

Collection

Publications

Institution

UNIL/CHUV

Title

How much should one sample to accurately predict the distribution of species assemblages? A virtual community approach

Journal

Ecological Informatics

Author(s)

Fernandes R.F., Scherrer D., Guisan A.

ISSN

1878-0512

ISSN-L

1574-9541

Publication state

Published

Issued date

2018

Peer-reviewed

Yes

Volume

Pages

125-134

Language

English

Abstract

Correlative species distribution models (SDMs) are widely used to predict species distributions and assemblages, with many fundamental and applied uses. Different factors have been shown to affect SDM prediction accuracy. However, real data cannot give unambiguous answers on these issues, and for this reason artificial data have been increasingly used in recent years. Here, we move one step further by assessing how different factors affect the prediction accuracy of virtual assemblages obtained by stacking individual SDM predictions (stacked SDMs, S-SDMs). We modelled 100 virtual species in a real study area, testing five factors: sample size (200, 800 and 3200), sampling method (nested, non-nested), sampling prevalence (25%, 50%, 75% and species true prevalence), modelling technique (GAM, GLM, BRT and RF) and thresholding method (ROC, MaxTSS and MaxKappa). We showed that the accuracy of S-SDM predictions is mostly affected by modelling technique, followed by sample size. Models fitted by GAM/GLM had higher accuracy and lower variance than BRT/RF. Model accuracy increased with sample size, and a sampling strategy reflecting the true prevalence of the species was most successful. However, even with sample sizes of more than 3000 sites, residual uncertainty remained in the predictions, potentially reflecting a bias introduced by creating and/or resampling the virtual species. Therefore, when evaluating the accuracy of predictions from S-SDMs fitted with real field data, one can hardly expect to reach perfect accuracy, and reasonably high values of similarity or predictive success can already be seen as valuable predictions. We recommend using a ‘plot-like’ sampling method (the best approximation of the species’ true prevalence) rather than simply increasing the number of species presences-absences. As presented here, virtual simulations might be used more systematically in future studies to inform about the best accuracy level that one could expect given the characteristics of the data and the methods used to fit and stack SDMs.
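
The design and stacking step summarised in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The factor levels are taken from the abstract, but the full crossing of all five factors is an assumption, as are the function names (design, max_tss_threshold, stack, sorensen) and the toy probabilities. The Sørensen index is used here only as one common assemblage similarity measure; the abstract does not state which measure the study used.

```python
from itertools import product

# Factor levels as listed in the abstract; crossing them fully is an assumption.
FACTORS = {
    "sample_size": [200, 800, 3200],
    "sampling_method": ["nested", "non-nested"],
    "sampling_prevalence": ["25%", "50%", "75%", "true"],
    "technique": ["GAM", "GLM", "BRT", "RF"],
    "threshold": ["ROC", "MaxTSS", "MaxKappa"],
}


def design(factors):
    """Enumerate every combination of factor levels as a list of dicts."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


def max_tss_threshold(probs, observed):
    """Return the cut-off maximising TSS = sensitivity + specificity - 1."""
    best_t, best_tss = 0.5, -1.0
    for t in sorted(set(probs)):
        tp = sum(1 for p, o in zip(probs, observed) if p >= t and o == 1)
        fn = sum(1 for p, o in zip(probs, observed) if p < t and o == 1)
        tn = sum(1 for p, o in zip(probs, observed) if p < t and o == 0)
        fp = sum(1 for p, o in zip(probs, observed) if p >= t and o == 0)
        if tp + fn == 0 or tn + fp == 0:
            continue  # TSS is undefined without both presences and absences
        tss = tp / (tp + fn) + tn / (tn + fp) - 1
        if tss > best_tss:
            best_t, best_tss = t, tss
    return best_t


def stack(prob_by_species, thresholds):
    """Binarise each species' probabilities and stack them into per-site sets."""
    n_sites = len(next(iter(prob_by_species.values())))
    return [
        {sp for sp, probs in prob_by_species.items() if probs[i] >= thresholds[sp]}
        for i in range(n_sites)
    ]


def sorensen(predicted, observed):
    """Sorensen similarity between a predicted and an observed assemblage."""
    if not predicted and not observed:
        return 1.0
    return 2 * len(predicted & observed) / (len(predicted) + len(observed))


# Toy data (hypothetical): two species, four sites.
probs = {"a": [0.1, 0.4, 0.6, 0.9], "b": [0.8, 0.2, 0.7, 0.1]}
truth = {"a": [0, 0, 1, 1], "b": [1, 0, 1, 0]}
cutoffs = {sp: max_tss_threshold(probs[sp], truth[sp]) for sp in probs}
assemblages = stack(probs, cutoffs)
```

Under the full-crossing assumption, design(FACTORS) yields 3 x 2 x 4 x 4 x 3 = 288 settings, and each predicted site assemblage can be compared with the known virtual assemblage through sorensen.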

Keywords

Virtual community ecologist, stacked species distribution models, nested design, factors importance, relative effects, sample size, sampling effect

URN

urn:nbn:ch:serval-BIB_4D83F2D7A03D4

OAI-PMH

oai:serval.unil.ch:BIB_4D83F2D7A03D

DOI

10.1016/j.ecoinf.2018.09.002

Web of science

000453641900013

Create date

03/09/2018 8:19