How to evaluate community predictions without thresholding?

Scherrer, D.; Mod, H.K.; Guisan, A.

Abstract

1. Stacked species distribution models (S-SDMs) provide a tool for making spatial predictions of communities by first modelling individual species and then stacking their predictions to form assemblages. Predictive performance is usually evaluated by comparing observed and predicted community properties (e.g. species richness, composition). However, most commonly used evaluation metrics require thresholding each species’ predicted probability of occurrence to obtain a binary outcome (i.e. presence/absence). This binarisation can introduce unnecessary bias and error.
2. Here, we present and demonstrate several groups of new or rarely used evaluation approaches and metrics for both species richness and community composition. None of them requires thresholding; instead, they directly compare the species’ predicted probabilities of occurrence with the presence/absence observations in the assemblages.
3. Community AUC, based on the traditional AUC, measures the ability of a model to differentiate between the species present and absent at a given site according to their predicted probabilities of occurrence. Summing the probabilities gives the expected species richness and allows estimating the probability that the observed species richness is not different from the expectation based on the species’ probabilities of occurrence. The traditional Sørensen and Jaccard similarity indices (based on presences/absences) are adapted to maxSørensen and maxJaccard and to probSørensen and probJaccard (using probabilities directly). A further approach (improvement over null-models) compares the S-SDM predictions with expectations from null-models to estimate the improvement achieved in both species richness and composition predictions. Additionally, all metrics can be described against the environmental conditions of sites (e.g. elevation) to highlight the ability of models to detect variation in the strength of community assembly processes in different environments.
4. These metrics offer an unbiased view of the performance of community predictions compared to metrics requiring thresholding. As such, they allow more straightforward comparisons of model performance among studies (i.e. not influenced by any subjective thresholding decision).
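
The threshold-free quantities described in point 3 can be illustrated for a single site with a minimal Python sketch. The abstract gives no formulas, so all function names and formulations here are assumptions and are not taken from the authors' code. Community AUC is computed as the rank-based AUC across species within one site. The richness check assumes independent species (a Poisson-binomial distribution) and uses a doubled smaller tail as one possible two-sided probability. `prob_sorensen` is one plausible "soft" version of the Sørensen index in which probabilities replace binary predictions.

```python
from itertools import product


def community_auc(probs, obs):
    """AUC across species at one site: the chance that a species observed
    present has a higher predicted probability than one observed absent
    (ties count as 0.5)."""
    present = [p for p, o in zip(probs, obs) if o == 1]
    absent = [p for p, o in zip(probs, obs) if o == 0]
    if not present or not absent:
        return float("nan")  # undefined without both classes
    wins = 0.0
    for p, a in product(present, absent):
        if p > a:
            wins += 1.0
        elif p == a:
            wins += 0.5
    return wins / (len(present) * len(absent))


def richness_pmf(probs):
    """Poisson-binomial distribution of richness, assuming independent
    species: pmf[k] = P(exactly k species present)."""
    pmf = [1.0]
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)   # species absent
            nxt[k + 1] += mass * p       # species present
        pmf = nxt
    return pmf


def richness_p_value(probs, observed_richness):
    """Assumed two-sided probability that the observed richness is
    compatible with the expectation (twice the smaller tail, capped at 1)."""
    pmf = richness_pmf(probs)
    lower = sum(pmf[: observed_richness + 1])
    upper = sum(pmf[observed_richness:])
    return min(1.0, 2.0 * min(lower, upper))


def prob_sorensen(probs, obs):
    """Assumed soft Sørensen: probabilities stand in for binary predictions,
    i.e. 2 * sum(p over observed presences) / (sum(p) + observed richness)."""
    shared = sum(p for p, o in zip(probs, obs) if o == 1)
    denom = sum(probs) + sum(obs)
    return 2.0 * shared / denom if denom > 0 else float("nan")


# Hypothetical site with four species
probs = [0.9, 0.8, 0.3, 0.1]   # predicted probabilities of occurrence
obs = [1, 1, 0, 0]             # observed presence/absence

expected_richness = sum(probs)
auc = community_auc(probs, obs)
p_val = richness_p_value(probs, sum(obs))
soft_sorensen = prob_sorensen(probs, obs)
```

In this hypothetical example both observed species outrank both absent ones, so the community AUC reaches its maximum. The expected richness is the plain sum of the probabilities, with no threshold involved at any step.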

SERVAL

Lausanne academic server
