Assessing the Performance of Random Forests for Modeling Claim Severity in Collision Car Insurance

Staudt, Y.; Wagner, J.

doi:10.3390/risks9030053

Details

Download: risks-09-00053.pdf (15390.56 [KB])
Status: Public
Version: Final published version
Licence: CC BY 4.0

Serval ID

serval:BIB_1FCA98100E54

Type

Article: article from a journal or magazine.

Collection

Publications

Institution

UNIL/CHUV

Title

Assessing the Performance of Random Forests for Modeling Claim Severity in Collision Car Insurance

Journal

Risks

Authors

Staudt Y., Wagner J.

Publication status

Published

Publication date

2021

Peer-reviewed

Yes

Volume

Issue

Language

English

Abstract

To calculate non-life insurance premiums, actuaries traditionally rely on separate severity and frequency models that use covariates to explain the claims loss exposure. In this paper, we focus on claim severity. First, we build two reference models, a generalized linear model and a generalized additive model, relying on a log-normal distribution of the severity and including the most significant factors. In doing so, we follow Henckaerts et al. (2018) to relate the continuous variables to the response in a nonlinear way. Second, we tune two random forest models, one for the claim severity and one for the log-transformed claim severity, where the latter requires a transformation of the predicted results. We compare the prediction performance of the different models using the relative error, the root mean squared error, and goodness-of-lift statistics in combination with goodness-of-fit statistics (Denuit et al. 2019). In our application, we rely on a dataset from a Swiss collision insurance portfolio covering the loss exposure for the period from 2011 to 2015 and comprising 81,309 settled claims with a total amount of CHF 184 million. In the analysis, we use the data from 2011 to 2014 for training and the data from 2015 for testing. Our results indicate that a log-normal transformation of the severity does not lead to performance gains with random forests. However, random forests with a log-normal transformation are the preferred choice for explaining right-skewed claims. Finally, when considering all indicators, we conclude that the generalized additive model has the best overall performance.
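
The abstract mentions that a model fitted on log-transformed severity needs its predictions transformed back, and that models are compared by relative error and root mean squared error. The abstract does not give the exact formulas, so the following is only a minimal sketch under stated assumptions: the back-transformation uses the standard log-normal mean correction exp(mu + sigma^2 / 2), the relative error is taken at portfolio level as (total predicted - total observed) / total observed, and all function names and toy numbers are hypothetical rather than taken from the paper.

```python
import math


def back_transform_log_predictions(log_preds, residual_variance):
    """Map predictions of log(severity) back to the original scale.

    Assumption: residuals on the log scale are roughly normal with constant
    variance, so E[Y] = exp(mu + sigma^2 / 2). Plain exp(mu) would estimate
    the median of a log-normal variable, not its mean.
    """
    return [math.exp(mu + residual_variance / 2.0) for mu in log_preds]


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted severities."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def relative_error(observed, predicted):
    """Assumed definition: (total predicted - total observed) / total observed."""
    total_observed = sum(observed)
    return (sum(predicted) - total_observed) / total_observed


# Hypothetical toy values, not data from the paper.
observed = [1200.0, 800.0, 3500.0, 950.0]
log_preds = [7.0, 6.6, 8.0, 6.8]   # predictions on the log scale
sigma2 = 0.25                      # assumed residual variance on the log scale

predicted = back_transform_log_predictions(log_preds, sigma2)
print("RMSE:", rmse(observed, predicted))
print("Relative error:", relative_error(observed, predicted))
```

In a setup like the one described, these functions would be applied to the held-out 2015 claims after fitting on 2011 to 2014; the variance term would typically be estimated from the training residuals on the log scale.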

Keywords

regression model, data-driven binning, random forest, performance analysis, severity modeling

URN

urn:nbn:ch:serval-BIB_1FCA98100E543

OAI-PMH

oai:serval.unil.ch:BIB_1FCA98100E54

DOI

10.3390/risks9030053

Open Access

Yes

Record created

09/03/2021 15:47

Record last modified

17/03/2021 7:08
