A model for the evolution of reinforcement learning in fluctuating games

Dridi, S.; Lehmann, L.

doi:10.1016/j.anbehav.2015.01.037

Details

Download: 15RLvsBL.pdf (1757.81 KB)
Status: Public
Version: Author's version

Serval ID

serval:BIB_AC26C188AEA1

Type

Article: article from a journal or magazine.

Collection

Publications

Institution

UNIL/CHUV

Title

A model for the evolution of reinforcement learning in fluctuating games

Journal

Animal Behaviour

Authors

Dridi S., Lehmann L.

ISSN

1095-8282 (ISSN)

ISSN-L

0003-3472

Editorial status

Published

Publication date

2015

Volume

104

Pages

87-114

Language

English

Abstract

Many species are able to learn to associate behaviours with rewards as this gives fitness advantages in changing environments. Social interactions between population members may, however, require more cognitive abilities than simple trial-and-error learning, in particular the capacity to make accurate hypotheses about the material payoff consequences of alternative action combinations. It is unclear in this context whether natural selection necessarily favours individuals to use information about payoffs associated with nontried actions (hypothetical payoffs), as opposed to simple reinforcement of realized payoff. Here, we develop an evolutionary model in which individuals are genetically determined to use either trial-and-error learning or learning based on hypothetical reinforcements, and ask what is the evolutionarily stable learning rule under pairwise symmetric two-action stochastic repeated games played over the individual's lifetime. We analyse through stochastic approximation theory and simulations the learning dynamics on the behavioural timescale, and derive conditions where trial-and-error learning outcompetes hypothetical reinforcement learning on the evolutionary timescale. This occurs in particular under repeated cooperative interactions with the same partner. By contrast, we find that hypothetical reinforcement learners tend to be favoured under random interactions, but stable polymorphisms can also obtain where trial-and-error learners are maintained at a low frequency. We conclude that specific game structures can select for trial-and-error learning even in the absence of costs of cognition, which illustrates that cost-free increased cognition can be counterselected under social interactions.
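
The contrast between the two learning rules named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' model: the abstract does not give the update equations, so the payoff matrix `PAYOFF`, the learning rate `rate`, the logit choice rule in `choose_action`, and all function names are assumptions made for illustration only. The sketch shows a trial-and-error learner reinforcing only the action it played, while a hypothetical reinforcement learner also updates the nontried action with the payoff it would have earned against the partner's observed action.

```python
import math
import random

# Hypothetical payoffs for a symmetric two-action game (values chosen only
# for illustration): PAYOFF[own_action][opponent_action].
PAYOFF = [[3.0, 0.0],
          [5.0, 1.0]]


def update_motivations(motivations, own_action, opp_action, rule, rate=0.1):
    """Return the motivations after one round of play.

    rule="trial_and_error": only the action actually played is reinforced,
    using its realized payoff.
    rule="hypothetical": every action is reinforced with the payoff it would
    have earned against the opponent's observed action.
    """
    if rule not in ("trial_and_error", "hypothetical"):
        raise ValueError(f"unknown rule: {rule}")
    updated = list(motivations)
    for action in range(len(motivations)):
        if rule == "trial_and_error" and action != own_action:
            continue  # nontried actions receive no reinforcement
        payoff = PAYOFF[action][opp_action]
        # Move the motivation a fraction `rate` towards the payoff.
        updated[action] += rate * (payoff - updated[action])
    return updated


def choose_action(motivations, sensitivity=1.0, rng=random):
    """Pick an action with probability proportional to exp(sensitivity * m)."""
    weights = [math.exp(sensitivity * m) for m in motivations]
    return rng.choices(range(len(motivations)), weights=weights)[0]


# One round in which the focal individual plays action 0 against action 0.
start = [0.0, 0.0]
print(update_motivations(start, 0, 0, "trial_and_error", rate=0.5))
print(update_motivations(start, 0, 0, "hypothetical", rate=0.5))
```

Under these assumed payoffs, the hypothetical learner raises its motivation for the nontried action 1 after a single round, whereas the trial-and-error learner leaves it unchanged until action 1 is actually played.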

Keywords

evolution of cognition, evolutionarily stable learning rules, exploration-exploitation trade-off, repeated games, social interactions, trial-and-error learning

URN

urn:nbn:ch:serval-BIB_AC26C188AEA18

OAI-PMH

oai:serval.unil.ch:BIB_AC26C188AEA1

DOI

10.1016/j.anbehav.2015.01.037

Web of Science

000354811800014

Open Access

Yes

Record created

06/11/2017 11:39