Model-based RL criteria for exploration term

Re: Model-based RL criteria for exploration term

by Ariane Delrocq
The criterion is actually discussed in the notes for slide 13:

The exploration term should be such that it decreases with N(s,a) (actions with low N(s,a) should be explored) and increases slowly with N(s) (if s is visited often, we want to be really sure that none of the less taken actions would in fact be optimal; an increase in N(s) drives occasional re-exploration).
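
To make the two requirements concrete, here is a minimal sketch of one bonus that satisfies them. It uses a UCB-style form, sqrt(log(1 + N(s)) / N(s,a)), which is my assumption and not necessarily the term used in the slides; the function name `exploration_bonus` and the scale `c` are also hypothetical.

```python
import math


def exploration_bonus(n_s: int, n_sa: int, c: float = 1.0) -> float:
    """UCB-style exploration bonus (an assumed form, not the course's definition).

    n_s  : N(s), number of visits to state s
    n_sa : N(s,a), number of times action a was taken in s
    c    : hypothetical scale factor for the bonus
    """
    if n_sa == 0:
        # An action never tried gets an unbounded bonus, so it is tried first.
        return math.inf
    # Decreases with N(s,a); grows only logarithmically (slowly) with N(s).
    return c * math.sqrt(math.log(1 + n_s) / n_sa)


# A rarely taken action gets a larger bonus than a frequently taken one.
rare = exploration_bonus(n_s=100, n_sa=1)
common = exploration_bonus(n_s=100, n_sa=10)

# Visiting s ten times more often raises the bonus of a fixed action only a little.
later = exploration_bonus(n_s=1000, n_sa=10)
```

With this form, an action whose N(s,a) stays fixed while N(s) keeps growing eventually has its bonus rise enough to be selected again, which is the occasional re-exploration mentioned above.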