Model-based RL criteria for exploration term

Re: Model-based RL criteria for exploration term

by Ariane Delrocq
The criterion is actually discussed in the notes for slide 13:

The exploration term should decrease with N(s,a) (actions with low N(s,a) should be explored) and increase slowly with N(s): if s is visited often, we want to be really sure that none of the less-taken actions would in fact be optimal, so an increase in N(s) drives occasional re-exploration.
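
To make the two requirements concrete, here is a minimal sketch of one bonus that satisfies them. It uses a UCB-style form, c * sqrt(ln N(s) / N(s,a)), which is an assumption chosen for illustration and not necessarily the exact term used on the slides; the function name `exploration_bonus` and the constant `c` are likewise hypothetical.

```python
import math


def exploration_bonus(n_s: int, n_sa: int, c: float = 1.0) -> float:
    """UCB-style exploration bonus (illustrative assumption, not the slides' formula).

    n_s  -- N(s): number of visits to state s
    n_sa -- N(s,a): number of times action a was taken in s
    c    -- hypothetical scaling constant
    """
    if n_sa == 0:
        # An action never tried in s gets the largest possible bonus.
        return math.inf
    # Decreases with N(s,a); grows only logarithmically (slowly) with N(s).
    return c * math.sqrt(math.log(n_s) / n_sa)


if __name__ == "__main__":
    # Same state, rarely taken vs. frequently taken action.
    print(exploration_bonus(100, 5), exploration_bonus(100, 50))
    # Same N(s,a), but the state has been visited ten times more often.
    print(exploration_bonus(1000, 5))
```

With this form, a rarely taken action keeps a larger bonus than a frequently taken one, and the bonus of a neglected action creeps back up as N(s) grows, which is what triggers the occasional re-exploration.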