Hello,
The project description says we need to create an Agent class with the following functions:
- observe(self, state, action, next_state, reward) : called upon observing a new transition of the environment.
- select_action(self, state) : pick an action from the given state.
- update(self) : called after each environment step. This is where all the training takes place.
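For context, here is a minimal sketch of how I currently understand the interface (the method names and docstrings come from the project description; the base-class bodies are just my assumption):

```python
class Agent:
    """Base class sketch for the interface described above."""

    def observe(self, state, action, next_state, reward):
        """Called upon observing a new transition of the environment."""
        raise NotImplementedError

    def select_action(self, state):
        """Pick an action from the given state."""
        raise NotImplementedError

    def update(self):
        """Called after each environment step; all training happens here."""
        raise NotImplementedError


class RandomAgent(Agent):
    """Example concrete agent: ignores observations, acts uniformly at random."""

    def __init__(self, n_actions, seed=0):
        import random
        self.n_actions = n_actions
        self.rng = random.Random(seed)

    def observe(self, state, action, next_state, reward):
        pass  # a random policy has nothing to learn from transitions

    def select_action(self, state):
        return self.rng.randrange(self.n_actions)

    def update(self):
        pass  # no training step for a random policy
```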
1. I am not sure how the observe function is supposed to work. If we already pass in state, action, next_state, and reward as arguments, what is left for it to observe?
2. Should we create separate classes for the different algorithms (DQN, Dyna, and Random), since the update and select_action functions have to be implemented differently for each of them?
Thank you.