This is the first project in the Udacity Deep Reinforcement Learning Nanodegree. It is based on an implementation of a Deep Q-Network (DQN) model. The environment is modeled in Unity, and the task is to train an agent to collect yellow bananas (reward of +1) while avoiding non-yellow ones (reward of -1).
The simulation contains a single agent that navigates a large environment. At each time step, it has four actions at its disposal:
- 0: walk forward
- 1: walk backward
- 2: turn left
- 3: turn right
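As a minimal sketch of how an agent might choose among these four actions, the snippet below names the action indices and applies epsilon-greedy selection. Epsilon-greedy is a common exploration strategy for DQN agents, but this README does not state which strategy the notebook uses, so treat it as an assumption. The names `Action` and `epsilon_greedy` and the `q_values` input are hypothetical, introduced only for illustration.

```python
import random
from enum import IntEnum


class Action(IntEnum):
    """The four discrete actions, indexed as in the list above."""
    FORWARD = 0
    BACKWARD = 1
    LEFT = 2
    RIGHT = 3


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one.

    q_values: one estimated value per action (hypothetical network output).
    rng: the random module or a random.Random instance (assumption, for
         reproducible runs).
    """
    if len(q_values) != len(Action):
        raise ValueError("expected one Q-value per action")
    if rng.random() < epsilon:
        # Explore: choose uniformly among the four actions.
        return Action(rng.randrange(len(Action)))
    # Exploit: index of the largest Q-value.
    best = max(range(len(q_values)), key=lambda i: q_values[i])
    return Action(best)
```

With `epsilon=0.0` the function always returns the action with the highest estimated value; with `epsilon=1.0` it always explores.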
The state space has 37 dimensions and contains the agent's velocity, along with ray-based perception of objects around the agent's forward direction. A reward of +1 is provided for collecting a yellow banana, and a reward of -1 is provided for collecting a blue banana.
The environment is considered solved when the average reward (over the last 100 episodes) is at least +13.
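The solve criterion can be checked with a rolling window over episode scores. The sketch below is one possible way to do it, not necessarily how the notebook does; the `SolveTracker` class is hypothetical, and requiring a full 100-episode window before declaring success is an assumption.

```python
from collections import deque

SOLVE_SCORE = 13.0  # target average reward stated above
WINDOW = 100        # number of most recent episodes to average


class SolveTracker:
    """Tracks episode scores and reports when the target average is met."""

    def __init__(self, window=WINDOW, target=SOLVE_SCORE):
        # deque with maxlen drops the oldest score automatically.
        self.scores = deque(maxlen=window)
        self.target = target

    def add(self, episode_score):
        self.scores.append(episode_score)

    def average(self):
        return sum(self.scores) / len(self.scores) if self.scores else 0.0

    def solved(self):
        # Assumption: require a full window so a few lucky early
        # episodes do not count as solving the environment.
        full = len(self.scores) == self.scores.maxlen
        return full and self.average() >= self.target
```

In a training loop, one would call `add()` with the total reward at the end of each episode and stop once `solved()` returns `True`.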
See the instructions here to set up your environment.
It also requires Unity ML-Agents, NumPy, and PyTorch.
Get the environment matching your OS:
- Linux: click here
- Mac OSX: click here
- Windows (32-bit): click here
- Windows (64-bit): click here
Reference the environment file by its full path. Note that Banana.app is already included in this repo, so it can be loaded with:
from unityagents import UnityEnvironment
env = UnityEnvironment(file_name="Banana.app")
Then run the navigation_banana.ipynb notebook using the drlnd kernel to train the DQN agent.
After training the model, its parameters are saved to checkpoint.pth, which the trained agent then loads.