Comments (20)
What OS are you running this on? I just ran it and it works fine for me.
[2017-08-01 11:30:30,111] Making new env: Pong-v0
[2017-08-01 11:30:30,336] Clearing 6 monitor files from previous run (because force=True was provided)
[2017-08-01 11:30:30,345] Starting new video recorder writing to /Users/dgriffis/rl_a3c_pytorch/Pong-v0_monitor/openaigym.video.0.38559.video000000.mp4
2017-08-01 11:30:42,785 : reward sum: 21.0, reward mean: 21.0000
[2017-08-01 11:30:42,785] reward sum: 21.0, reward mean: 21.0000
[2017-08-01 11:30:42,804] Starting new video recorder writing to /Users/dgriffis/rl_a3c_pytorch/Pong-v0_monitor/openaigym.video.0.38559.video000001.mp4
2017-08-01 11:30:55,304 : reward sum: 21.0, reward mean: 21.0000
[2017-08-01 11:30:55,304] reward sum: 21.0, reward mean: 21.0000
2017-08-01 11:31:07,255 : reward sum: 21.0, reward mean: 21.0000
[2017-08-01 11:31:07,255] reward sum: 21.0, reward mean: 21.0000
2017-08-01 11:31:19,209 : reward sum: 21.0, reward mean: 21.0000
[2017-08-01 11:31:19,209] reward sum: 21.0, reward mean: 21.0000
2017-08-01 11:31:31,044 : reward sum: 21.0, reward mean: 21.0000
[2017-08-01 11:31:31,044] reward sum: 21.0, reward mean: 21.0000
2017-08-01 11:31:43,474 : reward sum: 21.0, reward mean: 21.0000
[2017-08-01 11:31:43,474] reward sum: 21.0, reward mean: 21.0000
2017-08-01 11:31:55,597 : reward sum: 21.0, reward mean: 21.0000
[2017-08-01 11:31:55,597] reward sum: 21.0, reward mean: 21.0000
2017-08-01 11:32:07,620 : reward sum: 21.0, reward mean: 21.0000
[2017-08-01 11:32:07,620] reward sum: 21.0, reward mean: 21.0000
[2017-08-01 11:32:07,628] Starting new video recorder writing to /Users/dgriffis/rl_a3c_pytorch/Pong-v0_monitor/openaigym.video.0.38559.video000008.mp4
2017-08-01 11:32:20,379 : reward sum: 21.0, reward mean: 21.0000
[2017-08-01 11:32:20,379] reward sum: 21.0, reward mean: 21.0000
from rl_a3c_pytorch.
I am running macOS; why do I get -21 all the time? Which command are you using?
➜ rl_a3c_pytorch git:(master) ✗ python gym_eval.py --env Pong-v0 --num-episodes 100
[2017-08-02 09:06:09,852] Making new env: Pong-v0
[2017-08-02 09:06:10,107] Clearing 6 monitor files from previous run (because force=True was provided)
[2017-08-02 09:06:10,145] Starting new video recorder writing to /Volumes/xs/CodeSpace/AISpace/rl_space/rl_a3c_pytorch/Pong-v0_monitor/openaigym.video.0.35879.video000000.mp4
2017-08-02 09:06:20,499 : reward sum: -21.0, reward mean: -21.0000
[2017-08-02 09:06:20,499] reward sum: -21.0, reward mean: -21.0000
[2017-08-02 09:06:20,529] Starting new video recorder writing to /Volumes/xs/CodeSpace/AISpace/rl_space/rl_a3c_pytorch/Pong-v0_monitor/openaigym.video.0.35879.video000001.mp4
2017-08-02 09:06:30,942 : reward sum: -21.0, reward mean: -21.0000
[2017-08-02 09:06:30,942] reward sum: -21.0, reward mean: -21.0000
I trained the whole night, but when I stopped it, nothing was saved; I cannot find any saved model.
Well, first I would update the repo, because I tinkered a lot with it the past couple of days, but I know it's working fine now. Are you seeing the models in the trained_models folder?
Oh, is this your trained model? Are you seeing a saved model in the folder, or any models? There should be a Pong-v0.dat file.
Yeah, I saw it, but it seems this model is one you already trained in your repo, because besides Pong there are other models. Anyway, how exactly should I load my model and render the env at the same time to see the AI play?
Well, I have set it up so models save to the trained_models folder and load from there. If you want to watch with gym_eval you have to run:
python gym_eval.py --env Pong-v0 --num-episodes 100 --render True
Well, I got this old-fashioned black-and-white screen, and the result is still -21. Did the model not update?
That looks like you are having dependency issues with gym.
Can you go into a terminal, start Python, and type:
import gym
import cv2

env = gym.make('Pong-v0')
frame = env.reset()
cv2.imshow('tt', frame)
cv2.waitKey(0)
Let me know what you see from that.
Well, that's weird. I also have gym on Python 3 and it works totally fine, but on Python 2.7 it shows like this, whether using cv2 or just env.render(). Should I update this code to Python 3?
There is a problem saving the model: I updated the default save dir in main.py to trained_models_me, but when I stopped training, my directory was never created.
You have to create the directory first if you are not using the trained_models folder. I did not set it up to create save directories automatically.
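If you point training at a custom folder, you can create it up front yourself. A minimal sketch, assuming a hypothetical custom directory name (trained_models_me is just the example from this thread, not the repo's default):

```python
import os

# Hypothetical custom save directory; the repo only ships trained_models/
save_dir = 'trained_models_me/'

# Create the folder (and any missing parents) if it does not exist yet;
# exist_ok=True makes this safe to call on every run (Python 3.2+).
os.makedirs(save_dir, exist_ok=True)
```

Using exist_ok=True avoids the race-prone "check with os.path.isdir, then create" pattern.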
Yeah, try that same code in Python 3 and see if a picture of the Atari screen comes up.
Thanks dgriff, you are a master in reinforcement learning.
It's working now?! You're welcome. Happy to help!
Yeah, thanks a lot for your help, pal.
Awesome! Have fun!
Hi dgriff, sorry for the bother, but I have one last question: in train.py I can't find the code that saves the model. I am new to PyTorch; is there a way to store the weights in a specific dir and load them on the next run?
When running the training command you can do:
python main.py --env Pong-v0 --workers 32 --save-dir 'example_folder/'
And to load from a specific folder:
python main.py --env Pong-v0 --workers 32 --load-dir 'example_folder/' --load True
You can also specify both in one command.
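Flags like --save-dir and --load-dir are typically wired up with argparse. A hedged sketch of how such options might be defined (the names mirror the commands above, not necessarily the repo's exact code; the parse_args list simulates the command line):

```python
import argparse

# Illustrative option definitions, not the repo's exact argument list
parser = argparse.ArgumentParser(description='A3C training options (sketch)')
parser.add_argument('--env', default='Pong-v0', help='gym environment to train on')
parser.add_argument('--workers', type=int, default=32, help='number of training processes')
parser.add_argument('--save-dir', default='trained_models/', help='folder to save models into')
parser.add_argument('--load-dir', default='trained_models/', help='folder to load a model from')
parser.add_argument('--load', type=bool, default=False, help='resume from a saved model')

args = parser.parse_args(['--env', 'Pong-v0', '--workers', '32', '--load', 'True'])
print(args.env, args.workers, args.load)  # Pong-v0 32 True
```

One caveat with type=bool: argparse applies bool() to the raw string, so any non-empty value ("True", "False", even "0") parses as True; only an empty string gives False.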
The loading code used during training is in main.py:
if args.load:
    saved_state = torch.load(
        '{0}{1}.dat'.format(args.load_model_dir, args.env))
The model-saving code is in test.py:
if reward_sum > args.save_score_level:
    player.model.load_state_dict(shared_model.state_dict())
    state_to_save = player.model.state_dict()
    torch.save(state_to_save, '{0}{1}.dat'.format(
        args.save_model_dir, args.env))
And the model-loading code is in gym_eval.py:
saved_state = torch.load(
    '{0}{1}.dat'.format(args.load_model_dir, args.env),
    map_location=lambda storage, loc: storage)
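The pattern above is just "serialize the state dict to '<dir><env>.dat'". A self-contained sketch of the same path handling, with pickle standing in for torch.save/torch.load so it runs without PyTorch (function names and the demo "weights" are illustrative, not from the repo):

```python
import os
import pickle
import tempfile

def save_model(state_dict, save_model_dir, env):
    # Same naming scheme as the repo: '<save_model_dir><env>.dat'
    path = '{0}{1}.dat'.format(save_model_dir, env)
    with open(path, 'wb') as f:
        pickle.dump(state_dict, f)  # torch.save(state_to_save, path) in the real code
    return path

def load_model(load_model_dir, env):
    path = '{0}{1}.dat'.format(load_model_dir, env)
    with open(path, 'rb') as f:
        return pickle.load(f)  # torch.load(path, ...) in the real code

# Demo with a fake "state dict" of weights
model_dir = tempfile.mkdtemp() + os.sep
save_model({'conv1.weight': [0.1, 0.2]}, model_dir, 'Pong-v0')
print(load_model(model_dir, 'Pong-v0'))  # {'conv1.weight': [0.1, 0.2]}
```

In the real gym_eval.py code, map_location=lambda storage, loc: storage additionally remaps GPU-saved tensors onto the CPU at load time, so a model trained with CUDA can be evaluated on a CPU-only machine.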