Comments (6)
We are not planning to implement it for now, but some people do suggest that PyTorch may be faster than TF. It would be great if someone could implement GA3C in PyTorch following our guidelines.
from ga3c.
I did a quick trial in one of my branches. In my test, TF is actually almost twice as fast, probably because the naive way I wrote the vectorized loss involves a lot of function calls. The same issue arises in the Chainer version. The loss takes almost more time to compute than the CNN. I think it could run faster if the loss were implemented as a dedicated layer.
from ga3c.
Just FYI, my friend was able to reproduce both the speed and performance of my A3C implementation with his PyTorch code.
It batches data differently from GA3C, but the overall structure is similar.
from ga3c.
Interesting, @ppwwyyxx!
I am not sure the problem is in the batching; I suspect it is rather the explicit calls and the many computation steps for the loss.
My naive implementation looks something like this:
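# Forward pass: policy logits p and value estimates v
p, v = self.model.forward_multistep(x_, c, h)
probs = F.softmax(p, dim=1)
# Clamp rather than relu(probs - eps), so that log never receives an exact zero
probs = torch.clamp(probs, min=Config.LOG_EPSILON)
log_probs = torch.log(probs)
adv = rewards - v
adv = torch.masked_select(adv, mask)
log_probs_a = torch.masked_select(log_probs, a)  # we cannot use it because of variable length input
# detach() stops the policy gradient from flowing through the advantage
piloss = -torch.sum(log_probs_a * adv.detach(), 0)
# sum(p * log p) is the negative entropy; adding it to the loss rewards exploration
entropy = torch.sum(torch.sum(log_probs * probs, 1), 0) * self.beta
vloss = torch.sum(adv.pow(2), 0) / 2
loss = piloss + entropy + vloss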
Does anyone know how to do this more quickly in PyTorch?
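One possible direction, as a sketch only and not benchmarked: keep every tensor at a fixed padded shape and multiply by a 0/1 mask instead of calling masked_select, pick the chosen action with gather, and use log_softmax in place of softmax, clamp and log. The function name a3c_loss, its argument layout, the integer action indices and the float mask are all assumptions made for this illustration; they are not taken from the GA3C code. Unlike the snippet above, this version also masks the entropy term and drops the epsilon clamp, relying on log_softmax for numerical stability.

```python
import torch
import torch.nn.functional as F


def a3c_loss(logits, values, rewards, actions, mask, beta=0.01):
    """Hypothetical fused A3C loss over a padded batch.

    logits:  (N, A) float, raw policy outputs
    values:  (N,)   float, value estimates
    rewards: (N,)   float, discounted returns
    actions: (N,)   long, index of the action taken (any valid index on padded rows)
    mask:    (N,)   float, 1.0 for real steps and 0.0 for padding
    """
    # log_softmax replaces softmax + clamp + log with one numerically stable op
    log_probs = F.log_softmax(logits, dim=1)
    probs = log_probs.exp()

    # gather picks log pi(a|s) per row without changing the tensor shape
    log_probs_a = log_probs.gather(1, actions.unsqueeze(1)).squeeze(1)

    # Padded rows get a zero advantage, so they add nothing to either loss term
    adv = (rewards - values) * mask

    piloss = -(log_probs_a * adv.detach()).sum()
    # sum(p * log p) is the negative entropy; masked so padding is ignored
    neg_entropy = ((probs * log_probs).sum(dim=1) * mask).sum() * beta
    vloss = adv.pow(2).sum() / 2
    return piloss + neg_entropy + vloss
```

With a batch padded to N rows, the call would look like loss = a3c_loss(p, v, rewards, action_indices, mask.float()), followed by loss.backward(). Whether this is actually faster than the masked_select version would need to be measured.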
from ga3c.
@ppwwyyxx Is there a public git repo for your friend's PyTorch implementation?
from ga3c.
Unfortunately no..
from ga3c.