- 👋 Hi, I'm @Eladkay
- 👀 I'm interested in programming language theory, synthesis of software and compilation.
- 🌱 I'm currently studying for my MSc in Computer Science at Technion - IIT.
- 💞️ I'm also the TA in Charge of Introduction to Systems Programming and Reverse Engineering.
- 📫 You can reach me at elad (at) eladkay.com.
Bugs and comments on WidowxSimulator
First, in the `step` method, line 45 (as of commit 96eb5c9) should be:
(self.pos[0] + change[0], self.pos[1] + change[1])
Additional issues in the simulator:
- Selecting a direction between -pi and pi is not continuous: both edges of the action space point in the same direction. It is better to select a 2D action (delta_x, delta_y) in the first place (it can be multiplied by the same scaling factor, `step_size`, if necessary).
- The design choice of zeroing out actions that exceed the bounds is also inconsistent; clipping is better practice.
- The "first round bonus" added to the reward for the first 5000 steps does not seem to make sense. I see that its flag is turned off, but what is the idea behind it?
- Penalty for repeating actions is not necessarily a good thing - it may be required to take the same action many times to reach the target, under the current action space. Additionally, it is possible to reach the target and receive different rewards for it every time, depending on how many repetitions of the last action there were - which might hinder learning.
- `self.repetitions` and `self.last_action` are never reset.
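
The action-space, clipping, and reset points above can be sketched as follows. This is a minimal illustration and not the actual WidowxSimulator code: the class `PointSim`, its workspace bounds, and the default `step_size` are hypothetical, and it assumes each action component is meant to lie in [-1, 1] and that the repetition state should be cleared once per episode.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Vec2 = Tuple[float, float]


def clip(value: float, low: float, high: float) -> float:
    """Clamp value into the closed interval [low, high]."""
    return max(low, min(high, value))


@dataclass
class PointSim:
    """Hypothetical stand-in for the simulator, for illustration only."""
    step_size: float = 0.1   # assumed scaling factor
    low: float = -1.0        # assumed workspace bounds
    high: float = 1.0
    pos: Vec2 = (0.0, 0.0)
    repetitions: int = 0
    last_action: Optional[Vec2] = None

    def reset(self) -> Vec2:
        # Clear per-episode state so repetition counts do not leak across episodes.
        self.pos = (0.0, 0.0)
        self.repetitions = 0
        self.last_action = None
        return self.pos

    def step(self, action: Vec2) -> Vec2:
        # Clip out-of-range action components instead of zeroing them.
        action = (clip(action[0], -1.0, 1.0), clip(action[1], -1.0, 1.0))
        change = (action[0] * self.step_size, action[1] * self.step_size)
        # Update both coordinates, keeping the position inside the workspace.
        self.pos = (
            clip(self.pos[0] + change[0], self.low, self.high),
            clip(self.pos[1] + change[1], self.low, self.high),
        )
        # Track consecutive repeats of the same (clipped) action.
        if action == self.last_action:
            self.repetitions += 1
        else:
            self.repetitions = 0
        self.last_action = action
        return self.pos
```

With a (delta_x, delta_y) action there is no wrap-around at the edges of the action space, and an oversized action such as `(2.0, -0.5)` still moves the agent in roughly the intended direction rather than being discarded.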
Unrelated issues (these comments concern the current parameters in the config file, though some may apply more generally):
- 500K learning-starts steps out of 1M training steps is far too many, especially since training only happens every 20 episodes (20*75=1500 steps, so there are very few training steps in total compared to the amount of data).
- Your learning rate scheduler increases the learning rate instead of decreasing it: your initial learning rate is 0 (since the progress-remaining parameter goes from 1 to 0). Even if you invert it, there is no need for the learning rate to go all the way to 0 at the end of the run.
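
A decreasing schedule with a non-zero floor could look like the sketch below. It assumes the training library calls the schedule with a progress-remaining value that goes from 1 at the start to 0 at the end, as described above (Stable-Baselines3 accepts callables of this shape for `learning_rate`; whether this project uses it is an assumption). The name `linear_schedule` and the example rates are hypothetical.

```python
from typing import Callable


def linear_schedule(initial_lr: float, final_lr: float) -> Callable[[float], float]:
    """Return a schedule that decays linearly from initial_lr to final_lr."""

    def schedule(progress_remaining: float) -> float:
        # progress_remaining is 1.0 at the start of training and 0.0 at the end,
        # so the rate starts at initial_lr and ends at final_lr (not at 0).
        return final_lr + progress_remaining * (initial_lr - final_lr)

    return schedule


# Example values only; suitable rates depend on the algorithm and task.
lr_schedule = linear_schedule(initial_lr=3e-4, final_lr=3e-5)
```

Multiplying the base rate by `1 - progress_remaining` instead would produce the inverted behaviour described above: a rate of 0 at the start that grows over the run.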