
eladkay / widowxrl

Python 100.00%

widowxrl's Introduction

👋 Hi, I’m @Eladkay
👀 I’m interested in programming language theory, synthesis of software and compilation.
🌱 I’m currently studying for my MSc. in Computer Science at Technion - IIT.
💞️ I’m also the TA in Charge of Introduction to Systems Programming and Reverse Engineering.
📫 You can reach me at elad (at) eladkay.com.

widowxrl's Issues

Bugs and comments on WidowxSimulator

First, there is a bug in the step method, at line 45:

WidowXRL/widowx_simulator.py

Line 45 in 96eb5c9

self.pos = (self.pos[0] + change[0], self.pos[0] + change[1])

The second component uses self.pos[0] where it should use self.pos[1]. The line should be:

self.pos = (self.pos[0] + change[0], self.pos[1] + change[1])
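To make the effect of the index mix-up concrete, here is a minimal standalone sketch. The function names are hypothetical and not part of widowx_simulator.py; they only isolate the tuple update quoted above.

```python
def buggy_update(pos, change):
    """Reproduces the quoted line: the y component wrongly starts from pos[0]."""
    return (pos[0] + change[0], pos[0] + change[1])


def fixed_update(pos, change):
    """Adds each change component to the matching position component."""
    return (pos[0] + change[0], pos[1] + change[1])


pos = (1.0, 2.0)
change = (0.5, -0.5)
print(buggy_update(pos, change))  # y is computed from x, so the old y is lost
print(fixed_update(pos, change))
```

With the buggy version the y coordinate is overwritten with a value derived from x on every step, so the old y value is discarded each time.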

Additional issues in the simulator:

Selecting a direction between -pi and pi is not continuous: both edges of the action space point in the same direction. It is better to select a 2D action (delta_x, delta_y) directly, multiplied by the same step_size scaling factor if necessary.
Zeroing out actions that exceed the bounds is also inconsistent; clipping them to the bounds is better practice.
The "first round bonus" added to the reward for the first 5000 steps does not seem to make sense. I see that its flag is turned off, but what is the idea behind it?
A penalty for repeating actions is not necessarily a good thing: under the current action space, reaching the target may require taking the same action many times. In addition, reaching the target can yield a different reward each time, depending on how many times the last action was repeated, which might hinder learning.
self.repetitions and self.last_action are never reset.
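A rough sketch of how the suggestions above could fit together, not the simulator's actual code. The class name, the [-1, 1] action range, the workspace bounds, and the default step_size are all assumptions made for illustration; only self.pos, self.repetitions, and self.last_action come from the issue.

```python
def clip(value, low, high):
    """Clamp value into the closed interval [low, high]."""
    return max(low, min(high, value))


class PointEnvSketch:
    """Hypothetical 2D point environment with a (delta_x, delta_y) action."""

    def __init__(self, low=-1.0, high=1.0, step_size=0.05):
        self.low = low              # assumed workspace lower bound (both axes)
        self.high = high            # assumed workspace upper bound (both axes)
        self.step_size = step_size  # scaling applied to each action component
        self.reset()

    def reset(self):
        # Per-episode state is cleared here, including the repetition tracking.
        self.pos = (0.0, 0.0)
        self.repetitions = 0
        self.last_action = None
        return self.pos

    def step(self, action):
        action = tuple(action)
        # Clip the action instead of zeroing it out when it exceeds the range.
        dx = clip(action[0], -1.0, 1.0) * self.step_size
        dy = clip(action[1], -1.0, 1.0) * self.step_size
        # Clip the resulting position to the workspace as well.
        self.pos = (
            clip(self.pos[0] + dx, self.low, self.high),
            clip(self.pos[1] + dy, self.low, self.high),
        )
        # Track repetitions only as bookkeeping; no penalty is applied here.
        if action == self.last_action:
            self.repetitions += 1
        else:
            self.repetitions = 0
        self.last_action = action
        return self.pos
```

Because the action is a displacement rather than an angle, nearby actions always produce nearby movements, and there is no wrap-around at the edges of the action space.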

Unrelated issues (these comments concern the current parameters in the config file, though some may be generally relevant):

Setting learning starts to 500K steps out of 1M training steps is far too many, especially since training only happens every 20 episodes (20*75 = 1500 steps), so there are very few training steps in total compared to the amount of data collected.
Your learning rate scheduler increases the learning rate instead of decreasing it: because the progress remaining parameter goes from 1 to 0, your initial learning rate is 0. Even if you invert it, there is no need for the learning rate to go to 0 at the end of the run.
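The two points above can be sketched as follows. This assumes the schedule is a callable that receives progress_remaining (going from 1 at the start to 0 at the end), as the issue describes. The function names, the example learning rates, and the 75 steps per episode (taken from the 20*75 arithmetic) are illustrative assumptions, not values read from the config file.

```python
def linear_schedule(initial_lr, final_lr):
    """Return a schedule that decays linearly from initial_lr to final_lr.

    The returned callable takes progress_remaining, which is assumed to go
    from 1.0 (start of training) to 0.0 (end of training).
    """
    def schedule(progress_remaining):
        # At progress_remaining = 1 this is initial_lr; at 0 it is final_lr,
        # so the learning rate decreases and never reaches zero.
        return final_lr + progress_remaining * (initial_lr - final_lr)
    return schedule


def training_rounds(total_steps, learning_starts, train_every_episodes, steps_per_episode):
    """Count how many times training is triggered after the warm-up period."""
    steps_between_rounds = train_every_episodes * steps_per_episode
    return (total_steps - learning_starts) // steps_between_rounds


schedule = linear_schedule(initial_lr=1e-3, final_lr=1e-5)  # example values only
print(schedule(1.0), schedule(0.5), schedule(0.0))
print(training_rounds(1_000_000, 500_000, 20, 75))
```

Under these assumptions, training is triggered only a few hundred times over the whole run, which illustrates why a smaller learning-starts value or more frequent training may be worth considering.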
