
widowxrl's Introduction

  • 👋 Hi, I'm @Eladkay
  • 👀 I'm interested in programming language theory, synthesis of software and compilation.
  • 🌱 I'm currently studying for my MSc. in Computer Science at Technion - IIT.
  • 💞️ I'm also the TA in Charge of Introduction to Systems Programming and Reverse Engineering.
  • 📫 You can reach me at elad (at) eladkay.com.

widowxrl's People

Contributors

eitanbloch, eladkay

widowxrl's Issues

Bugs and comments on WidowxSimulator

First, in the step method (line 45):

self.pos = (self.pos[0] + change[0], self.pos[0] + change[1])

The second component uses self.pos[0] instead of self.pos[1]. It should be:

self.pos = (self.pos[0] + change[0], self.pos[1] + change[1])
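
To make the effect of the bug concrete, here is a minimal sketch that compares the two updates on a standalone function. The function names are hypothetical and only the tuple arithmetic is taken from the line quoted above; the rest of the step method is not shown here.

```python
def apply_change_buggy(pos, change):
    """Update as quoted from line 45: y is wrongly derived from pos[0]."""
    return (pos[0] + change[0], pos[0] + change[1])


def apply_change(pos, change):
    """Corrected update: each coordinate is updated from its own axis."""
    return (pos[0] + change[0], pos[1] + change[1])


if __name__ == "__main__":
    start = (1.0, 5.0)
    delta = (0.5, -1.0)
    print("buggy:", apply_change_buggy(start, delta))
    print("fixed:", apply_change(start, delta))
```

With the buggy version the y coordinate is overwritten by a value based on x after every step, so the arm's position collapses toward the diagonal regardless of the action taken.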

Additional issues in the simulator:

  • Selecting a direction between -pi and pi is not continuous: both edges of the action space point in the same direction. It would be better to select a 2D action (delta_x, delta_y) in the first place (it can be multiplied by the same step_size scaling factor if necessary).
  • The design choice of zeroing out actions that exceed the bounds is also inconsistent; clipping is better practice.
  • The "first round bonus" added to the reward for the first 5000 steps does not seem to make sense. I see that its flag is turned off, but what is the idea behind it?
  • A penalty for repeating actions is not necessarily a good thing: under the current action space, it may be necessary to take the same action many times to reach the target. Additionally, the agent can reach the target and receive a different reward each time, depending on how many times the last action was repeated, which might hinder learning.
  • self.repetitions and self.last_action are never reset.
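
Several of the points above (a 2D action, clipping instead of zeroing, and resetting the repetition state) can be sketched together. This is a hypothetical toy environment, not the actual WidowxSimulator: the class name, the bounds, the start position and the [-1, 1] action range are all assumptions made for illustration. Only step_size, self.pos, self.repetitions and self.last_action are names taken from the issue.

```python
class PointEnvSketch:
    """Hypothetical 2D point environment illustrating the suggestions above."""

    def __init__(self, step_size=0.1, low=-1.0, high=1.0):
        self.step_size = step_size  # scaling factor applied to each action
        self.low = low              # assumed lower workspace bound (both axes)
        self.high = high            # assumed upper workspace bound (both axes)
        self.reset()

    def reset(self):
        # Clear all per-episode state, including the repetition tracking.
        self.pos = (0.0, 0.0)
        self.repetitions = 0
        self.last_action = None
        return self.pos

    @staticmethod
    def _clip(value, low, high):
        return max(low, min(high, value))

    def step(self, action):
        # The action is (delta_x, delta_y); clip each component to [-1, 1]
        # instead of zeroing it out, then scale by step_size.
        dx = self._clip(action[0], -1.0, 1.0) * self.step_size
        dy = self._clip(action[1], -1.0, 1.0) * self.step_size
        # Clip the resulting position to the workspace bounds.
        self.pos = (
            self._clip(self.pos[0] + dx, self.low, self.high),
            self._clip(self.pos[1] + dy, self.low, self.high),
        )
        return self.pos
```

With a (delta_x, delta_y) action, nearby actions always produce nearby displacements, which is not the case for an angle where -pi and pi are at opposite ends of the range but describe the same direction.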

Unrelated issues (comments on the current parameters in the config file, though some may be relevant more generally):

  • 500K learning-starts steps out of 1M training steps is far too many, especially since training only happens every 20 episodes (20*75=1500 steps), so there are very few training steps in total compared to the amount of data collected.
  • Your learning rate scheduler increases the learning rate instead of decreasing it: your initial learning rate is 0, since the progress remaining parameter goes from 1 to 0. Even if you invert it, there is no need for the learning rate to go to 0 at the end of the run.
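
Both points can be illustrated with a short sketch. It assumes the schedule is a function of a progress remaining value that goes from 1 (start of training) to 0 (end), as described above, and that each episode lasts 75 steps, as the 20*75 arithmetic implies. The function names and the example learning rates are hypothetical and are not taken from the config file.

```python
def make_linear_schedule(initial_lr, final_lr):
    """Return a schedule that decays linearly from initial_lr to final_lr.

    The returned function takes progress_remaining, which is assumed to
    go from 1.0 at the start of training to 0.0 at the end.
    """
    def schedule(progress_remaining):
        # At progress_remaining == 1 this gives initial_lr; at 0 it gives
        # final_lr, so the rate decreases but never reaches zero.
        return final_lr + (initial_lr - final_lr) * progress_remaining

    return schedule


def update_rounds(total_steps, learning_starts, episodes_per_update, steps_per_episode):
    """Count how many times training is triggered after learning starts."""
    steps_between_updates = episodes_per_update * steps_per_episode
    return (total_steps - learning_starts) // steps_between_updates


if __name__ == "__main__":
    schedule = make_linear_schedule(initial_lr=1e-3, final_lr=1e-4)
    for progress in (1.0, 0.5, 0.0):
        print(progress, schedule(progress))
    print(update_rounds(1_000_000, 500_000, 20, 75))
```

Under these assumptions, the parameters quoted above leave only a few hundred training rounds over the whole run, which is the imbalance the first bullet points out.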
