
rlaspa's Introduction

Hello there 👋 I hope you like what you find here 🤓

class Researcher(CuriousHuman):

    def __init__(self):
        self.name = "Tonio Weidler"
        self.role = "PhD Candidate"
        self.affiliation = "Maastricht University"
        self.residence = "The Netherlands"

        self.code = [
            "Python",
            "JavaScript",
            "Java",
            "PHP"
        ]

        self.research_field = "Neuroscience"
        self.research_topics = [
            "Sensorimotor Control",
            "Human Dexterity",
            "Goal-Driven Models",
            "Deep Learning"
        ] 

        self.language_spoken = ["de_DE", "en_US"]

    def greet(self, name):
        print(f"Hello there, {name}! I hope you like what you find here :)")

rlaspa's People

Contributors

adrigrillo, alessandroscoppio, dannigt, hansbambel, weidler

rlaspa's Issues

Exploration strategy

I have changed the exploration strategy to something similar to what we talked about. Now, when you instantiate an agent, the following parameters can be configured:

  • init_eps: Initial epsilon. Default: 1.0.
  • min_eps: Minimal epsilon. Default: 0.01.
  • eps_decay: Number of steps for epsilon to converge to the minimal value, counted from when the memory starts being used. Default: 500.
  • per_init_eps_memory: Percentage of the initial epsilon that remains when the memory starts being used. Default: 0.8.

So, for now, while the agent is not yet using the memory, a linear decay is applied that goes from init_eps down to init_eps * per_init_eps_memory, i.e. from 1.0 to 0.8 in the default case.
Then, once the memory is used, an exponential decay takes over. It goes from init_eps * per_init_eps_memory down to min_eps, with the step at which the memory starts being used as its reference point.
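
For illustration, here is a minimal sketch of this two-phase schedule (the function name and the memory_start_step argument are made up here, not the actual code):

import math

def current_epsilon(step, memory_start_step, init_eps=1.0, min_eps=0.01,
                    eps_decay=500, per_init_eps_memory=0.8):
    """Illustrative two-phase epsilon schedule: linear decay before the
    memory is used, exponential decay afterwards."""
    eps_at_memory = init_eps * per_init_eps_memory
    if step < memory_start_step:
        # Linear decay from init_eps down to init_eps * per_init_eps_memory.
        fraction = step / max(memory_start_step, 1)
        return init_eps - fraction * (init_eps - eps_at_memory)
    # Exponential decay from eps_at_memory towards min_eps, using the step at
    # which the memory starts being used as the reference point.
    steps_since_memory = step - memory_start_step
    return min_eps + (eps_at_memory - min_eps) * math.exp(-steps_since_memory / eps_decay)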

I have also looked at other exploration methods, like Boltzmann exploration, but I do not think it would be an improvement for our simple tasks, and it would imply larger changes. However, it could be done quickly if we want it.

Make tasks implement the gym interface

To achieve 100% compatibility between tasks and agents in the framework, we need to make our own tasks adhere to the requirements of the gym package. E.g., our tasks should be visualized via render().
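
A rough sketch of what adhering to the gym interface could look like (class name, spaces, and shapes are made up here):

import gym
import numpy as np
from gym import spaces

class EvasionTask(gym.Env):
    """Hypothetical task wrapped in the gym interface."""

    def __init__(self, width=10, height=10):
        self.observation_space = spaces.Box(0.0, 1.0, shape=(height, width), dtype=np.float32)
        self.action_space = spaces.Discrete(3)  # left, stay, right
        self.state = np.zeros((height, width), dtype=np.float32)

    def reset(self):
        self.state[:] = 0.0
        return self.state

    def step(self, action):
        # Task-specific transition logic would go here.
        reward, done, info = 0.0, False, {}
        return self.state, reward, done, info

    def render(self, mode="human"):
        # Visualize the current state, as required by the gym API.
        print(self.state)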

Make tensorboard logs get their own directories

Currently it gets messy when there are multiple logs in the log folder, and you can't rename them. It would be nice if each log created its own folder, named according to some settings of the experiment.
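
A minimal sketch, assuming we keep logging with PyTorch's SummaryWriter (the helper and naming scheme are hypothetical):

import datetime
from torch.utils.tensorboard import SummaryWriter

def make_writer(task_name, agent_name, log_root="logs"):
    # One subdirectory per run, named after a few experiment settings plus a
    # timestamp, so runs never overwrite or mix with each other.
    timestamp = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    return SummaryWriter(log_dir=f"{log_root}/{task_name}_{agent_name}_{timestamp}")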

Create new Tasks

Create different tasks:

  • Race task (Vertical Scroller)

  • Evasive Task (Horizontal Scroller)

  • Evasion with bigger walls

  • Evasion with a "tunnel"

Double Check the Workings of all Networks

I noticed that in some cases the use of activation functions appears rather arbitrary. For example, in the PixelEncoders we had a sigmoid in the representation but not in the heads, even though the heads should definitely have one.
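
For reference, a sketch of how the activations could be made explicit, with the sigmoid sitting on the head (layer sizes and names are made up, not our actual PixelEncoder):

import torch.nn as nn

class PixelEncoderSketch(nn.Module):
    """Hypothetical encoder illustrating explicit activation placement."""

    def __init__(self, in_channels=1, latent_dim=32):
        super().__init__()
        self.representation = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=3, stride=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        # The head ends in an explicit sigmoid, rather than leaving the choice
        # of activation implicit or dropping it by accident.
        self.head = nn.Sequential(
            nn.Linear(16, latent_dim),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return self.head(self.representation(x))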

Check if input is as expected and throw meaningful errors.

Especially when using batches in the networks, it is easy to make mistakes when feeding input. We should therefore add checks at crucial places that test whether the given format is what the following code expects. E.g., when feeding a network one can pass a one-dimensional input; this should be prohibited if the network is meant to deal with batches.
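
A minimal sketch of such a check (the helper name and message are hypothetical):

import torch

def assert_batched(tensor: torch.Tensor, expected_dims: int, name: str = "input"):
    """Fail early with a meaningful error instead of letting a shape mismatch
    surface somewhere deep inside the network."""
    if tensor.dim() != expected_dims:
        raise ValueError(
            f"{name} must have {expected_dims} dimensions (batch first), but got "
            f"shape {tuple(tensor.shape)}. Did you forget the batch dimension, "
            f"e.g. tensor.unsqueeze(0)?"
        )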

Tensorize everything

We are already tensorizing the batches and everything, but there are still functions like cast_float_tensor().
Furthermore, we have to make sure that the methods receive a tensor (see #23). A sketch of what such a signature could look like follows the checklist below.

  • Remove usage of cast_float_tensor()

  • Type hinting for tensors in learners and representation

  • Type hinting in pixelencoders (needed?)

  • Tensorize Pathing tasks
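
As a sketch, a type-hinted learner interface could look like this (class and method names are hypothetical):

import torch

class QLearnerSketch:

    def learn(self, state: torch.Tensor, action: torch.Tensor, reward: torch.Tensor,
              next_state: torch.Tensor, done: torch.Tensor) -> float:
        # With tensor type hints on the interface, callers convert their data up
        # front (e.g. via torch.as_tensor), and helpers like cast_float_tensor()
        # become unnecessary inside the learner.
        raise NotImplementedError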

Document Code, include type hinting!

It's a bit difficult to always see what is returned and what format of input is required. To make things foolproof, we should include docstrings and type hinting.
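
For example, a docstring plus type hints in this style answers both questions at the call site (the class and method here are just placeholders):

import torch

class RepresentationLearnerSketch:

    def encode(self, observation: torch.Tensor) -> torch.Tensor:
        """Encode an observation into its latent representation.

        Args:
            observation: Float tensor of shape (batch_size, channels, height, width).

        Returns:
            Latent tensor of shape (batch_size, latent_dim).
        """
        raise NotImplementedError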

Modular Framework for different RL Approaches

Work on a modular framework for the history and parallel approaches, s.t. we can use and extend different approaches without having to reimplement repetitive parts again and again. Early work for Q-tables is already done. I will work on extending that over the following days.

Modular means having wrapper architectures to which we can pass modules, e.g.:

  • q learner
  • representation learner
  • task environment

And the architecture combines them (history or parallel).
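
A rough sketch of such a wrapper, with hypothetical module interfaces (encode/choose_action/learn are placeholder names):

class ParallelAgentSketch:
    """Hypothetical wrapper: the agent is assembled from interchangeable
    modules and only defines how they interact."""

    def __init__(self, representation_learner, policy_learner, env):
        self.representation_learner = representation_learner
        self.policy_learner = policy_learner
        self.env = env

    def train(self, episodes):
        for _ in range(episodes):
            state = self.env.reset()
            done = False
            while not done:
                latent = self.representation_learner.encode(state)
                action = self.policy_learner.choose_action(latent)
                next_state, reward, done, _ = self.env.step(action)
                # "Parallel" here means representation and policy are updated
                # side by side on the same transition.
                self.representation_learner.learn(state, next_state)
                next_latent = self.representation_learner.encode(next_state)
                self.policy_learner.learn(latent, action, reward, next_latent, done)
                state = next_state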
