Deep Reinforcement Learning for Adapting Dynamical Systems

Description

In this work, we study ways of augmenting dynamical systems by combining them with deep reinforcement learning to improve their performance on contact-rich insertion tasks. We propose two approaches to adapt the dynamical system: SAC-GMM-Residual and SAC-GMM.
SAC-GMM-Residual learns a residual action on top of the dynamical system's action while exploring the environment safely throughout the learning process.
In contrast, SAC-GMM adapts the dynamical system's own parameter space, so that a slightly modified version of the system can be used in the face of noisy observations.

SAC GMM Example

Installation

The package was tested with Python 3.7 on Ubuntu 20.04 LTS.
Install it with the package manager pip:

# At root of project
$ pip install -e .

Usage

Peg insertion environment

The peg insertion environment is a PyBullet environment with a Sawyer robot. The goal of the task is to insert a peg into a box; the environment follows the interface of OpenAI Gym environments.

import hydra

from env.sawyer_peg_env import custom_sawyer_peg_env, register_sawyer_env

@hydra.main(config_path="../config", config_name="sac_gmm_config")
def main(cfg):
    env = custom_sawyer_peg_env(cfg.env)
    for episode in range(100):
        observation = env.reset()
        episode_length, episode_reward = 0, 0
        for step in range(100):
            # Sample a random action and step the environment (Gym-style API)
            action = env.action_space.sample()
            observation, reward, done, info = env.step(action)
            episode_length += 1
            episode_reward += reward
            if done:
                break

if __name__ == "__main__":
    register_sawyer_env()
    main()

The environment observation is a dictionary whose entries are selected in the Hydra env configuration file.
All the possible keys are listed below; a short reading example follows the list.

        observation_space["position"]
        observation_space["gripper_width"]
        observation_space["tactile_sensor"]
        observation_space["force"]
  • "position" contains the end effector cartesian pose
  • "gripper_width" defines the end effector distance between each finger
  • "tactile_sensor" contains the image measurement of each finger
  • "force" contains the force readings in each finger

Training process

To reproduce the results, we provide the following training roadmap for each proposed model; an end-to-end example follows the list.

  • SAC
    • Train SAC
  • GMM
    • Saving demonstrations > Train GMM
  • SAC GMM Residual
    • Saving demonstrations > Train GMM > Train SAC GMM Residual
  • SAC GMM
    • Saving demonstrations > Train GMM > Train SAC GMM

Each substep is explained in detail below.

Saving demonstrations

The demonstrations used to learn the GMM are recorded by a PD controller that follows predefined waypoints until the peg is inserted. To save demonstrations, run save_demonstrations.py; its configuration can be adjusted in the corresponding Hydra file.
The output is a folder with several demonstrations, each stored as a JSON file containing the requested observation space. You can also convert the JSON files to txt files that contain the pose and force at each timestep, using the helper script json_dem_to_txt.py.

$ # Saving json file demonstrations
$ python scripts/save_demonstrations.py number_demonstrations=20 output_dir="demonstrations/"
$ # Generate txt demonstrations
$ python scripts/json_dem_to_txt.py --input_dir="demonstrations/" --output_dir="demonstrations_txt/"
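
Each resulting file can then be inspected directly, for example (a sketch; the file name and the exact field layout depend on the requested observation space):

import json
from pathlib import Path

# Load one recorded demonstration (hypothetical file name)
demo = json.loads(Path("demonstrations/demo_000.json").read_text())
print(demo.keys())  # the keys mirror the requested observation space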

GMM

Gaussian Mixture Models (GMMs) are used to learn the dynamical system from demonstrations.
All the files related to this task are inside the GMM directory. The training of the GMM is based on ds-opt, and the model is trained in MATLAB.
There are useful scripts to train and test the GMM inside this directory. Furthermore, the script gmm.py contains several utilities for using the GMM as a dynamical system, e.g., predicting a velocity from an observation or modifying the model through a change in its parameters. Additionally, it supports loading GMMs trained with either MATLAB or Python.
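
As an illustrative sketch of that workflow (the import path, class name, and method names here are assumptions, not necessarily the actual gmm.py API):

import numpy as np

from GMM.gmm import GMM  # hypothetical import path and class name

# Load a model trained in either MATLAB or Python
model = GMM.load("models/GMM/gmm_peg_pose_3.npy")

# Use the GMM as a dynamical system: predict a velocity for the current pose
pose = np.zeros(3)
velocity = model.predict_velocity(pose)

# Adapt the model by perturbing its parameters (priors, means, covariances)
delta = 0.05 * np.random.randn(model.num_parameters)  # hypothetical attribute
model.update_parameters(delta)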

$ # Train
$ python scripts/gmm_train.py model_name="models/GMM/gmm_peg"
$ # Test
$ python scripts/gmm_test.py model_names=["models/GMM/gmm_peg_pose_3.npy"] show_gui=True

As further reference, please see gmm_config.yaml, which contains the default configuration parameters for both training and testing.

SAC

Soft Actor-Critic (SAC) is a state-of-the-art reinforcement learning algorithm. The agent implementation can be found inside the Soft_Actor_Critic directory; it also includes hyperparameter optimization routines and is adapted to interact directly with the provided environment.
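
A minimal interaction sketch (the import path, class name, and methods of the agent are assumptions about the Soft_Actor_Critic implementation, not its confirmed API):

# Hypothetical usage of the SAC agent with the peg insertion environment;
# `env` is assumed to be created via custom_sawyer_peg_env as shown above.
from sac.agent import SACAgent  # hypothetical import path and class name

agent = SACAgent(env)                   # hypothetical constructor
observation = env.reset()
action = agent.get_action(observation)  # policy action for this observation
observation, reward, done, info = env.step(action)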

$ # Test
$ python scripts/sac_test.py
$ # Train
$ python scripts/sac_train.py

SAC GMM Residual

In this implementation, SAC predicts a residual action on top of the GMM model's prediction, so that it can use high-dimensional observations to adjust the final velocity predicted by the model. The main implementation code can be found in sac_gmm_residual_agent.py.
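
Conceptually, each step composes the two predictions (a sketch with hypothetical names; see sac_gmm_residual_agent.py for the actual implementation):

def residual_step(env, observation, gmm, sac_agent):
    # Base action from the dynamical system (GMM velocity prediction)
    base_action = gmm.predict_velocity(observation["position"])
    # Residual correction predicted by SAC from high-dimensional observations
    residual = sac_agent.get_action(observation)
    # The executed action is the sum of both predictions
    return env.step(base_action + residual)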

$ # Test
$ python scripts/sac_gmm_residual_test.py
$ # Train
$ python scripts/sac_gmm_residual_train.py

SAC GMM

In this code, SAC predicts a change in the parameters of the initial GMM configuration. The modified GMM is then used to interact with the environment for a fixed window of steps, which lets the SAC agent act from a trajectory-level perspective. The implementation can be found in sac_gmm_agent.py.
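
A conceptual sketch of this loop (hypothetical names and window size; the real logic lives in sac_gmm_agent.py):

def sac_gmm_episode(env, gmm, sac_agent, window_size=32):
    observation, done = env.reset(), False
    while not done:
        # SAC predicts a change in the GMM parameters from the observation
        delta = sac_agent.get_action(observation)
        adapted_gmm = gmm.copy()  # keep the initial configuration intact
        adapted_gmm.update_parameters(delta)
        # Roll out the adapted GMM for a fixed window of environment steps
        for _ in range(window_size):
            action = adapted_gmm.predict_velocity(observation["position"])
            observation, reward, done, info = env.step(action)
            if done:
                break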

$ # Test
$ python scripts/sac_gmm_test.py
$ # Train
$ python scripts/sac_gmm_train.py

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
If you have any questions, please contact me through my email [email protected].

License

MIT
