aravindsiv / irl-lab

An implementation of popular Inverse Reinforcement Learning algorithms for various tasks.

License: MIT License

Python 20.82% Jupyter Notebook 79.18%

irl-lab's People

Contributors

aravindsiv

irl-lab's Issues

An IndexError occurred when I ran the sample usage code.

Hello irl-lab : )
I ran into the IndexError below. I tried casting the indices to integers, but it didn't help.
Is there anything I missed?
Thanks a lot : )

IndexError                                Traceback (most recent call last)
<ipython-input-26-533cf5bf2336> in <module>()
      7 # Obtain the optimal policy for the environment to generate expert demonstrations
      8 pi = PolicyIteration(gw)
----> 9 optimal_policy = pi.policy_iteration(100)
     10 
     11 expert_demos = gw.generate_trajectory(optimal_policy)

C:\Documents\irl-lab-master\algo\PolicyIteration.py in policy_iteration(self, num_iters)
     16 
     17     def policy_iteration(self,num_iters=10):
---> 18         print("pi_")
     19         for i in range(num_iters):
     20             self.policy_evaluation()

C:\Documents\irl-lab-master\algo\PolicyIteration.py in policy_evaluation(self, num_iters, gamma)
     11             transition_probs = np.zeros((self.env.num_states,self.env.num_states))
     12             print(self.env.num_states)
---> 13             for j in range(self.env.num_states):
     14                 transition_probs[int(j)] = self.env.get_transition_probabilities(j,self.policy[int(j)])
     15             self.values = self.env.get_rewards() + gamma*np.dot(transition_probs,self.values)

C:\Documents\irl-lab-master\env\GridWorld.py in get_transition_probabilities(self, state, action)
     68             transition_probs[state_coords1,int(min(self.grid_size-1,state_coords2+1))] += 0.1
     69         elif action == "right":
---> 70             transition_probs[int(max(0,state_coords1-1)),state_coords2] += 0.1
     71             transition_probs[state_coords1,int(max(0,state_coords2-1))] += 0.1
     72             transition_probs[int(min(self.grid_size-1,state_coords1+1)),state_coords2] += 0.1

IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
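
One thing visible in the traceback: on line 70 of GridWorld.py the first index is wrapped in `int(...)`, but the second index, `state_coords2`, is passed as-is, so a float there would still raise the same error. A possible cause, assuming the coordinates are computed from the flat state index with `/` (which returns a float under Python 3), is sketched below. The names `grid_size`, `state`, `row` and `col` and the row-major layout are assumptions for illustration, not taken from the repository.

```python
import numpy as np

grid_size = 5   # hypothetical grid size
state = 7       # hypothetical flat state index

grid = np.zeros((grid_size, grid_size))

# Under Python 3, `/` returns a float, and NumPy rejects floats as indices.
float_row = state / grid_size
try:
    grid[float_row, 0] += 0.1
    raised = False
except IndexError:
    raised = True

# divmod keeps both coordinates as Python ints (assumes a row-major layout).
row, col = divmod(state, grid_size)
grid[max(0, row - 1), col] += 0.1
```

Converting the coordinates to integers once, where they are computed, avoids having to cast every individual index expression.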

Reward not recovered

Hi Aravind,

I used your default settings. It looks like the objective was maximized, but the reward was not recovered.

[Screenshot: Screen Shot 2020-06-10 at 10 37 00 PM]
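
One thing that may be worth checking here: inverse reinforcement learning generally identifies a reward only up to transformations such as positive scaling and constant shifts, so the raw recovered values can look different from the true ones even when they agree in shape. A minimal sketch of comparing the two after rescaling both to [0, 1] follows. The reward vectors are made-up placeholders, not output from this repository, and `normalize` is a hypothetical helper.

```python
import numpy as np

def normalize(reward):
    """Rescale a reward vector to the range [0, 1]."""
    reward = np.asarray(reward, dtype=float)
    span = reward.max() - reward.min()
    if span == 0:
        # A constant reward carries no ranking information.
        return np.zeros_like(reward)
    return (reward - reward.min()) / span

# Hypothetical values for illustration only.
true_reward = np.array([0.0, 0.0, 0.0, 1.0])
recovered_reward = np.array([-3.0, -3.0, -3.0, 5.0])

# Largest per-state difference after removing scale and shift.
max_gap = np.max(np.abs(normalize(true_reward) - normalize(recovered_reward)))
```

If the gap stays large after normalization, comparing the policies induced by the two rewards is another option, since different rewards can still lead to the same behavior.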
