
question about reward (maac, CLOSED)

shariqiqbal2810 avatar shariqiqbal2810 commented on June 29, 2024
question about reward

from maac.

Comments (10)

shariqiqbal2810 avatar shariqiqbal2810 commented on June 29, 2024

The code averages rewards over timesteps (25 steps in multi_speaker_listener), and the paper does not. So you need to multiply the rewards in the code by the number of timesteps to compare them with the results in the paper (e.g. an averaged reward of 6 corresponds to 6 * 25 = 150).
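A minimal sketch of the conversion described above, assuming the logged value is a plain mean of per-step rewards over one 25-step episode. The function names and the reward values are hypothetical and chosen only for illustration; they are not taken from the maac repository.

```python
from statistics import fmean

# Timesteps per episode in multi_speaker_listener, as stated in the comment above.
EPISODE_LENGTH = 25


def mean_to_episode_return(mean_step_reward, episode_length=EPISODE_LENGTH):
    """Convert a per-step average reward into a summed episode return."""
    return mean_step_reward * episode_length


def episode_return_to_mean(episode_return, episode_length=EPISODE_LENGTH):
    """Convert a summed episode return back into a per-step average."""
    return episode_return / episode_length


# Made-up per-step rewards for a single episode (illustrative only).
step_rewards = [4.0] * 10 + [8.0] * 15

mean_reward = fmean(step_rewards)                     # what an averaging logger would report
summed_reward = mean_to_episode_return(mean_reward)   # what a summing convention would report

print(f"mean per step: {mean_reward}, summed over episode: {summed_reward}")
```

The two numbers describe the same episode; they differ only by the factor `EPISODE_LENGTH`, so a comparison between logs and published figures is only meaningful once both use the same convention.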

If your runs are reaching that level of rewards (around 6), then the listeners should be consistently reaching their targets. How are you checking this?


ShuangLI59 avatar ShuangLI59 commented on June 29, 2024

Thanks for answering. Yes, I visualize the rendered image after training. Does the PyTorch/OpenAI baselines/OpenAI Gym version influence the performance?


shariqiqbal2810 avatar shariqiqbal2810 commented on June 29, 2024

The fact that your runs are achieving that level of rewards indicates that they are training properly. Without more information I can't be sure what's wrong. Are you loading the parameters of the trained model before visualization? Can you share some examples of what the rendered images look like? Also, it would be useful to see the code you're using to visualize the policies.


ShuangLI59 avatar ShuangLI59 commented on June 29, 2024

The visualization code is similar to https://github.com/shariqiqbal2810/maddpg-pytorch/blob/master/evaluate.py. The generated results look like this:
[image: epi0]


shariqiqbal2810 avatar shariqiqbal2810 commented on June 29, 2024

You should check whether the rollouts in evaluate.py are leading to the same amount of rewards that you see at the end of training. It's pretty clear that that is not the case here, which indicates there is a problem with how you are loading the parameters or something else along those lines.
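One way to run the check suggested above is to record the per-step rewards during evaluation rollouts and compare their average with the value logged at the end of training. The sketch below assumes the rewards have already been collected into plain Python lists; the helper names and the 20% tolerance are hypothetical, not part of evaluate.py or the maac code.

```python
from statistics import fmean


def mean_step_reward(episodes):
    """Average reward per timestep, pooled over all recorded episodes.

    `episodes` is a list of episodes, each a list of per-step rewards.
    """
    all_rewards = [reward for episode in episodes for reward in episode]
    return fmean(all_rewards)


def matches_training(eval_episodes, training_mean_reward, rel_tol=0.2):
    """Return (ok, eval_mean), where ok is True if the evaluation reward is
    within rel_tol (an arbitrary, assumed tolerance) of the training value."""
    eval_mean = mean_step_reward(eval_episodes)
    gap = abs(eval_mean - training_mean_reward)
    return gap <= rel_tol * abs(training_mean_reward), eval_mean


# Made-up numbers: training logged about 6 per step, evaluation is far lower.
good_eval = [[6.0] * 25, [5.0] * 25]
bad_eval = [[0.5] * 25]

print(matches_training(good_eval, training_mean_reward=6.0))
print(matches_training(bad_eval, training_mean_reward=6.0))
```

A large gap, as in the second call, points at the evaluation setup (for example, parameters not being loaded) rather than at training itself.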


ShuangLI59 avatar ShuangLI59 commented on June 29, 2024

test.zip
This is the code I used to visualize.


shariqiqbal2810 avatar shariqiqbal2810 commented on June 29, 2024

Sorry, I don't see anything that stands out as problematic in that code. Since you are getting good results during training, I would recommend trying to match the code within the training procedure as closely as possible and figuring out where the difference is.


ShuangLI59 avatar ShuangLI59 commented on June 29, 2024

I see, so this is different from your testing results, right? Maybe there are some bugs in my code.


shariqiqbal2810 avatar shariqiqbal2810 commented on June 29, 2024

Yes, I was able to visualize successful trials where the listeners reach their targets, so I'm not exactly sure what's going wrong here. Good luck! I will close this issue for now, but feel free to comment if you have any other questions.


ShuangLI59 avatar ShuangLI59 commented on June 29, 2024

Thanks a lot for your help!

