How can I use a self-defined state in tianshou? (tianshou issue, closed)

thu-ml commented on May 5, 2024

How can I use a self-defined state in tianshou?

from tianshou.

Comments (8)

Trinkle23897 commented on May 5, 2024

Btw, the state stored in the buffer may be a shallow copy, i.e. a reference to the same object your env keeps mutating. To make sure each graph stored in the buffer is distinct, return a deep copy of your state from your env:

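import copy

# Methods of your environment class
def reset(self):
    return copy.deepcopy(self.graph)

def step(self, action):
    ...  # update self.graph and compute reward and done
    return copy.deepcopy(self.graph), reward, done, {}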
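
To see why the deep copy matters, here is a minimal sketch that uses only the standard library. ToyEnv, its adjacency-dict graph, and the list standing in for the replay buffer are all hypothetical and are not tianshou or networkx code; they only illustrate the aliasing problem.

```python
import copy


class ToyEnv:
    """Hypothetical stand-in env whose state is a mutable adjacency dict."""

    def __init__(self, deep):
        self.graph = {0: set()}
        self.deep = deep

    def step(self, node):
        # Mutate the internal graph in place: connect a new node to node 0.
        self.graph[0].add(node)
        self.graph[node] = {0}
        # Return either a snapshot or the live object itself.
        return copy.deepcopy(self.graph) if self.deep else self.graph


def collect(env, steps):
    # A plain list plays the role of the replay buffer in this sketch.
    return [env.step(node) for node in range(1, steps + 1)]


shared = collect(ToyEnv(deep=False), 3)
distinct = collect(ToyEnv(deep=True), 3)

# Without deepcopy every stored entry is the same object,
# so all of them show the final graph.
print([len(g) for g in shared])    # [4, 4, 4]
print([len(g) for g in distinct])  # [2, 3, 4]
```

With deep=False the three stored states are one and the same dict, so the history of the episode is lost; with deep=True each entry is an independent snapshot.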

Trinkle23897 commented on May 5, 2024

Do you mean https://github.com/networkx/networkx?

Trinkle23897 commented on May 5, 2024

You could possibly hack the code. If your state is

{'graph': nx.Graph()}

then change this line

tianshou/tianshou/data/buffer.py, line 156 at commit 1fce527:

if name == 'info':

to

if name in ['info', 'obs', 'obs_next']:

This lets the buffer store dicts as elements of an np.ndarray. On the network side, read the graph back out of the state:

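from torch import nn

class NN(nn.Module):
    def __init__(self, *args, **kwargs):  # your constructor arguments
        super().__init__()
        ...

    def forward(self, state, **kwargs):
        graph = state['graph']
        ...  # other operations on the graph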
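
As a rough illustration of what "dict storage in np.ndarray" means, the sketch below fills a NumPy array of dtype object with dicts and reads a batch back. It is not the tianshou buffer code; the names maxsize and obs are assumptions, and a nested dict stands in for {'graph': nx.Graph()} so that the example needs only NumPy.

```python
import numpy as np

maxsize = 4
# Object-dtype array: each slot holds a reference to an arbitrary Python object.
obs = np.empty(maxsize, dtype=object)

for i in range(maxsize):
    # A plain dict stands in for {'graph': nx.Graph()}.
    obs[i] = {"graph": {"nodes": list(range(i + 1))}}

# Indexing with a list of positions returns another object array of dicts.
batch = obs[[0, 2]]
sizes = [len(item["graph"]["nodes"]) for item in batch]
print(obs.dtype, sizes)  # object [1, 3]
```

Because the array only holds references, the elements are not copied on insertion, which is why returning a deep copy from the env still matters.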

mrbeann commented on May 5, 2024

great idea! I'll try it.

Trinkle23897 commented on May 5, 2024

I’ll add a commit tomorrow so that you can use an nx state without hacking the codebase.

Trinkle23897 commented on May 5, 2024

@mrbeann You can try it with the current master codebase. The state can be either a dict, {'g': nx.Graph()}, or a graph directly, nx.Graph().
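
If your own code needs to accept both forms, one possible approach is a small helper that unwraps the dict when present. This is only a sketch: get_graph and the Graph class are hypothetical names (Graph stands in for nx.Graph so the example has no dependencies), and the key 'g' is taken from the comment above.

```python
class Graph:
    """Hypothetical stand-in for nx.Graph."""

    def __init__(self, edges=()):
        self.edges = list(edges)


def get_graph(state, key="g"):
    # Accept either {'g': graph} or the graph object itself.
    return state[key] if isinstance(state, dict) else state


g = Graph([(0, 1)])
assert get_graph({"g": g}) is g
assert get_graph(g) is g
```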

mrbeann commented on May 5, 2024

Yeah, it works now. Thanks!

mrbeann commented on May 5, 2024

Thanks for the reminder.
