Comments (8)
OK, I get your point and I agree with you. But what about adding a loss_fn method to the abstract base class for policies, one that does nothing by default but can be overridden by the user? I really don't like the idea of having to override 'learn' itself, since that is a major source of errors.
In this case, it is not a custom loss strictly speaking, but rather an additional component of the original loss function (regularization) that may depend on the actor. It would only amount to an extra function call before calling backward. I don't know whether doing so is usual or not.
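The hook described above could look roughly like the following. This is a minimal sketch in plain Python, not tianshou's actual API: only the names loss_fn and learn come from this thread, while SketchPolicy, base_loss, MeanSquarePolicy, RegularizedPolicy, and the weights attribute are hypothetical and exist only for illustration. Plain floats stand in for tensors so that the example runs without PyTorch.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class SketchPolicy(ABC):
    """Hypothetical base class, NOT tianshou's BasePolicy."""

    def __init__(self, weights: List[float]) -> None:
        # Stand-in for the actor's parameters (assumption for this sketch).
        self.weights = weights

    @abstractmethod
    def base_loss(self, batch: List[float]) -> float:
        """The algorithm's original loss (hypothetical name)."""

    def loss_fn(self, batch: List[float]) -> float:
        """Extra loss term; contributes nothing unless overridden."""
        return 0.0

    def learn(self, batch: List[float]) -> Dict[str, float]:
        # The hook is one extra call added to the original loss.
        loss = self.base_loss(batch) + self.loss_fn(batch)
        # In a PyTorch-based policy, loss.backward() and the optimizer
        # step would follow here; they are omitted in this sketch.
        return {"loss": loss}


class MeanSquarePolicy(SketchPolicy):
    """Toy algorithm whose loss is the mean square of the batch."""

    def base_loss(self, batch: List[float]) -> float:
        return sum(x * x for x in batch) / len(batch)


class RegularizedPolicy(MeanSquarePolicy):
    """User subclass: adds L2 regularization without touching learn."""

    def loss_fn(self, batch: List[float]) -> float:
        return 0.5 * sum(w * w for w in self.weights)


plain = MeanSquarePolicy([1.0, 2.0]).learn([1.0, 3.0])
regularized = RegularizedPolicy([1.0, 2.0]).learn([1.0, 3.0])
```

With this layout the user overrides only loss_fn, so the update logic in learn stays in one place.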
from tianshou.
That's a good question. Currently, you can either inherit the policy class (as you mentioned) or change the original framework's code to meet your expectations.
It can be discussed further. Some existing frameworks (like RLlib) have modularized the loss function. But in my opinion, this could be inconvenient for further development: since the loss function is highly customizable, abstracting it would double the code complexity.
from tianshou.
@Trinkle23897 Up !
from tianshou.
@Trinkle23897 Up !
I have no time after #106 before this Friday... Many things to do.
from tianshou.
No problem! I can do it! But what do you think about the idea?
from tianshou.
I think that adding loss_fn is okay, but what's its input?
from tianshou.
@duburcqa It's a great idea to make it easier to use a customized loss. I wonder if you have made any progress on that. Thanks!
from tianshou.
The loss is an integral part of the algorithm, so maybe inheriting and overriding is better than allowing users to pass custom losses. It's a central design question. I don't see it being necessary for the 1.0.0 release, but I would keep the issue open.
from tianshou.