Comments (5)
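Got it. I hope your project goes well.
One more question for the author: besides here and Zhihu, is there any other discussion forum for ElegantRL or RL in general? I have recently been experimenting with reinforcement learning for quantitative trading, and the test results basically fail to converge. I now understand very well what you meant by the agent "giving up on itself". I suspect it is punishing me for trying to apply reinforcement learning to a non-Markov decision process, but I would still like to find someone to talk it over with.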
from elegantrl.
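Haha, maybe that's because there isn't much to discuss. It feels like many people are just dabbling on a whim, force-fitting RL onto quantitative trading. It's like throwing monkey after monkey into the alchemy furnace and hoping a Great Sage Equal to Heaven comes out; that is unlikely to give good results. I have some understanding of trading and have found some correlations in price movements across certain markets, but the patterns are too complex for a human to master. I want to see whether an agent can surpass the human brain at grasping them, so this setup seems closer to a game. I haven't come to the wrong place, have I?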
from elegantrl.
OK, thanks for your reply. I will add more examples. Everything here uses OpenAI's standard gym environments.
My target is Ray RLlib, the reinforcement learning library from Berkeley's RISELab, because I personally find it easier to use than stable-baselines.
In fact, I have already added more examples to the 'BetaWarning' folder (along with some new features: multi-agent, multi-threading, ...).
Once I have made sure there are no bugs in that code, I will move it out of 'BetaWarning'.
(My earlier updates were too aggressive and a lot of people complained to me in private messages, so I want to be more careful now.)
For example:
args.env_name = "Pendulum-v0" # It is easy to reach target score -200.0 (-100 is harder)
args.init_for_training()
train_offline_policy(**vars(args))
args.env_name = "LunarLanderContinuous-v2"
args.max_total_step = int(1e5 * 4)
args.init_for_training()
train_offline_policy(**vars(args))
args.env_name = "BipedalWalker-v3"
args.max_total_step = int(1e5 * 6)
args.init_for_training()
train_offline_policy(**vars(args))
# args.env_name = "BipedalWalkerHardcore-v3"
# args.net_dim = int(2 ** 8) # int(2 ** 8.5) #
# args.max_memo = int(2 ** 20)
# args.batch_size = int(2 ** 9)
# args.max_epoch = 2 ** 14
# args.reward_scale = int(2 ** 6.5)
# args.is_remove = None
# args.init_for_training()
# train_offline_policy(**vars(args))
#
# import pybullet_envs # for python-bullet-gym
# dir(pybullet_envs)
# args.env_name = "MinitaurBulletEnv-v0"
# args.max_epoch = 2 ** 13
# args.max_memo = 2 ** 20
# args.net_dim = 2 ** 9
# args.max_step = 2 ** 12
# args.batch_size = 2 ** 8
# args.reward_scale = 2 ** 3
# args.is_remove = True
# args.eva_size = 2 ** 5 # for Recorder
# args.show_gap = 2 ** 8 # for Recorder
# args.init_for_training()
# train_offline_policy(**vars(args))
#
# import pybullet_envs # for python-bullet-gym
# dir(pybullet_envs)
# args.env_name = "AntBulletEnv-v0"
# args.max_epoch = 2 ** 13
# args.max_memo = 2 ** 20
# args.max_step = 2 ** 10
# args.net_dim = 2 ** 8
# args.batch_size = 2 ** 8
# args.reward_scale = 2 ** -3
# args.is_remove = True
# args.eva_size = 2 ** 5 # for Recorder
# args.show_gap = 2 ** 8 # for Recorder
# args.init_for_training()
# train_offline_policy(**vars(args))
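For readers unfamiliar with the `train_offline_policy(**vars(args))` pattern used above, here is a minimal, self-contained sketch of how it behaves. The `Arguments` class, its default values, the `cwd` attribute, and the stub `train_offline_policy` below are all hypothetical stand-ins written for illustration; they are not the library's actual code, and the real function builds an environment and trains an agent.

```python
class Arguments:
    """Hypothetical stand-in for the settings object used in the examples above."""

    def __init__(self):
        self.env_name = "Pendulum-v0"
        self.max_total_step = int(1e5 * 2)  # assumed default, not taken from the library
        self.net_dim = 2 ** 7               # assumed default, not taken from the library
        self.cwd = None                     # filled in by init_for_training()

    def init_for_training(self):
        # Assumption: the real method prepares things such as a working directory.
        self.cwd = f"./{self.env_name}"


def train_offline_policy(env_name, max_total_step, net_dim, cwd, **_unused):
    """Hypothetical stub: only reports the settings it received."""
    return f"{env_name}: {max_total_step} steps, net_dim={net_dim}, cwd={cwd}"


args = Arguments()
args.env_name = "LunarLanderContinuous-v2"
args.max_total_step = int(1e5 * 4)
args.init_for_training()

# vars(args) is the instance's attribute dict; ** unpacks it into keyword arguments.
summary = train_offline_policy(**vars(args))
print(summary)
```

Because `vars(args)` reflects whatever attributes are set at call time, anything changed before `init_for_training()` (as in the examples above) is what the training function receives.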
from elegantrl.
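Compared with other areas of RL, there is almost no community that discusses RL for quantitative trading, so people working on it find it hard to exchange ideas.
When they do find something new, restrictions keep them from sharing it publicly, so it only gets discussed internally.
I know only one or two people who use reinforcement learning for quantitative trading, and I have not yet come across a high-quality place to discuss it.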
from elegantrl.
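I am also doing research in this area. I have been looking at FinRL recently and found that it has quite a few problems of its own, and the maintainers are not very responsive to the issues I filed. Let's all exchange ideas more.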
from elegantrl.