notadamking / rltrader Goto Github PK

View Code? Open in Web Editor NEW

1.7K 1.7K 540.0 17.33 MB

A cryptocurrency trading environment using deep reinforcement learning and OpenAI's gym

Home Page: https://discord.gg/ZZ7BGWh

License: GNU General Public License v3.0

Python 82.10% Shell 6.20% Jupyter Notebook 11.70%

rltrader's People

Contributors

Stargazers

Watchers

Forkers

damonclifford konradbachusz-zz draichi kalolimpik panteo92 w0lv3r1nix zhuzhenping sulasen fhaynes hongthana veolata johndpope kevinkelley jbdatascience laranea adenot marvin-hansen botemple tomguluson92 wjsxlb2017 lihsin08 keithofaptos devas123 mzs0207 shu13720902 lemondy willwill85 wizcap savex83 maxmatical piyushver kasparasmasiukas limitium jdc08161063 pat1952 ajatodin markcheno aanwark jpv-costa liamcurry alvinjamur njligames lukemshannonhill litch stjordanis renatoshira 0xrad7 gabev augmen asherdiamant xyicheng sarikayamehmet jstesta ssgalitsky ericmanganaro mmoravec len5ky gabrielke op1490 archenroot kwhkim tnet halasnet yahehe blackivory amrotork rootm raphyduck danlg chetanmehra semamir andresn amilkov joanmassana pratd jivebread student1304 roche-emmanuel hao507 zhouzhengji designmalachi sandy1618 talhaasmal dongjunn markintoshz itsjbe trevorbasinger leolin1229 whiran ibury08 orenbaldinger bytefan rustleman jmpf2018 groninge01 peter101101 ruben-e adin-alihodzic mkarhade quanterstudio

rltrader's Issues

Optimal Hyper-parameters

Great work! Can you please post your best hyper-parameters found by optimize.py? I want to compare your results with mine.

Optuna

I have done following :

-> created an environment for the Bitcoin-Trader-RL
-> run requirements
-> tried to execute optimize.py

import error -> ImportError: No module named optuna

tried to pip install Optuna, all requirements satisfied, see below.

what could be the problem?

(Bitcoin-Trader-RL) root@vmanager6003:~/Bitcoin-Trader-RL-master# pip3 install optuna Requirement already satisfied: optuna in /usr/local/lib/python3.6/dist-packages Requirement already satisfied: scipy in /usr/local/lib/python3.6/dist-packages (from optuna) Requirement already satisfied: cliff in /usr/local/lib/python3.6/dist-packages (from optuna) Requirement already satisfied: colorlog in /usr/local/lib/python3.6/dist-packages (from optuna) Requirement already satisfied: numpy in /usr/local/lib/python3.6/dist-packages (from optuna) Requirement already satisfied: typing in /usr/local/lib/python3.6/dist-packages (from optuna) Requirement already satisfied: sqlalchemy>=1.1.0 in /usr/local/lib/python3.6/dist-packages (from optuna) Requirement already satisfied: alembic in /usr/local/lib/python3.6/dist-packages (from optuna) Requirement already satisfied: six in /usr/lib/python3/dist-packages (from optuna) Requirement already satisfied: pandas in /usr/local/lib/python3.6/dist-packages (from optuna) Requirement already satisfied: stevedore>=1.20.0 in /usr/local/lib/python3.6/dist-packages (from cliff->optuna) Requirement already satisfied: pbr!=2.1.0,>=2.0.0 in /usr/local/lib/python3.6/dist-packages (from cliff->optuna) Requirement already satisfied: PyYAML>=3.12 in /usr/lib/python3/dist-packages (from cliff->optuna) Requirement already satisfied: PrettyTable<0.8,>=0.7.2 in /usr/local/lib/python3.6/dist-packages (from cliff->optuna) Requirement already satisfied: cmd2!=0.8.3; python_version >= "3.0" in /usr/local/lib/python3.6/dist-packages (from cliff->optuna) Requirement already satisfied: pyparsing>=2.1.0 in /usr/local/lib/python3.6/dist-packages (from cliff->optuna) Requirement already satisfied: Mako in /usr/local/lib/python3.6/dist-packages (from alembic->optuna) Requirement already satisfied: python-editor>=0.3 in /usr/local/lib/python3.6/dist-packages (from alembic->optuna) Requirement already satisfied: python-dateutil in /usr/local/lib/python3.6/dist-packages (from alembic->optuna) Requirement already satisfied: pytz>=2011k in /usr/local/lib/python3.6/dist-packages (from pandas->optuna) Requirement already satisfied: pyperclip>=1.5.27 in /usr/local/lib/python3.6/dist-packages (from cmd2!=0.8.3; python_version >= "3.0"->cliff->optuna) Requirement already satisfied: attrs>=16.3.0 in /usr/local/lib/python3.6/dist-packages (from cmd2!=0.8.3; python_version >= "3.0"->cliff->optuna) Requirement already satisfied: colorama in /usr/local/lib/python3.6/dist-packages (from cmd2!=0.8.3; python_version >= "3.0"->cliff->optuna) Requirement already satisfied: wcwidth>=0.1.7 in /usr/local/lib/python3.6/dist-packages (from cmd2!=0.8.3; python_version >= "3.0"->cliff->optuna) Requirement already satisfied: MarkupSafe>=0.9.2 in /usr/local/lib/python3.6/dist-packages (from Mako->alembic->optuna) (Bitcoin-Trader-RL) root@vmanager6003:~/Bitcoin-Trader-RL-master# python ./optimize.py Traceback (most recent call last): File "./optimize.py", line 11, in <module> import optuna ImportError: No module named optuna (Bitcoin-Trader-RL) root@vmanager6003:~/Bitcoin-Trader-RL-master#

New Data Stream with Machine learning and RL process

I think that we should add Data stream to receive continuously new updated data, Train the model on the new data, check if there is model performance improves if yes, then we should update the model.

i found interesting post about it https://medium.com/analytics-vidhya/data-streams-and-online-machine-learning-in-python-a382e9e8d06a

ImportError: numpy.core._multiarray_umath failed to import ImportError: numpy.core.umath failed to import

Hey guys!
i'm getting the following error when trying to run optimize.py and training.py.

Have someone encountered similar issues?
help will be much appreciated.

thanks in advance

ModuleNotFoundError: No module named 'numpy.core._multiarray_umath'
ImportError: numpy.core.multiarray failed to import

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "", line 980, in _find_and_load
SystemError: <class '_frozen_importlib._ModuleLockManager'> returned a result with an error set
ImportError: numpy.core._multiarray_umath failed to import
ImportError: numpy.core.umath failed to import
2019-06-14 15:02:55.452026: F tensorflow/python/lib/core/bfloat16.cc:675] Check failed: PyBfloat16_Type.tp_base != nullptr

(base) C:\Users\Nacho\Desktop\bots\Bitcoin-Trader-RL>python train.py
ModuleNotFoundError: No module named 'numpy.core._multiarray_umath'
ImportError: numpy.core.multiarray failed to import

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "", line 980, in _find_and_load
SystemError: <class '_frozen_importlib._ModuleLockManager'> returned a result with an error set
ImportError: numpy.core._multiarray_umath failed to import
ImportError: numpy.core.umath failed to import
2019-06-14 15:09:29.405024: F tensorflow/python/lib/core/bfloat16.cc:675] Check failed: PyBfloat16_Type.tp_base != nullptr

Error: `Tensor had NaN values` when trying to train

Hello,

I'm new to RL and gym. Thanks for open sourcing this.

I've installed the dependencies and tried to run with python main.py, but after a while I got a crash with the error: InvalidArgumentError (see above for traceback): Found Inf or NaN gobal norm. : Tensor had NaN values. This happened after 83400 total_timesteps.

Did I do something wrong? My machine has a 1080ti (thought it looks like it's hardly getting used for this, not sure if I need to do additional configuration to take advantage of the GPU, I don't see anything obvious about tensorflow device placement).

Thanks for any help you can provide.

Performance after date-issue-fix

So i'm a little confused with the repo right now and i'd like to understand what i'm working with since i'm not really achieving the results that were mentioned in the article.

#28 pointed out a possible problem with the data's integrity and i'd like to ask if anyone was able to recreate a profitable agent since. Although in that issue people have spoken about achieving pretty good results with the parameters that @notadamking has shared in #16 i'd like to confirm this since i was not able to replicate, all my agents fail and go into bankruptcy.

Following that question has anyone gotten to run optimize.py since that same bug fix, and if so would they mind sharing the optimal parameters to avoid re-running the said script for time-saving purposes.

I tried training my agents for 20 sessions but that didn't seem to be enough, should it?

Live trading implementation

Want to ask you if any one already integrated the code with any exchanger, as im planning to link it today or tomorrow with Bitmex to start testing how its will done.

I need to know if any one already done it? if yes how is the result ?

There is any proposal ?

Found Inf or NaN global norm. : Tensor had Inf values

While the optimize.py continue running, I observed one exception, but the process continue...

[W 2019-06-08 17:58:27,948] Setting status of trial#14 as TrialState.FAIL because of the following error: InvalidArgumentError()
Traceback (most recent call last):
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
    return fn(*args)
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Found Inf or NaN global norm. : Tensor had Inf values
	 [[{{node loss/VerifyFinite/CheckNumerics}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/optuna/study.py", line 399, in _run_trial
    result = func(trial)
  File "optimize.py", line 88, in optimize_agent
    model.learn(evaluation_interval)
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/stable_baselines/ppo2/ppo2.py", line 326, in learn
    writer=writer, states=mb_states))
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/stable_baselines/ppo2/ppo2.py", line 257, in _train_step
    td_map)
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 929, in run
    run_metadata_ptr)
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
    run_metadata)
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Found Inf or NaN global norm. : Tensor had Inf values
	 [[node loss/VerifyFinite/CheckNumerics (defined at /home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/stable_baselines/ppo2/ppo2.py:175) ]]

Caused by op 'loss/VerifyFinite/CheckNumerics', defined at:
  File "/usr/lib64/python3.6/threading.py", line 884, in _bootstrap
    self._bootstrap_inner()
  File "/usr/lib64/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/usr/lib64/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib64/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/optuna/study.py", line 357, in func_child_thread
    self._run_trial(func, catch)
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/optuna/study.py", line 399, in _run_trial
    result = func(trial)
  File "optimize.py", line 81, in optimize_agent
    tensorboard_log="./tensorboard", **model_params)
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/stable_baselines/ppo2/ppo2.py", line 93, in __init__
    self.setup_model()
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/stable_baselines/ppo2/ppo2.py", line 175, in setup_model
    grads, _grad_norm = tf.clip_by_global_norm(grads, self.max_grad_norm)
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/tensorflow/python/ops/clip_ops.py", line 271, in clip_by_global_norm
    "Found Inf or NaN global norm.")
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/tensorflow/python/ops/numerics.py", line 44, in verify_tensor_all_finite
    return verify_tensor_all_finite_v2(t, msg, name)
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/tensorflow/python/ops/numerics.py", line 62, in verify_tensor_all_finite_v2
    verify_input = array_ops.check_numerics(x, message=message)
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 919, in check_numerics
    "CheckNumerics", tensor=tensor, message=message, name=name)
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
    op_def=op_def)
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1801, in __init__
    self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): Found Inf or NaN global norm. : Tensor had Inf values
	 [[node loss/VerifyFinite/CheckNumerics (defined at /home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/stable_baselines/ppo2/ppo2.py:175) ]]

sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) database is locked

While on 32 thread system I set numpy task from 4 to 32 and started getting:

[W 2019-06-08 15:21:37,332] Setting status of trial#10 as TrialState.FAIL because of the following error: OperationalError('(sqlite3.OperationalError) database is locked',)
Traceback (most recent call last):
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1244, in _execute_context
    cursor, statement, parameters, context
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 550, in do_execute
    cursor.execute(statement, parameters)
sqlite3.OperationalError: database is locked

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/optuna/study.py", line 399, in _run_trial
    result = func(trial)
  File "optimize.py", line 79, in optimize_agent
    model_params = optimize_ppo2(trial)
  File "optimize.py", line 62, in optimize_ppo2
    'n_steps': int(trial.suggest_loguniform('n_steps', 16, 2048)),
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/optuna/trial.py", line 207, in suggest_loguniform
    return self._suggest(name, distribution)
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/optuna/trial.py", line 440, in _suggest
    distribution)
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/optuna/samplers/tpe/sampler.py", line 73, in sample
    observation_pairs = storage.get_trial_param_result_pairs(study_id, param_name)
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/optuna/storages/base.py", line 209, in get_trial_param_result_pairs
    all_trials = self.get_all_trials(study_id)
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/optuna/storages/rdb/storage.py", line 497, in get_all_trials
    trials = self._get_all_trials_without_cache(study_id)
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/optuna/storages/rdb/storage.py", line 519, in _get_all_trials_without_cache
    study = models.StudyModel.find_or_raise_by_id(study_id, session)
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/optuna/storages/rdb/models.py", line 56, in find_or_raise_by_id
    study = cls.find_by_id(study_id, session)
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/optuna/storages/rdb/models.py", line 48, in find_by_id
    study = session.query(cls).filter(cls.study_id == study_id).one_or_none()
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/sqlalchemy/orm/query.py", line 3250, in one_or_none
    ret = list(self)
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/sqlalchemy/orm/query.py", line 3323, in __iter__
    return self._execute_and_instances(context)
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/sqlalchemy/orm/query.py", line 3348, in _execute_and_instances
    result = conn.execute(querycontext.statement, self._params)
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 988, in execute
    return meth(self, multiparams, params)
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/sqlalchemy/sql/elements.py", line 287, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1107, in _execute_clauseelement
    distilled_params,
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1248, in _execute_context
    e, statement, parameters, cursor, context
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1466, in _handle_dbapi_exception
    util.raise_from_cause(sqlalchemy_exception, exc_info)
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 383, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 128, in reraise
    raise value.with_traceback(tb)
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1244, in _execute_context
    cursor, statement, parameters, context
  File "/home/zangetsu/proj/prometheus-core/demo/demo-12-bitcoin-trading-agent/venv/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 550, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) database is locked
[SQL: SELECT studies.study_id AS studies_study_id, studies.study_name AS studies_study_name, studies.direction AS studies_direction 
FROM studies 
WHERE studies.study_id = ?]
[parameters: (1,)]
(Background on this error at: http://sqlalche.me/e/e3q8)

Its because of the sqlite limited concurency:

SQLite is meant to be a lightweight database, and thus can't support a high level of concurrency. OperationalError: database is locked errors indicate that your application is experiencing more concurrency than sqlite can handle in default configuration. This error means that one thread or process has an exclusive lock on the database connection and another thread timed out waiting for the lock the be released.

Python's SQLite wrapper has a default timeout value that determines how long the second thread is allowed to wait on the lock before it times out and raises the OperationalError: database is locked error.

If you're getting this error, you can solve it by:

Switching to another database backend. At a certain point SQLite becomes too "lite" for real-world applications, and these sorts of concurrency errors indicate you've reached that point.

Rewriting your code to reduce concurrency and ensure that database transactions are short-lived.

Increase the default timeout value by setting the timeout database option.

So I created local instance of Postgresql:

create database hyperparamdb;
create user hyperparamdb with encrypted password 'hyperparamdb';
grant all privileges on database hyperparamdb to hyperparamdb;

postgresql://hyperparamdb:hyperparamdb@localhost/hyperparamdb


CREATE ROLE zangetsu SUPERUSER;
ALTER ROLE "zangetsu" WITH LOGIN;

And I configured optane with new URL everywhere:
params_db_file = 'postgresql://hyperparamdb:hyperparamdb@localhost/hyperparamdb'

Postgres handle quite nicely this, so it works on 40 tasks now.

test issue

How to visualize test? When I run python ./test.py, nothing output. Thanks!

Understanding _current_price()

In _current_price method in BitcoinEnvironment (sorry can’t link on mobile) we get the close value for some number of steps ahead.

But isn’t the current price supposed to be the current one at which we can buy or sell?

Use of training data and multiple episodes

I have the following two questions about optimize.py :

The model appears to be trained always in the first portion of train_df, from 0 to len(len(train_df) / n_evaluations. Are the other portions of train_df used in the loop over n_evaluations?

Why is an evaluation of n_test_episodes required for calculating a mean reward?

Thank you.

why was _reset_session removed from the environment?

In the earlier version of the environment, there was a

def _reset_session(self)

that allowed the environment to start at a random point in the dataframe to get more training samples. why was this removed for the newest version of the btc environment?

requirements.txt

(base) C:\Users\Nacho\Desktop\bots\Bitcoin-Trader-RL>conda install --file requirements.txt
Collecting package metadata: done
Solving environment: failed

PackagesNotFoundError: The following packages are not available from current channels:

statsmodels==0.10.0rc2
stable_baselines
gym
sklearn
empyrical
optuna
ta

Current channels:

To search for alternate channels that may provide the conda package you're
looking for, navigate to

https://anaconda.org

and use the search bar at the top of the page.

SARIMAX prediction LinAlgError: Non-positive definite forecast

[W 2019-06-07 12:02:31,288] Setting status of trial#6 as TrialState.FAIL because of the following error: LinAlgError('Non-positive-definite forecast error covariance matrix encountered at period 1')
Traceback (most recent call last):
File "_inversions.pyx", line 1107, in statsmodels.tsa.statespace._filters._inversions.zinverse_univariate
ZeroDivisionError: float division

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/optuna/study.py", line 399, in _run_trial
result = func(trial)
File "./optimize.py", line 80, in optimize_agent
model.learn(evaluation_interval)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/stable_baselines/ppo2/ppo2.py", line 277, in learn
runner = Runner(env=self.env, model=self, n_steps=self.n_steps, gamma=self.gamma, lam=self.lam)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/stable_baselines/ppo2/ppo2.py", line 399, in init
super().init(env=env, model=model, n_steps=n_steps)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/stable_baselines/common/runners.py", line 19, in init
self.obs[:] = env.reset()
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/stable_baselines/common/vec_env/dummy_vec_env.py", line 45, in reset
obs = self.envs[env_idx].reset()
File "/Users/XXXX/Downloads/Bitcoin-Trader-RL-master/env/BitcoinTradingEnv.py", line 182, in reset
return self._next_observation()
File "/Users/XXXX/Downloads/Bitcoin-Trader-RL-master/env/BitcoinTradingEnv.py", line 81, in _next_observation
model_fit = forecast_model.fit(method='bfgs', disp=False)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/statsmodels/tsa/statespace/mlemodel.py", line 469, in fit
skip_hessian=True, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/statsmodels/base/model.py", line 466, in fit
full_output=full_output)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/statsmodels/base/optimizer.py", line 191, in _fit
hess=hessian)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/statsmodels/base/optimizer.py", line 327, in _fit_bfgs
disp=disp, retall=retall, callback=callback)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/scipy/optimize/optimize.py", line 942, in fmin_bfgs
res = _minimize_bfgs(f, x0, args, fprime, callback=callback, **opts)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/scipy/optimize/optimize.py", line 996, in _minimize_bfgs
gfk = myfprime(x0)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/scipy/optimize/optimize.py", line 326, in function_wrapper
return function(*(wrapper_args + args))
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/statsmodels/base/model.py", line 451, in score
return -self.score(params, *args) / nobs
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/statsmodels/tsa/statespace/mlemodel.py", line 1077, in score
score = self._score_complex_step(params, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/statsmodels/tsa/statespace/mlemodel.py", line 944, in _score_complex_step
kwargs=kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/statsmodels/tools/numdiff.py", line 202, in approx_fprime_cs
for i, ih in enumerate(increments)]
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/statsmodels/tools/numdiff.py", line 202, in
for i, ih in enumerate(increments)]
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/statsmodels/tsa/statespace/mlemodel.py", line 646, in loglike
loglike = self.ssm.loglike(complex_step=complex_step, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/statsmodels/tsa/statespace/kalman_filter.py", line 825, in loglike
kfilter = self._filter(**kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/statsmodels/tsa/statespace/kalman_filter.py", line 750, in _filter
kfilter()
File "_kalman_filter.pyx", line 3632, in statsmodels.tsa.statespace._kalman_filter.zKalmanFilter.call
File "_kalman_filter.pyx", line 3698, in statsmodels.tsa.statespace._kalman_filter.zKalmanFilter.next
File "_inversions.pyx", line 1109, in statsmodels.tsa.statespace._filters._inversions.zinverse_univariate
numpy.linalg.LinAlgError: Non-positive-definite forecast error covariance matrix encountered at period 1

Improve Utilization of GPU

This library achieves very high success rates, though it takes a very long time to optimize and train. This could be improved if we could figure out a way to utilize the GPU more during optimization/training, so the CPU can be less of a bottleneck. Currently, the CPU is being used for most of the intermediate environment calculations, while the GPU is used within the PPO2 algorithm during policy optimization.

I am currently optimizing/training on the following hardware:

AMD Threadripper 1920X 12 Core (24 Thread) CPU
Nvidia RTX 2080 8GB GPU
16 GB 3000 Mhz RAM

The bottleneck on my system is definitely the CPU, which is surprising as this library takes advantage of the multi-threaded benefits of the Threadripper, and my GPU is staying around 1-10% utilization. I have some ideas on how this could be improved, but would like to start a conversation.

Increase the size of the policy network (i.e. increase the number of hidden layers or increase the number of nodes in each layer)
Do less work in each training loop, so the GPU loop is called more often.

I would love to hear what you guys think. Any ideas or knowledge is welcome to be shared here.

How to avoid overtrading?

I ran your code and found the trading (buy/sell) frequency were higher than 50%. Maybe we can think the agent is trying to explore the environment, but overtrading caused a high transaction cost and made the net worth shrank quickly.

Any discussion is welcome.

Issue running optimize

(tf_gpu) C:\Users\idf\Documents\python\Bitcoin-Trader-RL>python ./test.py
Traceback (most recent call last):
File "./test.py", line 5, in
from stable_baselines.common.policies import MlpLnLstmPolicy
File "C:\Users\Anaconda3\envs\tf_gpu\lib\site-packages\stable_baselines_init_.py", line 4, in
from stable_baselines.ddpg import DDPG
File "C:\Users\Anaconda3\envs\tf_gpu\lib\site-packages\stable_baselines\ddpg_init_.py", line 1, in
from stable_baselines.ddpg.ddpg import DDPG
File "C:\Users\Anaconda3\envs\tf_gpu\lib\site-packages\stable_baselines\ddpg\ddpg.py", line 12, in
from mpi4py import MPI
ImportError: DLL load failed: The specified procedure could not be found.

(tf_gpu) C:\Users\idf\Documents\python\Bitcoin-Trader-RL>

Reasonable balances for trading

Currently, agents buy and sell using unrealistic USD and BTC balances.
For instance, during a debugging session I saw one agent executing this kind of trades:

Buying BTC
balance USD: 0.004812330813060336 btc held: 2.5835011089138864
Net worth: 17707.373082848768
=============
Buying BTC
balance USD: 0.0032082205420402242 btc held: 2.5835013414545522
Net worth: 17777.0501037559

It doesn't look like there is a logical lower bound on the USD and BTC that can be used in a trade. In real life one wouldn't try to (and couldn't) buy BTC with $0.0016.

Also, using correct number of decimals (i.e., 8 for BTC) is important for precision. That might be another bug actually.

Sortino agent overtrading

The blog post suggests the Sortino reward function avoids over-trading, but on both the original and the fixed date dataset I see wild trading rates close to 50% that invariably result in a loss of more than to 50% of the starting budget.

This is by running train.py right out of git clone. Not sure if a bug or if I'm missing something.

Potential issue with scaling in _next_observation

When scaling the technical analysis time series features in BitcoinTradingEnv.py, it seems min-max scaling is applied to scaled in a "streaming" fashion - that is, at every iteration when new data comes in.

The problem with that is, the minimum and maximum of an array of the last x values will not necessarily be the minimum and maximum of the entire training dataset, which I believe it should be for scaling.

For example, imagine you had the most recent 5 values of the Close price

[9.98, 9.71, 9.31, 8.59, 8.93]

The current implementation will scale this vector only by the minimum of 8.59 and the maximum of 9.98. However, the entire Close price training vector may have a minimum far less than 8.59 and a maximum far larger than 9.98.

I believe scaling should be learned from, and then applied to, the entire training dataset. Then the learned scaler can be applied to new testing / live data as it streams in as outlined here. If new values come in that are smaller / larger than the training data min / max, then the scaler would need to be trained again with the new data.

TensorFlow Fail

After resolving all the issues with requirements and tensorflow I have this error afte running opthymize.pi

(C:\Users\USER-KIN-00381\Anaconda3\envs\crypt) D:\BTR>python ./optimize.py
[I 2019-06-12 15:42:39,174] A new study created with name: ppo2_sortino
WARNING:tensorflow:From C:\Users\USER-KIN-00381\Anaconda3\envs\crypt\lib\site-packages\stable_baselines\common\policies.py:420: flatten (from tensorflow.python.layers.core) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.flatten instead.
WARNING:tensorflow:From C:\Users\USER-KIN-00381\Anaconda3\envs\crypt\lib\site-packages\tensorflow\python\framework\op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From C:\Users\USER-KIN-00381\Anaconda3\envs\crypt\lib\site-packages\tensorflow\python\ops\math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
WARNING:tensorflow:From C:\Users\USER-KIN-00381\Anaconda3\envs\crypt\lib\site-packages\tensorflow\python\ops\math_grad.py:102: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.
[W 2019-06-12 15:44:03,396] Setting status of trial#2 as TrialState.FAIL because of the following error: NotFoundError()
Traceback (most recent call last):
File "C:\Users\USER-KIN-00381\Anaconda3\envs\crypt\lib\site-packages\optuna\study.py", line 399, in _run_trial
result = func(trial)
File "./optimize.py", line 89, in optimize_agent
model.learn(evaluation_interval)
File "C:\Users\USER-KIN-00381\Anaconda3\envs\crypt\lib\site-packages\stable_baselines\ppo2\ppo2.py", line 273, in learn
with SetVerbosity(self.verbose), TensorboardWriter(self.graph, self.tensorboard_log, tb_log_name, new_tb_log)
File "C:\Users\USER-KIN-00381\Anaconda3\envs\crypt\lib\site-packages\stable_baselines\common\base_class.py", line 693, in enter
self.writer = tf.summary.FileWriter(save_path, graph=self.graph)
File "C:\Users\USER-KIN-00381\Anaconda3\envs\crypt\lib\site-packages\tensorflow\python\summary\writer\writer.py", line 367, in init
filename_suffix)
File "C:\Users\USER-KIN-00381\Anaconda3\envs\crypt\lib\site-packages\tensorflow\python\summary\writer\event_file_writer.py", line 67, in init
gfile.MakeDirs(self._logdir)
File "C:\Users\USER-KIN-00381\Anaconda3\envs\crypt\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 442, in recursive_create_dir
recursive_create_dir_v2(dirname)
File "C:\Users\USER-KIN-00381\Anaconda3\envs\crypt\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 458, in recursive_create_dir_v2
pywrap_tensorflow.RecursivelyCreateDir(compat.as_bytes(path), status)
File "C:\Users\USER-KIN-00381\Anaconda3\envs\crypt\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 528, in exit
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: Failed to create a directory: ./tensorboard\PPO2_1; No such file or directory
[W 2019-06-12 15:44:32,769] Setting status of trial#1 as TrialState.FAIL because of the following error: NotFoundError()
Traceback (most recent call last):
File "C:\Users\USER-KIN-00381\Anaconda3\envs\crypt\lib\site-packages\optuna\study.py", line 399, in _run_trial
result = func(trial)
File "./optimize.py", line 89, in optimize_agent
model.learn(evaluation_interval)
File "C:\Users\USER-KIN-00381\Anaconda3\envs\crypt\lib\site-packages\stable_baselines\ppo2\ppo2.py", line 273, in learn
with SetVerbosity(self.verbose), TensorboardWriter(self.graph, self.tensorboard_log, tb_log_name, new_tb_log)
File "C:\Users\USER-KIN-00381\Anaconda3\envs\crypt\lib\site-packages\stable_baselines\common\base_class.py", line 693, in enter
self.writer = tf.summary.FileWriter(save_path, graph=self.graph)
File "C:\Users\USER-KIN-00381\Anaconda3\envs\crypt\lib\site-packages\tensorflow\python\summary\writer\writer.py", line 367, in init
filename_suffix)
File "C:\Users\USER-KIN-00381\Anaconda3\envs\crypt\lib\site-packages\tensorflow\python\summary\writer\event_file_writer.py", line 67, in init
gfile.MakeDirs(self._logdir)
File "C:\Users\USER-KIN-00381\Anaconda3\envs\crypt\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 442, in recursive_create_dir
recursive_create_dir_v2(dirname)
File "C:\Users\USER-KIN-00381\Anaconda3\envs\crypt\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 458, in recursive_create_dir_v2
pywrap_tensorflow.RecursivelyCreateDir(compat.as_bytes(path), status)
File "C:\Users\USER-KIN-00381\Anaconda3\envs\crypt\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 528, in exit
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: Failed to create a directory: ./tensorboard\PPO2_1; No such file or directory

Adding third action: hold

Right now there are only two possible actions: buy and sell. Should we add a third which would be HOLD or do nothing? What if our agent has no position and the price is headed down?

Just thinking out loud here.

There is a confidence interval though, right? So we may not want to make a trade if confidence interval isn't of a sufficient magnitude, so maybe we just have a sort of implied third action instead of an explicit third action of HOLD:

if abs(confidence_interval) > x:
    placeTrade()

Feel free to close if this is a dumb idea 😛

Speed and memory requirements optimizing.

Hi, i am running new optimization against a dataset in which the sorting is done correctly (with regard to the AM/PM bug in sorting). And i noticed that when i run optimization with
1 job, 1 trial, 1 episode and 1 eval. It takes about 13 minutes and 4G of memory to complete the cycle (Added some logging to the optimize.py) :

DEBUG:main: input_data_file: data/coinbase_hourly.csv
DEBUG:main: Dropping 'Symbol' column from dataframe
DEBUG:main: Converting 'Date' column to type datetime
DEBUG:main: Sorting dataframe by 'Date' column
DEBUG:main: Adding indicators to dataframe
INFO:main: Reward strategy : sortino
INFO:main: Input data file : data/coinbase_hourly.csv
INFO:main: Parames db file : sqlite:///params.db
INFO:main: n_jobs = 1
INFO:main: n_trials = 1
INFO:main: n_test_episodes = 1
INFO:main: n_evaluations = 1
INFO:main: Total number of records : 13144
INFO:main: Number of training recs : 13144
INFO:main: Num. of evaluation recs : 10515
...
DEBUG:main: Running 1 episodes
DEBUG:main: Started optimization : 2019-06-17 12-52-49
DEBUG:main: Finished optimization: 2019-06-17 13:06:26
DEBUG:main: Total trial time : 0:13:36.920658
INFO:optuna.study:Finished trial#2 resulted in value: 1886.0362548828125. Current best value is 0.0 with parameters: {'cliprang$
Number of finished trials: 1
Best trial:
Value: 0.0
Params:
cliprange: 0.34253769458062744
confidence_interval: 0.8379242988756598
ent_coef: 3.63811777280501e-07
forecast_len: 1.1511618552611296
gamma: 0.9913080137473371
lam: 0.9312156811763672
learning_rate: 0.027396718735216347
n_steps: 122.04674817782056
noptepochs: 1.4269422834689458
DEBUG:main: Total run time (1 jobs, 1 trials, 1 evals, 1 episodes) : 0:13:37.238973

When i run the same script with n_trials = 3, i would excpect that it takes 3*13.37 minutes. However after about 1 hour, my computer froze with 8GB ram filled up, and 7.5G swapping to disk. (no other processes running).

That leads me to two questions:

When running multiple trials on modest hardware (4-core i7 , 8GB RAM , no GPU) would it be wise to loop through optimize() :

for i in range(0..number_of_trials):
optimize()

with n_trials in optimize() defined as 1 instead of:

 study.optimize(optimize_agent, n_trials=n_trials, n_jobs=n_jobs)

Since the results are just added tot params.db

What would a decent set of hardware be to run say 100 trials. And how many trials would it take to return a reasonably valid set of hyperparameters?

Results vary wildly between hourly and daily datasets

Is this happening for anyone else?

The hourly dataset returns a ~1000% profit, but when I run the trained agent against the daily dataset I see a return of -15-20%

It doesn't appear to be a code issue as everything runs correctly, but I'm curious if anyone else is seeing this variance in performance

That line "df = df.sort_values([‘Date’])" didn’t quite work

origin link: https://link.medium.com/4oFNMv1TxX
Hmm. I got some crazy numbers (like hundreds percent). Then tried to debug your code and found that df = df.sort_values([‘Date’]) didn’t quite work for me (maybe you have some fancy pandas?) In my case it sorted dates alphabetically like

2019–05–16 02-AM
2019–05–16 02-PM
2019–05–16 03-AM
2019–05–16 03-PM

I tried simple fix (just inverted rows as they are already in order in a csv file) and then the same model started to show values like -11% and such.

But I think I should fix the trainer also and retrain the model before retesting.

optimize error

hey guys!
When runnng the optimizer i get the following error:
any ideas?

[I 2019-06-14 15:34:12,654] A new study created with name: ppo2_sortino
WARNING:tensorflow:From C:\Users\Nacho\Anaconda3\lib\site-packages\stable_baselines\common\policies.py:420: flatten (from tensorflow.python.layers.core) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.flatten instead.
WARNING:tensorflow:From C:\Users\Nacho\Anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From C:\Users\Nacho\Anaconda3\lib\site-packages\tensorflow\python\ops\math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
WARNING:tensorflow:From C:\Users\Nacho\Anaconda3\lib\site-packages\tensorflow\python\ops\math_grad.py:102: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.
[I 2019-06-14 17:27:07,947] Finished trial#1 resulted in value: -19551.509765625. Current best value is -19551.509765625 with parameters: {'cliprange': 0.38113219124165043, 'confidence_interval': 0.8556293119927123, 'ent_coef': 0.00013022093535861756, 'forecast_len': 3.0216790310549677, 'gamma': 0.9299473170995376, 'lam': 0.8620846641431216, 'learning_rate': 0.008580823341236382, 'n_steps': 181.9906757748466, 'noptepochs': 19.321405513375872}.
[I 2019-06-14 17:39:50,201] Finished trial#3 resulted in value: -8186.02001953125. Current best value is -19551.509765625 with parameters: {'cliprange': 0.38113219124165043, 'confidence_interval': 0.8556293119927123, 'ent_coef': 0.00013022093535861756, 'forecast_len': 3.0216790310549677, 'gamma': 0.9299473170995376, 'lam': 0.8620846641431216, 'learning_rate': 0.008580823341236382, 'n_steps': 181.9906757748466, 'noptepochs': 19.321405513375872}.
[I 2019-06-14 17:49:38,959] Finished trial#2 resulted in value: 21770.931640625. Current best value is -19551.509765625 with parameters: {'cliprange': 0.38113219124165043, 'confidence_interval': 0.8556293119927123, 'ent_coef': 0.00013022093535861756, 'forecast_len': 3.0216790310549677, 'gamma': 0.9299473170995376, 'lam': 0.8620846641431216, 'learning_rate': 0.008580823341236382, 'n_steps': 181.9906757748466, 'noptepochs': 19.321405513375872}.
[W 2019-06-14 19:18:15,197] Setting status of trial#0 as TrialState.FAIL because of the following error: ResourceExhaustedError()
Traceback (most recent call last):
File "C:\Users\Nacho\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1334, in _do_call
return fn(*args)
File "C:\Users\Nacho\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1319, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "C:\Users\Nacho\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1407, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[64,1024] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node loss/gradients/train_model/model/MatMul_236_grad/MatMul_1}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "C:\Users\Nacho\Anaconda3\lib\site-packages\optuna\study.py", line 399, in _run_trial
result = func(trial)
File "./optimize.py", line 90, in optimize_agent
model.learn(evaluation_interval)
File "C:\Users\Nacho\Anaconda3\lib\site-packages\stable_baselines\ppo2\ppo2.py", line 369, in learn
cliprange_vf=cliprange_vf_now))
File "C:\Users\Nacho\Anaconda3\lib\site-packages\stable_baselines\ppo2\ppo2.py", line 297, in _train_step
td_map)
File "C:\Users\Nacho\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 929, in run
run_metadata_ptr)
File "C:\Users\Nacho\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1152, in _run
feed_dict_tensor, options, run_metadata)
File "C:\Users\Nacho\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1328, in _do_run
run_metadata)
File "C:\Users\Nacho\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1348, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[64,1024] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[node loss/gradients/train_model/model/MatMul_236_grad/MatMul_1 (defined at C:\Users\Nacho\Anaconda3\lib\site-packages\stable_baselines\ppo2\ppo2.py:205) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

Caused by op 'loss/gradients/train_model/model/MatMul_236_grad/MatMul_1', defined at:
File "C:\Users\Nacho\Anaconda3\lib\threading.py", line 885, in _bootstrap
self._bootstrap_inner()
File "C:\Users\Nacho\Anaconda3\lib\threading.py", line 917, in _bootstrap_inner
self.run()
File "C:\Users\Nacho\Anaconda3\lib\threading.py", line 865, in run
self._target(*self._args, **self._kwargs)
File "C:\Users\Nacho\Anaconda3\lib\multiprocessing\pool.py", line 121, in worker
result = (True, func(*args, **kwds))
File "C:\Users\Nacho\Anaconda3\lib\site-packages\optuna\study.py", line 357, in func_child_thread
self._run_trial(func, catch)
File "C:\Users\Nacho\Anaconda3\lib\site-packages\optuna\study.py", line 399, in _run_trial
result = func(trial)
File "./optimize.py", line 83, in optimize_agent
tensorboard_log=Path("./tensorboard").name, **model_params)
File "C:\Users\Nacho\Anaconda3\lib\site-packages\stable_baselines\ppo2\ppo2.py", line 100, in init
self.setup_model()
File "C:\Users\Nacho\Anaconda3\lib\site-packages\stable_baselines\ppo2\ppo2.py", line 205, in setup_model
grads = tf.gradients(loss, self.params)
File "C:\Users\Nacho\Anaconda3\lib\site-packages\tensorflow\python\ops\gradients_impl.py", line 664, in gradients
unconnected_gradients)
File "C:\Users\Nacho\Anaconda3\lib\site-packages\tensorflow\python\ops\gradients_impl.py", line 965, in _GradientsHelper
lambda: grad_fn(op, *out_grads))
File "C:\Users\Nacho\Anaconda3\lib\site-packages\tensorflow\python\ops\gradients_impl.py", line 420, in _MaybeCompile
return grad_fn() # Exit early
File "C:\Users\Nacho\Anaconda3\lib\site-packages\tensorflow\python\ops\gradients_impl.py", line 965, in
lambda: grad_fn(op, *out_grads))
File "C:\Users\Nacho\Anaconda3\lib\site-packages\tensorflow\python\ops\math_grad.py", line 1132, in _MatMulGrad
grad_b = gen_math_ops.mat_mul(a, grad, transpose_a=True)
File "C:\Users\Nacho\Anaconda3\lib\site-packages\tensorflow\python\ops\gen_math_ops.py", line 5630, in mat_mul
name=name)
File "C:\Users\Nacho\Anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 788, in _apply_op_helper
op_def=op_def)
File "C:\Users\Nacho\Anaconda3\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "C:\Users\Nacho\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 3300, in create_op
op_def=op_def)
File "C:\Users\Nacho\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 1801, in init
self._traceback = tf_stack.extract_stack()

...which was originally created as op 'train_model/model/MatMul_236', defined at:
File "C:\Users\Nacho\Anaconda3\lib\threading.py", line 885, in _bootstrap
self._bootstrap_inner()
[elided 6 identical lines from previous traceback]
File "C:\Users\Nacho\Anaconda3\lib\site-packages\stable_baselines\ppo2\ppo2.py", line 100, in init
self.setup_model()
File "C:\Users\Nacho\Anaconda3\lib\site-packages\stable_baselines\ppo2\ppo2.py", line 138, in setup_model
reuse=True, **self.policy_kwargs)
File "C:\Users\Nacho\Anaconda3\lib\site-packages\stable_baselines\common\policies.py", line 701, in init
layer_norm=True, feature_extraction="mlp", **_kwargs)
File "C:\Users\Nacho\Anaconda3\lib\site-packages\stable_baselines\common\policies.py", line 427, in init
layer_norm=layer_norm)
File "C:\Users\Nacho\Anaconda3\lib\site-packages\stable_baselines\a2c\utils.py", line 231, in lstm
+ _ln(tf.matmul(hidden, weight_h), gain_h, bias_h) + bias
File "C:\Users\Nacho\Anaconda3\lib\site-packages\tensorflow\python\ops\math_ops.py", line 2455, in matmul
a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name)
File "C:\Users\Nacho\Anaconda3\lib\site-packages\tensorflow\python\ops\gen_math_ops.py", line 5630, in mat_mul
name=name)
File "C:\Users\Nacho\Anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 788, in _apply_op_helper
op_def=op_def)
File "C:\Users\Nacho\Anaconda3\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "C:\Users\Nacho\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 3300, in create_op
op_def=op_def)
File "C:\Users\Nacho\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 1801, in init
self._traceback = tf_stack.extract_stack()

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[64,1024] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[node loss/gradients/train_model/model/MatMul_236_grad/MatMul_1 (defined at C:\Users\Nacho\Anaconda3\lib\site-packages\stable_baselines\ppo2\ppo2.py:205) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

ValueError: The provided tag was already used for this event type

I am running this code in google colab to take advantage of the GPU's, but every time I run it, I get the error in the title.

At the beginning of the notebook, I install all the dependancies

!git clone https://github.com/notadamking/Bitcoin-Trader-RL.git
!pip install ta
!pip install empyrical
!pip install optuna
!pip install gym
!pip install stable-baselines
!pip install scikit-learn
!pip install tensorflow-gpu

....

!python3 optimize.py


[I 2019-06-09 08:56:20,043] A new study created with name: ppo2_sortino
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/stable_baselines/common/input.py:26: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/stable_baselines/common/policies.py:195: flatten (from tensorflow.python.layers.core) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.flatten instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/math_grad.py:102: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.
[W 2019-06-09 08:58:47,691] Setting status of trial#0 as TrialState.FAIL because of the following error: ValueError('The provided tag was already used for this event type',)
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/optuna/study.py", line 399, in _run_trial
    result = func(trial)
  File "optimize.py", line 88, in optimize_agent
    model.learn(evaluation_interval)
  File "/usr/local/lib/python3.6/dist-packages/stable_baselines/ppo2/ppo2.py", line 307, in learn
    writer=writer, states=mb_states))
  File "/usr/local/lib/python3.6/dist-packages/stable_baselines/ppo2/ppo2.py", line 245, in _train_step
    writer.add_run_metadata(run_metadata, 'step%d' % (update * update_fac))
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/summary/writer/writer.py", line 264, in add_run_metadata
    raise ValueError("The provided tag was already used for this event type")
ValueError: The provided tag was already used for this event type

The notebook crashes shortly after a 3-5 of the same errors are raised for every bayesian optimisation trial. I am unsure if this is an optuna error or a tensorflow error. I couldn't find much info online regarding the error. Any idea what may be causing this and how to fix it?

Improve documentation and bootstrapping of new agents

Traceback (most recent call last):
File "/Users/XXXX/Downloads/Bitcoin-Trader-RL-master/train.py", line 15, in
storage='sqlite:///params.db')
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/optuna/study.py", line 557, in load_study
return Study(study_name=study_name, storage=storage, sampler=sampler, pruner=pruner)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/optuna/study.py", line 78, in init
self.study_id = self.storage.get_study_id_from_name(study_name)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/optuna/storages/rdb/storage.py", line 165, in get_study_id_from_name
study = models.StudyModel.find_or_raise_by_name(study_name, session)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/optuna/storages/rdb/models.py", line 76, in find_or_raise_by_name
raise ValueError(NOT_FOUND_MSG)
ValueError: Record does not exist.

GPU Underutilization

I am currently optimizing/training on the following hardware:

AMD Threadripper 1920X 12 Core (24 Thread) CPU
Nvidia RTX 2080 8GB GPU
16 GB 3000 Mhz RAM

Increase the size of the policy network (i.e. increase the number of hidden layers or increase the number of nodes in each layer)
Do less work in each training loop, so the GPU loop is called more often.

I would love to hear what you guys think. Any ideas or knowledge is welcome to be shared here.

Memory Leak: std::bad_alloc error during optimize

I've gotten this a couple times now after python ./optimize.py:

Arch Linux
Python 3.7.3

...
[I 2019-06-11 20:16:49,493] Setting status of trial#51 as TrialState.PRUNED. 
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
[computer:74848] *** Process received signal ***
[computer:74848] Signal: Aborted (6)
[computer:74848] Signal code:  (-6)
[computer:74848] [ 0] /usr/lib/libpthread.so.0(+0x124d0)[0x7ff4e9dce4d0]
[computer:74848] [ 1] /usr/lib/libc.so.6(gsignal+0x10f)[0x7ff4e9c2e82f]
[computer:74848] [ 2] /usr/lib/libc.so.6(abort+0x125)[0x7ff4e9c19672]
[computer:74848] [ 3] /usr/lib/libstdc++.so.6(+0x8a58e)[0x7ff4a95f358e]
[computer:74848] [ 4] /usr/lib/libstdc++.so.6(+0x90e0a)[0x7ff4a95f9e0a]
[computer:74848] [ 5] /usr/lib/libstdc++.so.6(+0x90e67)[0x7ff4a95f9e67]
[computer:74848] [ 6] /usr/lib/libstdc++.so.6(+0x910bc)[0x7ff4a95fa0bc]
[computer:74848] [ 7] /usr/lib/libstdc++.so.6(+0x91647)[0x7ff4a95fa647]
[computer:74848] [ 8] /usr/lib/libstdc++.so.6(_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE9_M_assignERKS4_+0xb0)[0x7ff4a9692760]
[computer:74848] [ 9] /usr/lib/libprotobuf.so.18(_ZNK6google8protobuf8internal26GeneratedMessageReflection9SetStringEPNS0_7MessageEPKNS0_15FieldDescriptorERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x1be)[0x7ff41f
a99e6e]
[computer:74848] [10] /usr/lib/python3.7/site-packages/google/protobuf/pyext/_message.cpython-37m-x86_64-linux-gnu.so(_ZN6google8protobuf6python17CheckAndSetStringEP7_objectPNS0_7MessageEPKNS0_15FieldDescriptorEPKNS0_10Reflec
tionEbi+0x13a)[0x7ff41fc178da]
[computer:74848] [11] /usr/lib/python3.7/site-packages/google/protobuf/pyext/_message.cpython-37m-x86_64-linux-gnu.so(_ZN6google8protobuf6python8cmessage25InternalSetNonOneofScalarEPNS0_7MessageEPKNS0_15FieldDescriptorEP7_obj
ect+0xeb)[0x7ff41fc179eb]
[computer:74848] [12] /usr/lib/python3.7/site-packages/google/protobuf/pyext/_message.cpython-37m-x86_64-linux-gnu.so(_ZN6google8protobuf6python8cmessage13SetFieldValueEPNS1_8CMessageEPKNS0_15FieldDescriptorEP7_object+0x91)[0
x7ff41fc17de1]
[computer:74848] [13] /usr/lib/python3.7/site-packages/google/protobuf/pyext/_message.cpython-37m-x86_64-linux-gnu.so(_ZN6google8protobuf6python8cmessage14InitAttributesEPNS1_8CMessageEP7_objectS6_+0x229)[0x7ff41fc18aa9]
[computer:74848] [14] /usr/lib/libpython3.7m.so.1.0(_PyObject_FastCallKeywords+0x11c)[0x7ff4e9a0039c]
[computer:74848] [15] /usr/lib/libpython3.7m.so.1.0(_PyEval_EvalFrameDefault+0x5951)[0x7ff4e9a454b1]
[computer:74848] [16] /usr/lib/libpython3.7m.so.1.0(_PyEval_EvalCodeWithName+0x2f9)[0x7ff4e998cd09]
[computer:74848] [17] /usr/lib/libpython3.7m.so.1.0(_PyFunction_FastCallKeywords+0x2b2)[0x7ff4e99d3882]
[computer:74848] [18] /usr/lib/libpython3.7m.so.1.0(_PyEval_EvalFrameDefault+0x4d2)[0x7ff4e9a40032]
[computer:74848] [19] /usr/lib/libpython3.7m.so.1.0(_PyEval_EvalCodeWithName+0x2f9)[0x7ff4e998cd09]
[computer:74848] [20] /usr/lib/libpython3.7m.so.1.0(_PyFunction_FastCallKeywords+0x2b2)[0x7ff4e99d3882]
[computer:74848] [21] /usr/lib/libpython3.7m.so.1.0(_PyEval_EvalFrameDefault+0x4b8a)[0x7ff4e9a446ea]
[computer:74848] [22] /usr/lib/libpython3.7m.so.1.0(_PyEval_EvalCodeWithName+0x2f9)[0x7ff4e998cd09]
[computer:74848] [23] /usr/lib/libpython3.7m.so.1.0(_PyFunction_FastCallDict+0x2ec)[0x7ff4e998df8c]
[computer:74848] [24] /usr/lib/libpython3.7m.so.1.0(_PyObject_Call_Prepend+0x68)[0x7ff4e999d818]
[computer:74848] [25] /usr/lib/libpython3.7m.so.1.0(+0x16d0e3)[0x7ff4e99ec0e3]
[computer:74848] [26] /usr/lib/libpython3.7m.so.1.0(_PyObject_FastCallKeywords+0x11c)[0x7ff4e9a0039c]
[computer:74848] [27] /usr/lib/libpython3.7m.so.1.0(_PyEval_EvalFrameDefault+0x5951)[0x7ff4e9a454b1]
[computer:74848] [28] /usr/lib/libpython3.7m.so.1.0(_PyFunction_FastCallDict+0x11b)[0x7ff4e998ddbb]
[computer:74848] [29] /usr/lib/libpython3.7m.so.1.0(_PyObject_Call_Prepend+0x68)[0x7ff4e999d818]
[computer:74848] *** End of error message ***
Aborted (core dumped)

Generation of Visualization Graph

Looking for How to generate the different Graph as on the article to get idea of the RL work ?

Please if you can add How to and what file to use ?

[Suggestion] - Feed data from multiple exchanges

I believe if you feed the data from multiple exchanges into the model, it will learn to predict price movements better, based on patterns it has found in how the price change affects the other exchanges.

QuantConnect Backtest, Paper & Live Trading Results

Hey Adam,

great work and well written article. Thank you for sharing.

Would you consider implementing your system on QauntConnect?

https://www.quantconnect.com/

You may or may not open source code, but sharing the backtest and
paper trading results adds a ton of credibility to your work. All it takes is to implement a scheduled event handler.

https://www.quantconnect.com/docs/key-concepts/developing-in-the-ide

My last algo still runs in paper trading, but the insights gained from how it performs under real market conditions were critical to developing the next generation.

Cheers and keep up the amazing work.

unable to render a correct graph

Hi!

After changing the sorting in optimize, train and test (df['Date'] = pd.to_datetime(df['Date'], format='%Y-%m-%d %I-%p')), I'm unable to render a correct graph.

In BitcoinTradingGraph.py I've changed the lines to

def __init__(self, df):
    self.df = df
    # self.df['Time'] = self.df['Date'].apply(
    #     lambda x: datetime.strptime(x, '%Y-%m-%d %I-%p'))
    self.df['Time'] = self.df['Date']
    self.df = self.df.sort_values('Time')

I've compared the dataframe with the original code and the new one and I can't find any difference. Nonetheless, I'm unable to render a correct graph.

Am I missing something here?

Absolute diff or percentege diff for reward calculation

In the reward calculation, the returns used for ratio calculation is absolute diff.
Isn't the 'returns' used should be percentage diff ?

    def _reward(self):
        length = min(self.current_step, self.forecast_len)
        returns = np.diff(self.net_worths[-length:])

Create DataLoader class to incorporate multiple data sets

The Alpha Vantage web service provides real-time and historical equity, index, and crypto data.

The service is free, requiring only a free, email registration. The service provides daily, weekly, and monthly history for both domestic and international markets, with up to 20 years of history.

For daily data, adjusted close prices are available to account for dividends and splits so no more manual adjustment anymore:-)

The service can also provide real-time price bars at a resolution of 1 minute or higher, for up to 10 recent days.

https://www.alphavantage.co/

There is a convienent Python wrapper.
https://github.com/RomelTorres/alpha_vantage

I can offer help, if contributions are welcome

tensorflow.python.framework.errors_impl.NotFoundError: Failed to create a directory: ./tensorboard\PPO2_1; No such file or directory

Hi. I am trying to get the whole thing to work. It's my first time using Python and Tensorflow so I apologize if this is a rookie mistake :) .
Issue is when I run python ./optimize.py I get the following error for each trial.

[W 2019-06-11 18:55:51,636] Setting status of trial#13 as TrialState.FAIL because of the following error: NotFoundError()
Traceback (most recent call last):
  File "C:\Users\Projects\AppData\Roaming\Python\Python37\site-packages\optuna\study.py", line 399, in _run_trial
    result = func(trial)
  File "./optimize.py", line 88, in optimize_agent
    model.learn(evaluation_interval)
  File "C:\Users\Projects\AppData\Roaming\Python\Python37\site-packages\stable_baselines\ppo2\ppo2.py", line 273, in learn
    with SetVerbosity(self.verbose), TensorboardWriter(self.graph, self.tensorboard_log, tb_log_name, new_tb_log) \
  File "C:\Users\Projects\AppData\Roaming\Python\Python37\site-packages\stable_baselines\common\base_class.py", line 693, in __enter__
    self.writer = tf.summary.FileWriter(save_path, graph=self.graph)
  File "C:\Users\Projects\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\summary\writer\writer.py", line 367, in __init__
    filename_suffix)
  File "C:\Users\Projects\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\summary\writer\event_file_writer.py", line 67, in __init__
    gfile.MakeDirs(self._logdir)
  File "C:\Users\Projects\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\lib\io\file_io.py", line 442, in recursive_create_dir
    recursive_create_dir_v2(dirname)
  File "C:\Users\Projects\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\lib\io\file_io.py", line 458, in recursive_create_dir_v2
    pywrap_tensorflow.RecursivelyCreateDir(compat.as_bytes(path), status)
  File "C:\Users\Projects\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\framework\errors_impl.py", line 528, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: Failed to create a directory: ./tensorboard\PPO2_1; No such file or directory```

No agents directory will cause training to fail

https://github.com/notadamking/Bitcoin-Trader-RL/blob/5407d309526a3e3fae4f353d249aceec85b6b019/train.py#L75

FileNotFound due to AGENTS parent directory missing.
Solution is to import os and check for the directory before saving. This solved the issues for me.

import os if not os.path.exists('./agents'): os.mkdir('./agents')

Pre-trained Sortino and Profit models

Would it be possible to get the model that was referenced in the article making 850% profit?
Thanks

Making training/testing deterministic

Do you guys know how to make training / testing deterministic, e.g. train / test with reproducible results? There is a seed parameter in learn method, however, I don't think it works properly. Even if I set it to constant, the trained model comes out different every time. For testing, predict method has deterministic parameter, that seems to work as intended.

State missing when testing recurrent policy

Hello,

It seems that you are not feeding the state to the LSTM policy when testing it, this can be quite problematic.
Please see documentation for an example ;)

Is your reward formula correct?

In the step method of the environment, you calculate reward as:
reward = self.net_worth

I think it should be:
reward = self.net_worth - self.previous_net_worth

It is because the reward is for the current step and the net_worth is for all existing steps.

Running optymize.py - error

Hello. When I try to run optymize.py I got this error:

File "./optimize.py", line 16, in
from stable_baselines.common.policies import MlpLnLstmPolicy
File "C:\Users\admin\Desktop\envi\my_env\lib\site-packages\stable_baselines_init_.py", line 4, in
from stable_baselines.ddpg import DDPG
File "C:\Users\admin\Desktop\envi\my_env\lib\site-packages\stable_baselines\ddpg_init_.py", line 1, in
from stable_baselines.ddpg.ddpg import DDPG
File "C:\Users\admin\Desktop\envi\my_env\lib\site-packages\stable_baselines\ddpg\ddpg.py", line 12, in
from mpi4py import MPI
ImportError: DLL load failed: Nie można odnaleźć określonego modułu.

Trying now on Bitmex Live Market

Hello @notadamking Im starting my test now on live market and want to ask you if the first trade is done from Bitcoin to USD or Inverse ?

As you know the Bitmex have the Buy and Sell possibility.

I only need to know if my start balance is in USD or Bitcoin ?

Invalid Date sorting causes look-ahead bias in trading strategy

@notadamking There may be a potential issue with the code. In test.py when you load csv file and sort by Date, the order of time stamp becomes invalid due to time stamp format in coinbase_hourly.csv.
For example after sort by Date:
16 2019-05-16 09-AM BTCUSD 7978.68 ... 7974.62 624.01 5005531.36
4 2019-05-16 09-PM BTCUSD 7668.02 ... 7915.52 1285.55 10056697.28
15 2019-05-16 10-AM BTCUSD 7974.62 ... 7849.61 2038.99 15997694.17
3 2019-05-16 10-PM BTCUSD 7915.52 ... 7822.55 985.91 7790640.90
14 2019-05-16 11-AM BTCUSD 7849.61 ... 7862.64 759.09 5985568.43
2 2019-05-16 11-PM BTCUSD 7822.55 ... 7878.96 780.10 6131121.66
25 2019-05-16 12-AM BTCUSD 8203.32 ... 8339.96 2080.89 17253110.58
13 2019-05-16 12-PM BTCUSD 7862.64 ... 7793.87 992.68 7788906.08

You see the problem? AM/PM and so on

Issue run optimize

Hello! I installed all requirements on my Ubuntu machine, got no error message. But cannot run optimize.

Following message:

`(Bitcoin-Trader-RL) (base) root@vmanager6003:~/Bitcoin-Trader-RL# python ./optimize.py
Traceback (most recent call last):
File "/root/Bitcoin-Trader-RL/Bitcoin-Trader-RL/lib/python3.7/site-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in
from tensorflow.python.pywrap_tensorflow_internal import *
File "/root/Bitcoin-Trader-RL/Bitcoin-Trader-RL/lib/python3.7/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in
_pywrap_tensorflow_internal = swig_import_helper()
File "/root/Bitcoin-Trader-RL/Bitcoin-Trader-RL/lib/python3.7/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
_mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
File "/root/Bitcoin-Trader-RL/Bitcoin-Trader-RL/lib/python3.7/imp.py", line 242, in load_module
return load_dynamic(name, filename, file)
File "/root/Bitcoin-Trader-RL/Bitcoin-Trader-RL/lib/python3.7/imp.py", line 342, in load_dynamic
return _load(spec)
ImportError: libcublas.so.10.0: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "./optimize.py", line 16, in
from stable_baselines.common.policies import MlpLnLstmPolicy
File "/root/Bitcoin-Trader-RL/Bitcoin-Trader-RL/lib/python3.7/site-packages/stable_baselines/init.py", line 1, in
from stable_baselines.a2c import A2C
File "/root/Bitcoin-Trader-RL/Bitcoin-Trader-RL/lib/python3.7/site-packages/stable_baselines/a2c/init.py", line 1, in
from stable_baselines.a2c.a2c import A2C
File "/root/Bitcoin-Trader-RL/Bitcoin-Trader-RL/lib/python3.7/site-packages/stable_baselines/a2c/a2c.py", line 6, in
import tensorflow as tf
File "/root/Bitcoin-Trader-RL/Bitcoin-Trader-RL/lib/python3.7/site-packages/tensorflow/init.py", line 24, in
from tensorflow.python import pywrap_tensorflow # pylint: disable=unused-import
File "/root/Bitcoin-Trader-RL/Bitcoin-Trader-RL/lib/python3.7/site-packages/tensorflow/python/init.py", line 49, in
from tensorflow.python import pywrap_tensorflow
File "/root/Bitcoin-Trader-RL/Bitcoin-Trader-RL/lib/python3.7/site-packages/tensorflow/python/pywrap_tensorflow.py", line 74, in
raise ImportError(msg)
ImportError: Traceback (most recent call last):
File "/root/Bitcoin-Trader-RL/Bitcoin-Trader-RL/lib/python3.7/site-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in
from tensorflow.python.pywrap_tensorflow_internal import *
File "/root/Bitcoin-Trader-RL/Bitcoin-Trader-RL/lib/python3.7/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in
_pywrap_tensorflow_internal = swig_import_helper()
File "/root/Bitcoin-Trader-RL/Bitcoin-Trader-RL/lib/python3.7/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
_mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
File "/root/Bitcoin-Trader-RL/Bitcoin-Trader-RL/lib/python3.7/imp.py", line 242, in load_module
return load_dynamic(name, filename, file)
File "/root/Bitcoin-Trader-RL/Bitcoin-Trader-RL/lib/python3.7/imp.py", line 342, in load_dynamic
return _load(spec)
ImportError: libcublas.so.10.0: cannot open shared object file: No such file or directory

Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/errors

for some common reasons and solutions. Include the entire stack trace
above this error message when asking for help.
`

Debugging the sortino optimizer, sortino rate not always calculated.

I am trying to get a grasp on how the optimizer works and which values are given/calculated etc.
I Added some debugging, and see that the sortino rate is not always calculated.
This happens at least when the first couple of steps are non-trades. See log below.

I also have a question about the rate:
the code specifies that only finite rates are measured. That is NaN and "infinite".
NaN i assume the rate cannot be calculated. But infinite is also set to zero. Is this correct?
What does an calculated "infinite" sortino score mean?

`
legend:
--- = sale

+++ = buy

b = balance

nw = net worth

CS = calculating Sortino Rate

reward = sortino rate, total_reward, (length = min(self.current_step, self.forecast_len)
`

DEBUG:main:Trainng length
INFO:main: Reward strategy : sortino
INFO:main: Input data file : data/small_corrected_hourly.csv
INFO:main: Parames db file : sqlite:///params.db
INFO:main: n_jobs = 1
INFO:main: n_trials = 30
INFO:main: n_test_episodes = 3
INFO:main: n_evaluations = 4
INFO:main: Total number of records : 2612
INFO:main: Number of training recs : 2612
INFO:main: Num. of evaluation recs : 2089
INFO:main:forecast_length = 2
INFO:main:convidence_intv = 0.9407857951645567
INFO:main:n_steps = 82
INFO:main:gamma = 0.9910742592048303
INFO:main:learning_rate = 0.0001463509003510531
INFO:main:ent_coef = 8.4767762718177e-08
INFO:main:cliprange = 0.9950675592874643
INFO:main:noptepochs = 1
INFO:main:lam = 0.9304853592231145

DEBUG:env.BitcoinTradingEnv: b = 10000.00 nw = 10000.00
DEBUG:env.BitcoinTradingEnv:

DEBUG:env.BitcoinTradingEnv:--- $$$ 0.00
DEBUG:env.BitcoinTradingEnv: b = 10000.00 nw = 10000.00
DEBUG:env.BitcoinTradingEnv:

DEBUG:env.BitcoinTradingEnv: b = 10000.00 nw = 10000.00
DEBUG:env.BitcoinTradingEnv:

DEBUG:env.BitcoinTradingEnv:--- $$$ 0.00
DEBUG:env.BitcoinTradingEnv: b = 10000.00 nw = 10000.00
DEBUG:env.BitcoinTradingEnv:

DEBUG:env.BitcoinTradingEnv: b = 10000.00 nw = 10000.00
DEBUG:env.BitcoinTradingEnv:

DEBUG:env.BitcoinTradingEnv:+++ $$$ 3333.33
DEBUG:env.BitcoinTradingEnv: b = 6666.67 nw = 6666.67
DEBUG:env.BitcoinTradingEnv:CS
DEBUG:env.BitcoinTradingEnv: reward = 0 0 (2)
DEBUG:env.BitcoinTradingEnv:

DEBUG:env.BitcoinTradingEnv:+++ $$$ 2222.22
DEBUG:env.BitcoinTradingEnv: b = 4444.44 nw = 4444.44
DEBUG:env.BitcoinTradingEnv:CS
DEBUG:env.BitcoinTradingEnv: reward = 0 0 (2)
DEBUG:env.BitcoinTradingEnv:

DEBUG:env.BitcoinTradingEnv:--- $$$ 5496.85
DEBUG:env.BitcoinTradingEnv: b = 9941.29 nw = 9941.29
DEBUG:env.BitcoinTradingEnv:CS
DEBUG:env.BitcoinTradingEnv: reward = 0 0 (2)
DEBUG:env.BitcoinTradingEnv:

DEBUG:env.BitcoinTradingEnv:+++ $$$ 2485.32
DEBUG:env.BitcoinTradingEnv: b = 7455.97 nw = 7455.97
DEBUG:env.BitcoinTradingEnv:CS
DEBUG:env.BitcoinTradingEnv: reward = 0 0 (2)
DEBUG:env.BitcoinTradingEnv:

Understanding SARIMAX prediction in _next_observation

First, thanks for making your code open source and for the great medium article. This is really great stuff.

I'm just going through your code and trying to understand it. In your BitcoinTradingEnv.py script, under the _next_observation method, you have

past_df = self.stationary_df['Close'][:self.current_step + self.forecast_len + 1]
forecast_model = SARIMAX(past_df.values, enforce_stationarity=False)

If I understand this correctly, you are indexing all your close price data up to current_step + forecast_len + 1 and then training the SARIMAX model on this data. But doesn't that mean you're looking ahead (into the future) by forecast_len days? I would have thought that you can only use data that is available up to the current time step, because we don't know the data from the future. So something like

past_df = self.stationary_df['Close'][:self.current_step + 1]
forecast_model = SARIMAX(past_df.values, enforce_stationarity=False)

Apologies if I am missing something obvious here.

Allow user-defined prediction models to replace SARIMAX with LSTM, FB Prophet, etc.

Instead of using the SARIMAX model's predictions to be fed as a feature / observation into the agent at every time step, maybe using the predictions of a separately trained LSTM may perform better. I've found that LSTM's trained on stationary time series usually perform ok, in that their prediction "direction" (up or down) is generally correct. Also, the slope (change in log returns with respect to time) of the LSTM prediction can be used as an additional input feature as an indicator for the severity of the predicted change (will the price jump up or down suddenly?).

notadamking / rltrader Goto Github PK

rltrader's People

Contributors

Stargazers

Watchers

Forkers

rltrader's Issues

Recommend Projects

Recommend Topics

Recommend Org

Jobs