yacoubb / stock-trading-ml


A stock trading bot that uses machine learning to make price predictions.

License: GNU General Public License v3.0

Python 100.00%

deep-learning lstm machine-learning neural-network price-predictions stock-trading time-series

stock-trading-ml's Introduction

Stock Trading with Machine Learning

Overview

A stock trading bot that uses machine learning to make price predictions.

Requirements

Python 3.6+ (the scripts use f-strings)
alpha_vantage
pandas
numpy
scikit-learn (imported as sklearn)
keras
tensorflow
matplotlib

Documentation

Blog Post

Medium Article

Train your own model

Clone the repo
Pip install the requirements: pip install -r requirements.txt
Save the stock price history to a CSV file: python save_data_to_csv.py --help
Edit one of the model files to accept the symbol you want
Edit the model architecture
Edit the dataset preprocessing / history_points inside util.py
Train the model: python tech_ind_model.py or python basic_model.py
Try the trading algorithm on the newly saved model: python trading_algo.py

License

GPL-3.0

stock-trading-ml's Issues

Stock splits

Hi Yacoubb,
Just wanted to give you a heads up that your code doesn't appear to account for stock splits at all. You probably didn't encounter this since MSFT hasn't had a split in the last 500 days, but for other stocks this may be an issue.
You can actually see one such split in this graph you posted by looking at the huge price drop circa day 900:

It might be good to use alpha_vantage's "daily_adj" (which I'm aware you've included), check for any stock splits in the "split coefficient" column, and adjust all the OHLCV values accordingly.

cheers
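
The adjustment suggested above could be sketched roughly as follows. This is only an illustration under stated assumptions: rows are plain dicts ordered oldest first, the key names (including split_coefficient) are hypothetical stand-ins for the Alpha Vantage columns, and a coefficient of 2.0 marks the first day after a 2-for-1 split.

```python
def back_adjust_for_splits(rows):
    """Return copies of `rows` with prices and volume adjusted for later splits.

    `rows` must be ordered oldest first. Each row is a dict with the keys
    open, high, low, close, volume and split_coefficient (1.0 when no split
    took effect that day). All key names are assumptions for this sketch.
    """
    adjusted = []
    factor = 1.0  # product of the split coefficients of all later rows
    for row in reversed(rows):
        new_row = dict(row)
        for key in ("open", "high", "low", "close"):
            new_row[key] = row[key] / factor
        new_row["volume"] = row["volume"] * factor
        adjusted.append(new_row)
        # This row's split applies to every earlier row, not to itself.
        factor *= row["split_coefficient"]
    adjusted.reverse()
    return adjusted


# Made-up prices: the second day is the first day after a 2-for-1 split.
history = [
    {"open": 98.0, "high": 102.0, "low": 97.0, "close": 100.0,
     "volume": 1000, "split_coefficient": 1.0},
    {"open": 50.0, "high": 52.0, "low": 49.0, "close": 51.0,
     "volume": 2100, "split_coefficient": 2.0},
]
adjusted_history = back_adjust_for_splits(history)
print(adjusted_history[0]["close"])  # 50.0
```

After the adjustment the pre-split close of 100.0 becomes 50.0, so the artificial price drop disappears from the series.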

Tensorflow - set_random_seed to be replaced for Tensorflow v2.0

In TensorFlow 2.0, set_random_seed has been replaced by random.set_seed.

Hence the code should be updated to:

from tensorflow import random
random.set_seed(4)
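
If the scripts need to run on both TensorFlow 1.x and 2.x, one possible approach is a small helper that picks whichever seeding function the installed module exposes. The helper name set_global_seed is hypothetical, and the fake modules below exist only so the sketch runs without TensorFlow installed.

```python
from types import SimpleNamespace


def set_global_seed(tf_module, seed):
    """Seed TensorFlow using whichever API the given module exposes."""
    random_api = getattr(tf_module, "random", None)
    if random_api is not None and hasattr(random_api, "set_seed"):
        random_api.set_seed(seed)        # TensorFlow 2.x
        return "tf.random.set_seed"
    tf_module.set_random_seed(seed)      # TensorFlow 1.x
    return "tf.set_random_seed"


# Stand-ins for the two API generations so the sketch runs without TensorFlow.
calls = []
fake_tf2 = SimpleNamespace(random=SimpleNamespace(set_seed=calls.append))
fake_tf1 = SimpleNamespace(set_random_seed=calls.append)

used_v2 = set_global_seed(fake_tf2, 4)
used_v1 = set_global_seed(fake_tf1, 4)
```

In the real scripts the call would be set_global_seed(tensorflow, 4) after import tensorflow.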

Look-ahead cheating for current day stock values

When predicting tomorrow's open stock value in a real world situation, the only value you would have for today would be the stock's open price. Currently the model assumes knowledge of all open, high, low, close, volume information which likely gives it an unfair advantage.

New High in Training Data

Hi,

I see this repo hasn't been updated for several months, but I wanted to see if I can get some help from you with your program.

Prediction does not work when a new high occurs in the test data: the model generates values below the highest price in the training data.

Also wanted to check with you whether I can ask other questions on your program. I am not familiar with machine learning at all.

Regards,
Brian

Requirements.txt is out of date

Tensorflow and keras download the wrong version. I will open a PR to fix this issue.

save_data_to_csv.py issue

Hi,
I'm new to Python and can't seem to get this code working. This is the error:

usage: save_data_to_csv.py [-h] symbol {intraday,daily,daily_adj}
save_data_to_csv.py: error: the following arguments are required: symbol, time_window

It doesn't quite make sense why it's not working, because the code defines both, as can be seen here:

if time_window == 'intraday':
    data, meta_data = ts.get_intraday(
        symbol='MSFT', interval='1min', outputsize='full')
elif time_window == 'daily':
    data, meta_data = ts.get_daily(symbol, outputsize='full')
elif time_window == 'daily_adj':
    data, meta_data = ts.get_daily_adjusted(symbol, outputsize='full')

If anyone is able to help that would be great.

Thank you
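
The usage line in the error suggests that symbol and time_window are positional command-line arguments, so they have to be supplied when the script is started, for example python save_data_to_csv.py MSFT daily. A minimal sketch of a parser that would produce that usage line follows; it is reconstructed from the error message, not copied from the repository.

```python
import argparse


def build_parser():
    """Parser with two required positional arguments, as in the usage line."""
    parser = argparse.ArgumentParser(prog="save_data_to_csv.py")
    parser.add_argument("symbol", type=str, help="ticker symbol, e.g. MSFT")
    parser.add_argument(
        "time_window",
        type=str,
        choices=["intraday", "daily", "daily_adj"],
        help="which price series to download",
    )
    return parser


# Equivalent to running: python save_data_to_csv.py MSFT daily
args = build_parser().parse_args(["MSFT", "daily"])
print(args.symbol, args.time_window)
```

Running the script with no arguments leaves both positionals empty, which is what triggers "the following arguments are required".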

You are an idiot

I'm sorry to be harsh, but you are an idiot. Your method for predicting stock prices is wrong on so many levels. If it were this easy then everyone would be doing it. Do you think machine learning is magic and that you can predict price purely by looking at historical prices? To use an old adage, try getting in your car and driving using only the rear view mirror and see how you do.

I implore you to take down your blog post, get some real experience with ML before some moron believes you and wastes their money trying to implement this for real.

ValueError: Data cardinality is ambiguous:

I get this error when running trading_algo.py:

ValueError Traceback (most recent call last)
in ()
----> 1 model.predict([[ohlcv], [ind]])

3 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/data_adapter.py in __init__(self, x, y, sample_weights, sample_weight_modes, batch_size, epochs, steps, shuffle, **kwargs)
280 label, ", ".join(str(i.shape[0]) for i in nest.flatten(data)))
281 msg += "Please provide data which shares the same first dimension."
--> 282 raise ValueError(msg)
283 num_samples = num_samples.pop()
284

ValueError: Data cardinality is ambiguous:
x sizes: 50, 1
Please provide data which shares the same first dimension.

Can anyone advise what the issue is?
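
One likely reading of "x sizes: 50, 1" is that Keras sees the two inputs without a shared batch axis: 50 rows of price history versus 1 indicator value. A possible fix is to give each input an explicit leading batch dimension before calling predict. The shapes (50, 5) and (1,) below are assumptions inferred from the error message, and NumPy is required.

```python
import numpy as np

history_points = 50
ohlcv = np.zeros((history_points, 5))  # assumed: one window of OHLCV rows
ind = np.zeros((1,))                   # assumed: one technical-indicator value

# Add a leading batch axis so both inputs describe exactly one sample.
ohlcv_batch = np.expand_dims(ohlcv, axis=0)  # shape (1, 50, 5)
ind_batch = np.expand_dims(ind, axis=0)      # shape (1, 1)

assert ohlcv_batch.shape[0] == ind_batch.shape[0] == 1
# model.predict([ohlcv_batch, ind_batch])  # `model` is the trained Keras model
```

Both arrays now share the first dimension 1, which is what the error message asks for.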

NameError: name 'MSFT_daily' is not defined

I get this error when running basic_model.py:

File "basic_model.py", line 20, in
ohlvc_histories, _, next_day_open_values, unscaled_y, y_normaliser = MSFT_daily.csv
NameError: name 'MSFT_daily' is not defined

Can anyone advise what the issue is? I have already extracted the MSFT_daily.csv to the project folder.
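
Python reads the unquoted MSFT_daily.csv as "the attribute csv of a variable called MSFT_daily", which is why the NameError appears. The file name has to be passed as a quoted string to whatever loading function the script uses. The sketch below shows the idea with a hypothetical loader, load_price_rows, and a made-up sample file; it is not the project's own loader.

```python
import csv
import os
import tempfile


def load_price_rows(csv_path):
    """Hypothetical stand-in for the project's loader: read a CSV into dicts."""
    with open(csv_path, newline="") as handle:
        return list(csv.DictReader(handle))


# Write a tiny sample file (made-up values) so the sketch runs on its own.
sample_path = os.path.join(tempfile.mkdtemp(), "MSFT_daily.csv")
with open(sample_path, "w", newline="") as handle:
    handle.write("date,open,close\n2020-01-02,1.0,2.0\n")

rows = load_price_rows(sample_path)       # the path is a quoted string
# rows = load_price_rows(MSFT_daily.csv)  # unquoted: NameError, as reported
```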

next day closing instead next day open

Hi Yacoubb,
Your project seems very interesting. I updated the indexes to use the closes, and the results are totally different and much worse. I also included other trading indicators and fixed the MACD, and the result is always the same: opens give better results than closes. However, opens are predicted from the highs, lows, closes and volume of the very same day. Something is not right?
cheers

Jupyter notebook on Google Colaboratory?

I'd like to play with your project on Colab. I can't seem to import it. Any advice? Thank you

Dropping the IPO day data

When you intend to drop the IPO day data with data = data.drop(0, axis=0), you're actually dropping the most recent data, not the oldest.

It should instead be data = data.drop(data.shape[0] - 1, axis=0)
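
A position-independent alternative is to pick the oldest row by its date instead of by its index, so the result does not depend on whether the file is ordered newest first or oldest first. The following is a plain-Python sketch; the date key and the ISO date format are assumptions.

```python
def drop_oldest_row(rows, date_key="date"):
    """Return `rows` without the entry that has the earliest ISO date."""
    oldest_index = min(range(len(rows)), key=lambda i: rows[i][date_key])
    return rows[:oldest_index] + rows[oldest_index + 1:]


# Newest-first ordering, as described in the issue (made-up values).
rows = [
    {"date": "2020-01-03", "close": 3.0},
    {"date": "2020-01-02", "close": 2.0},
    {"date": "2020-01-01", "close": 1.0},  # treated as the IPO day here
]
remaining = drop_oldest_row(rows)
```

ISO dates (YYYY-MM-DD) sort correctly as strings, which is what makes the min call safe here.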

Smile.exe

import com.atlassian.jira.component.ComponentAccessor
import com.atlassian.jira.issue.MutableIssue

def project = ComponentAccessor.projectManager.getProjectObjByKey("SCRUM")
def user = ComponentAccessor.jiraAuthenticationContext.loggedInUser

MutableIssue issue = ComponentAccessor.issueFactory.issue
issue.projectObject = project
issue.summary = "Demo issue created from the script"
issue.issueTypeId = 10102
issue.assignee = user
ComponentAccessor.issueManager.createIssueObject(user, issue)

Tensorflow 'set_random_seed' issue : SOLVED

Make the following changes in "basic_model.py" and "tech_ind_model.py" :

FROM

from tensorflow import set_random_seed
set_random_seed(4)

TO

import tensorflow
tensorflow.random.set_seed(4)

error

(venv) C:\Users\sander\PycharmProjects\autotrader>python save_data_to_csv.py
usage: save_data_to_csv.py [-h] symbol {intraday,daily,daily_adj}
save_data_to_csv.py: error: the following arguments are required: symbol, time_window

How to increase "prediction" timeframe?

Hi yacoub, how do I increase the number of days (the timeframe) in advance that the model prints out at the end?

Problem with seed of latest version of tensorflow

When seeding the random number generator, you need to do it like this (set_seed requires a seed value; 4 is the value used in the repo):

import tensorflow as tf
tf.random.set_seed(4)

Reference:
https://stackoverflow.com/questions/58638701/importerror-cannot-import-name-set-random-seed-from-tensorflow-c-users-po

tensorflow changes in version 2

Hi yacoubb,

I found an issue in your code in these files:
basic_model.py
tech_ind_model.py

The following no longer works.
File: basic_model.py
from tensorflow import set_random_seed
set_random_seed(4)

should be replaced with:
import tensorflow
tensorflow.random.set_seed(4)

File: tech_ind_model.py
import tensorflow as tf
from tensorflow import set_random_seed
set_random_seed(4)

should be replaced with:
import tensorflow
tensorflow.random.set_seed(4)

creds.json missing

Hi yacoubb,

thanks a lot for uploading this :-)

As I am a beginner in programming, I struggled with the creds.json file. It took some research and thinking to work out what the error was. Suggestion: could you expand the README.md to explain this?

Thanks C.
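
The issue does not describe the contents of creds.json, so the following is only a guess at how such a file might be read: a JSON object holding the Alpha Vantage API key under an assumed field name, av_api_key. The helper load_api_key is hypothetical and reports a clearer error when the file is missing.

```python
import json
import os
import tempfile


def load_api_key(path="creds.json", key_name="av_api_key"):
    """Read an API key from a JSON file; `key_name` is an assumed field name."""
    if not os.path.exists(path):
        raise FileNotFoundError(
            "{} not found: create it with your Alpha Vantage API key".format(path)
        )
    with open(path) as handle:
        credentials = json.load(handle)
    return credentials[key_name]


# Demonstration with a throwaway file and a placeholder key.
demo_path = os.path.join(tempfile.mkdtemp(), "creds.json")
with open(demo_path, "w") as handle:
    json.dump({"av_api_key": "YOUR_KEY_HERE"}, handle)

api_key = load_api_key(demo_path)
```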

Exposing future prices in training

Dear Yacoubb,

Your predictions seem too good to be true. I believe you are exposing future prices in training. When you turn on the shuffle option in the fit, it seems it first shuffles and then splits.

model.fit(x=ohlcv_train, y=y_train, batch_size=32, epochs=50, shuffle=True, validation_split=0.1)
So the 10% validation split is not necessarily the last 10% of the data, but may come from somewhere in the middle.
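
For reference, the Keras documentation describes validation_split as taking the last samples of x and y before any shuffling. Either way, an explicit chronological split removes the ambiguity, because the validation set is then passed in directly through validation_data. This is a minimal sketch with a hypothetical helper, chronological_split, which assumes the samples are ordered oldest first.

```python
def chronological_split(samples, targets, validation_fraction=0.1):
    """Split ordered samples so the validation set is strictly the newest part."""
    if len(samples) != len(targets):
        raise ValueError("samples and targets must have the same length")
    n_validation = int(len(samples) * validation_fraction)
    split_at = len(samples) - n_validation
    train = (samples[:split_at], targets[:split_at])
    validation = (samples[split_at:], targets[split_at:])
    return train, validation


samples = list(range(10))           # stand-in for windows ordered oldest first
targets = [s + 1 for s in samples]
(train_x, train_y), (val_x, val_y) = chronological_split(samples, targets)
# model.fit(train_x, train_y, validation_data=(val_x, val_y), shuffle=True, ...)
```

Because the windows overlap, the first validation windows still share rows with the last training windows; leaving a gap of history_points samples between the two sets would avoid that as well.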

No predicted in graph

Thanks a lot, Yacoub. It's an interesting project.

I want to get it running in a Jupyter Notebook but can't make it work yet. I'm new here and still trying...

I ran your code under Anaconda and found 2 issues:

1/ set_random_seed() can't be imported. I made a minor update to fix it:
#from tensorflow import set_random_seed
#set_random_seed(4)
import tensorflow
tensorflow.random.set_seed(4)

2/ I ran it on my data, but the final graph does not show the predicted line. Here are some of the outputs. Does this mean something is wrong with my data?
Using TensorFlow backend.
(4535, 50, 5)
(504, 50, 5)
2020-05-17 00:47:18.540599: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
Train on 4081 samples, validate on 454 samples
Epoch 1/50
4081/4081 [==============================] - 3s 626us/step - loss: nan - val_loss: nan
Epoch 2/50
4081/4081 [==============================] - 2s 537us/step - loss: nan - val_loss: nan
Epoch 3/50
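
A loss of nan from the very first epoch is often, though not always, a sign that the input data contains missing or non-finite values, which would also explain an empty prediction line. One way to check is to count such values per column before training. The sketch below uses plain dicts with made-up numbers; the column names are assumptions.

```python
import math


def count_non_finite(rows):
    """Count NaN or infinite entries per column in a list of numeric dicts."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            if not math.isfinite(value):
                counts[column] = counts.get(column, 0) + 1
    return counts


rows = [
    {"open": 10.0, "close": 10.5, "volume": 1200.0},
    {"open": float("nan"), "close": 10.7, "volume": 0.0},
]
bad = count_non_finite(rows)
```

An empty result means every value is finite; otherwise the offending rows can be dropped or filled before the data is normalised.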

Prediction timeframe extend

This is an amazing project. Thanks for the input. I have made several improvements, which I will commit in a few days.

Meanwhile, I was wondering how to increase the prediction timeframe. I trained the model with historical_inputs = 5. When I predict, the line ohlcv_histories_normalised = np.array([data_normalised[i:i + history_points].copy() for i in range(len(data_normalised) - history_points)]) reduces the prediction timeframe by historical_inputs = 5 days.

For example, if my dataframe runs from Feb 5 to Feb 20, it only gives predictions for Feb 5 until Feb 15. How do I get predictions for the most recent days, i.e. Feb 15 to Feb 20?
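
The windowing line quoted above builds len(data) - history_points windows, each paired with the value that follows it, so the final history_points rows never appear as a window of their own. A sketch of the same windowing that also returns that newest window, which could be fed to the model to predict the day after the data ends, follows. It uses plain lists and a hypothetical helper name, make_windows.

```python
def make_windows(values, history_points):
    """Build (window, next value) training pairs plus the newest window."""
    n_pairs = len(values) - history_points
    windows = [values[i:i + history_points] for i in range(n_pairs)]
    targets = [values[i + history_points] for i in range(n_pairs)]
    # The last `history_points` values have no known next value yet, so they
    # form the input for a prediction about the day after the data ends.
    latest_window = values[len(values) - history_points:]
    return windows, targets, latest_window


prices = list(range(10))  # stand-in for ten days of normalised prices
windows, targets, latest_window = make_windows(prices, history_points=5)
```

With ten days and history_points = 5 this gives five training pairs, and latest_window holds days 5 to 9 as input for a prediction of day 10.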

A suggestion for the save_data_to_csv.py

I have made a small change that might help anyone who wants to pull more than one stock at a time: simply add a 'Stock symbols.txt' file with one stock symbol on each line. The script also creates a {date}.txt file that helps track the number of API calls per day. I also added a time.sleep(13) to make sure the API is not abused. There is a limit of 500 calls per day and 5 API requests per minute. See the attached files for the updates:
08-01-2020.txt
[api_data_to_csv.txt](https://github.com/yacoubb/stock-trading-ml/files/5011748/api_data_to_csv.txt)
Stock symbols.txt

Rename api_data_to_csv.txt to save_data_to_csv.py.
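
The attached files are not reproduced here, so the following is only a sketch of the idea described above: read one symbol per line and pause 13 seconds between requests to stay under 5 requests per minute. The names fetch_all and read_symbols are hypothetical, and the fetch function is a stand-in for the real API call.

```python
import time


def fetch_all(symbols, fetch, delay_seconds=13.0, sleep=time.sleep):
    """Call `fetch(symbol)` for each symbol, pausing between requests."""
    results = {}
    for index, symbol in enumerate(symbols):
        if index > 0:
            sleep(delay_seconds)  # 13 s keeps the rate under 5 requests/minute
        results[symbol] = fetch(symbol)
    return results


def read_symbols(text):
    """Parse the contents of a symbols file: one ticker per non-empty line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


# Dry run with stand-ins: no network access and no real waiting.
pauses = []
symbols = read_symbols("MSFT\nAAPL\n\nGOOG\n")
results = fetch_all(symbols, fetch=lambda s: s.lower(), sleep=pauses.append)
```

Passing sleep as a parameter keeps the pacing logic testable without actually waiting.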

ValueError

I am getting the following error when I run "python trading_algo.py"

2021-02-13 10:43:34.165315: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1
2021-02-13 10:43:36.611898: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-02-13 10:43:36.613079: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2021-02-13 10:43:36.635749: E tensorflow/stream_executor/cuda/cuda_driver.cc:328] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2021-02-13 10:43:36.635814: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (821e3f25af0b): /proc/driver/nvidia/version does not exist
2021-02-13 10:43:36.636356: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-02-13 10:43:37.423855: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:116] None of the MLIR optimization passes are enabled (registered 2)
2021-02-13 10:43:37.424339: I tensorflow/core/platform/profile_utils/cpu_utils.cc:112] CPU Frequency: 2299995000 Hz
Traceback (most recent call last):
  File "trading_algo.py", line 37, in
    predicted_price_tomorrow = np.squeeze(y_normaliser.inverse_transform(model.predict([[ohlcv], [ind]])))
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py", line 1608, in predict
    steps_per_execution=self._steps_per_execution)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/data_adapter.py", line 1112, in __init__
    model=model)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/data_adapter.py", line 274, in __init__
    _check_data_cardinality(inputs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/data_adapter.py", line 1529, in _check_data_cardinality
    raise ValueError(msg)
ValueError: Data cardinality is ambiguous:
x sizes: 50, 1
Make sure all arrays contain the same number of samples.

Stock Trading Code

Stuck on step 3

Hi! I'm really new to Python and stuck on what to do after step 3.
I installed all the requirements, but I don't need AlphaVantage as I already have historical data for the stock. Could you help me with the error below, which I get after step 3, and tell me how to run the project with only a pre-existing CSV file, without AlphaVantage?

 File "save_data_to_csv.py", line 22
    data.to_csv(f'./{symbol}_{time_window}.csv')
                                              ^
SyntaxError: invalid syntax
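
A SyntaxError that points at the end of an f-string usually means the script is being run with a Python version older than 3.6, where f-strings do not exist. Upgrading Python is the simplest fix. Alternatively, the same path can be built with str.format, as in this sketch; the helper name build_csv_path is hypothetical.

```python
def build_csv_path(symbol, time_window):
    """Same result as f'./{symbol}_{time_window}.csv', without an f-string."""
    return "./{}_{}.csv".format(symbol, time_window)


csv_path = build_csv_path("MSFT", "daily")
print(csv_path)  # ./MSFT_daily.csv
```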

prediction timeframe

Hello there!

I am quite a newbie to ML and wanted to ask whether there is any way to adjust the prediction timeframe to more than one day in advance. Prediction accuracy is not important in this case; it is just for some additional research in combination with other tools.

Thanks in advance,
Andreas
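
One common way to look more than one step ahead with a one-step model is iterated (recursive) forecasting: each prediction is appended to the input window and the model is called again. The sketch below shows the mechanism on a single series with a toy predictor. It is not the repository's method, and the names forecast_recursively and mean_predictor are hypothetical.

```python
def forecast_recursively(history, predict_next, history_points, steps):
    """Predict `steps` values ahead by feeding each prediction back in."""
    window = list(history[-history_points:])
    forecasts = []
    for _ in range(steps):
        next_value = predict_next(window)
        forecasts.append(next_value)
        window = window[1:] + [next_value]  # slide the window forward one step
    return forecasts


def mean_predictor(window):
    """Toy stand-in for a trained model: predicts the mean of the window."""
    return sum(window) / len(window)


forecasts = forecast_recursively([1.0, 2.0, 3.0], mean_predictor,
                                 history_points=2, steps=2)
print(forecasts)  # [2.5, 2.75]
```

Errors compound with every step. Since the models here take full OHLCV rows as input but predict only the open, applying this would need either a model that predicts every input feature or a model trained directly on a target several days ahead.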