
yacoubb / stock-trading-ml


A stock trading bot that uses machine learning to make price predictions.

License: GNU General Public License v3.0

Python 100.00%
deep-learning lstm machine-learning neural-network price-predictions stock-trading time-series

stock-trading-ml's Introduction

Stock Trading with Machine Learning

Overview

A stock trading bot that uses machine learning to make price predictions.

Requirements

  • Python 3.6+ (save_data_to_csv.py uses f-strings)
  • alpha_vantage
  • pandas
  • numpy
  • sklearn
  • keras
  • tensorflow
  • matplotlib

Documentation

Blog Post

Medium Article

Train your own model

  1. Clone the repo
  2. Pip install the requirements: pip install -r requirements.txt
  3. Save the stock price history to a csv file: python save_data_to_csv.py --help
  4. Edit one of the model files to accept the symbol you want
  5. Edit model architecture
  6. Edit dataset preprocessing / history_points inside util.py
  7. Train the model: python tech_ind_model.py or python basic_model.py
  8. Try the trading algorithm on the newly saved model: python trading_algo.py

License

GPL-3.0

stock-trading-ml's People

Contributors

yacoubb


stock-trading-ml's Issues

Stock splits

Hi Yacoubb,
Just wanted to give you a heads up that your code doesn't appear to account for stock splits at all. You probably didn't encounter this since MSFT hasn't had a split in the last 500 days, but for other stocks this may be an issue.
You can actually see one such split in this graph you posted by looking at the huge price drop circa day 900:
image

It might be good to use alpha_vantage's "daily_adj" (which I'm aware you've included), check for any stock splits in the "split coefficient" column, and adjust all the OHLCV values accordingly.

cheers
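
The adjustment suggested above could look roughly like the following. This is a minimal sketch, not the repo's code: it assumes rows are ordered oldest to newest, that each row carries a hypothetical "split" field holding that day's split coefficient (1.0 when there is no split), and that prices on the split day are already quoted post-split.

```python
def adjust_for_splits(rows):
    """Back-adjust OHLCV rows so prices before a split are comparable to prices after it.

    rows: list of dicts ordered oldest to newest, with keys
    'open', 'high', 'low', 'close', 'volume' and 'split' (hypothetical field names).
    """
    adjusted = []
    factor = 1.0  # cumulative split factor of all splits after the current row
    for row in reversed(rows):  # walk from newest to oldest
        new_row = dict(row)
        for key in ("open", "high", "low", "close"):
            new_row[key] = row[key] / factor
        new_row["volume"] = row["volume"] * factor
        adjusted.append(new_row)
        # A split on this day affects every earlier row, not this one.
        factor *= row["split"]
    adjusted.reverse()
    return adjusted


# Two days around a 2-for-1 split: the raw close halves overnight.
raw = [
    {"open": 100.0, "high": 101.0, "low": 99.0, "close": 100.0, "volume": 1000, "split": 1.0},
    {"open": 50.0, "high": 51.0, "low": 49.0, "close": 50.0, "volume": 2000, "split": 2.0},
]
clean = adjust_for_splits(raw)
```

After the adjustment the artificial price drop disappears: both closes sit at 50.0, and the earlier day's volume is scaled up to match.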

Look-ahead cheating for current day stock values

When predicting tomorrow's open stock value in a real-world situation, the only value you would have for today would be the stock's open price. Currently the model assumes knowledge of all open, high, low, close, and volume information, which likely gives it an unfair advantage.

New High in Training Data

Hi,

I see this repo wasn't updated for several months but wanted to see if I can get some help from you on your program.

Prediction does not work when a new high occurs in the test data: the model generates values below the highest price in the training data.

Also wanted to check with you whether I can ask other questions on your program. I am not familiar with machine learning at all.

Regards,
Brian

save_data_to_csv.py issue

Hi,
I'm new to Python and can't seem to get this code working. This is the error:

usage: save_data_to_csv.py [-h] symbol {intraday,daily,daily_adj}
save_data_to_csv.py: error: the following arguments are required: symbol, time_window

It doesn't quite make sense why it's not working, because the code defines both, as can be seen here:

if time_window == 'intraday':
    data, meta_data = ts.get_intraday(
        symbol='MSFT', interval='1min', outputsize='full')
elif time_window == 'daily':
    data, meta_data = ts.get_daily(symbol, outputsize='full')
elif time_window == 'daily_adj':
    data, meta_data = ts.get_daily_adjusted(symbol, outputsize='full')

If anyone is able to help that would be great.

Thank you
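
The usage line in the error suggests that symbol and time_window are positional command-line arguments, so they have to be passed when the script is run rather than defined in the code. A minimal argparse sketch that reproduces the same usage line is shown here; the help text is illustrative and the parser is an assumption about how the script is set up, not a copy of it.

```python
import argparse


def build_parser():
    """Parser with two required positional arguments, mirroring the usage line in the error."""
    parser = argparse.ArgumentParser(prog="save_data_to_csv.py")
    parser.add_argument("symbol", type=str, help="ticker symbol, for example MSFT")
    parser.add_argument(
        "time_window",
        type=str,
        choices=["intraday", "daily", "daily_adj"],
        help="which price history to download",
    )
    return parser


# Equivalent to running: python save_data_to_csv.py MSFT daily
args = build_parser().parse_args(["MSFT", "daily"])
```

Running the script with no arguments triggers the "the following arguments are required" error; running it as python save_data_to_csv.py MSFT daily supplies both values.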

You are an idiot

I'm sorry to be harsh, but you are an idiot. Your method for predicting stock prices is wrong on so many levels. If it were this easy then everyone would be doing it. Do you think machine learning is magic and that you can predict price purely by looking at historical prices? To use an old adage, try getting in your car and driving using only the rear view mirror and see how you do.

I implore you to take down your blog post, get some real experience with ML before some moron believes you and wastes their money trying to implement this for real.

ValueError: Data cardinality is ambiguous:

I get this error when running the trading_algo.py:

ValueError Traceback (most recent call last)
in ()
----> 1 model.predict([[ohlcv], [ind]])

3 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/data_adapter.py in __init__(self, x, y, sample_weights, sample_weight_modes, batch_size, epochs, steps, shuffle, **kwargs)
    280 label, ", ".join(str(i.shape[0]) for i in nest.flatten(data)))
    281 msg += "Please provide data which shares the same first dimension."
--> 282 raise ValueError(msg)
    283 num_samples = num_samples.pop()
    284
ValueError: Data cardinality is ambiguous:
x sizes: 50, 1
Please provide data which shares the same first dimension.

Can anyone advise what the issue is?
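
The message "x sizes: 50, 1" suggests that Keras read the first dimensions of the two inputs as 50 and 1, which would happen if the nested Python lists were flattened and the arrays had no batch dimension of their own. The sketch below shows one way to add an explicit batch dimension with NumPy. The shapes (50, 5) for ohlcv and (1,) for ind are assumptions inferred from the error message, not taken from the repo.

```python
import numpy as np

history_points = 50

# Assumed shapes for a single sample: 50 days of 5 OHLCV values, plus 1 indicator value.
ohlcv = np.zeros((history_points, 5))
ind = np.zeros((1,))

# Add a leading batch dimension of size 1 to each input.
ohlcv_batch = np.expand_dims(ohlcv, axis=0)  # shape (1, 50, 5)
ind_batch = np.expand_dims(ind, axis=0)      # shape (1, 1)

# Both inputs now agree on the number of samples.
same_sample_count = ohlcv_batch.shape[0] == ind_batch.shape[0]
```

Under these assumptions the call would become model.predict([ohlcv_batch, ind_batch]), so that both inputs share the same first dimension.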

NameError: name 'MSFT_daily' is not defined

I get this error when running the basic_model.py:

File "basic_model.py", line 20, in <module>
ohlvc_histories, _, next_day_open_values, unscaled_y, y_normaliser = MSFT_daily.csv
NameError: name 'MSFT_daily' is not defined

Can anyone advise what the issue is? I have already extracted the MSFT_daily.csv to the project folder.
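
The traceback shows the file name written without quotes, so Python treats MSFT_daily as a variable name and .csv as an attribute lookup. The sketch below reproduces the error and shows the quoted form. The load_dataset function is a hypothetical stand-in for whatever loader the model file calls, not the project's own function.

```python
def load_dataset(csv_path):
    """Hypothetical stand-in for the project's CSV loader; it only validates the argument."""
    if not isinstance(csv_path, str):
        raise TypeError("csv_path must be a string")
    return csv_path


try:
    # Unquoted: Python looks for a variable called MSFT_daily and fails.
    dataset = MSFT_daily.csv
except NameError as exc:
    error_message = str(exc)

# Quoted: the file name is an ordinary string passed to the loader.
dataset = load_dataset("MSFT_daily.csv")
```

Placing the csv file in the project folder is not enough on its own; the path still has to reach the loader as a string.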

Next-day close instead of next-day open

Hi Yacoubb,
Your project seems very interesting. I updated the indexes to use the closes, and the results are totally different and much worse. I also included other trading indicators and fixed the MACD, but the result is always the same: opens give better results than closes. However, opens are predicted based on the highs, lows, closes, and volume of the very same day. Is something not right?
cheers

Dropping the IPO day data

When you intend to drop the IPO day data with data = data.drop(0, axis=0), you're actually dropping the most recent data, not the oldest.

Should instead be data = data.drop(data.shape[0] - 1, axis=0)
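
The difference is easiest to see on a tiny example. In this sketch a plain list of rows stands in for the DataFrame, and it assumes the csv is ordered newest first, as the issue implies; the dates and prices are made up.

```python
# Rows ordered newest first, as assumed for the downloaded csv.
rows = [
    {"date": "2020-01-03", "close": 103.0},  # index 0: most recent day
    {"date": "2020-01-02", "close": 102.0},
    {"date": "2020-01-01", "close": 101.0},  # last index: oldest day (the "IPO day" here)
]

# Removing index 0 discards the most recent day.
without_first = rows[1:]

# Removing the last index discards the oldest day, which is the stated intent.
without_last = rows[:-1]

dates_without_first = [row["date"] for row in without_first]
dates_without_last = [row["date"] for row in without_last]
```

If the csv were instead ordered oldest first, dropping index 0 would be the right call, so it is worth checking the row order before choosing which end to trim.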

Smile.exe

import com.atlassian.jira.component.ComponentAccessor
import com.atlassian.jira.issue.MutableIssue

def project = ComponentAccessor.projectManager.getProjectObjByKey("SCRUM")
def user = ComponentAccessor.jiraAuthenticationContext.loggedInUser

MutableIssue issue = ComponentAccessor.issueFactory.issue
issue.projectObject = project
issue.summary = "Demo issue created from the script"
issue.issueTypeId = 10102
issue.assignee = user
ComponentAccessor.issueManager.createIssueObject(user, issue)

Tensorflow 'set_random_seed' issue : SOLVED

Make the following changes in "basic_model.py" and "tech_ind_model.py" :

FROM

from tensorflow import set_random_seed
set_random_seed(4)

TO

import tensorflow
tensorflow.random.set_seed(4)

error

(venv) C:\Users\sander\PycharmProjects\autotrader>python save_data_to_csv.py
usage: save_data_to_csv.py [-h] symbol {intraday,daily,daily_adj}
save_data_to_csv.py: error: the following arguments are required: symbol, time_window

tensorflow changes in version 2

Hi yacoubb,

found this issue in your code regarding the files:
basic_model.py
tech_ind_model.py

Does not work anymore like this.
File: basic_model.py
from tensorflow import set_random_seed
set_random_seed(4)

should be replaced with:
import tensorflow
tensorflow.random.set_seed(4)

File: tech_ind_model.py
import tensorflow as tf
from tensorflow import set_random_seed
set_random_seed(4)

should be replaced with:
import tensorflow
tensorflow.random.set_seed(4)

creds.json missing

Hi yacoubb,

thanks a lot for uploading this :-)

As I am a beginner in programming, I struggled with the creds.json file. It took some research and thinking to work out what the error was. Suggestion: could you expand the README to explain this?

Thanks C.

Exposing future prices in training

Dear Yacoubb,

Your predictions seem too good to be true. I believe you are exposing future prices in training. When you turn on the shuffle option in the fit, it seems it first shuffles and then splits.

model.fit(x=ohlcv_train, y=y_train, batch_size=32, epochs=50, shuffle=True, validation_split=0.1)
So the 10% validation split is not necessarily the last 10% of the data; it may contain samples from the middle of the series.
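
For reference, the Keras documentation for fit states that validation_split takes the validation samples from the end of the arrays before any shuffling. Splitting the data explicitly by time removes any doubt about which samples are held out. The helper below is a minimal sketch with hypothetical names, using a list of day indices in place of the real arrays.

```python
def chronological_split(samples, holdout_fraction=0.1):
    """Split time-ordered samples so the held-out part is strictly the most recent data."""
    if not 0.0 < holdout_fraction < 1.0:
        raise ValueError("holdout_fraction must be between 0 and 1")
    holdout_size = int(len(samples) * holdout_fraction)
    split_index = len(samples) - holdout_size
    return samples[:split_index], samples[split_index:]


# Day indices stand in for time-ordered training windows (oldest first).
days = list(range(100))
train_days, holdout_days = chronological_split(days, holdout_fraction=0.1)
```

The held-out slice can then be passed through validation_data instead of validation_split, and shuffling can stay enabled for the training portion without mixing in later days.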

No predicted line in graph

Thanks a lot, Yacoub. It's an interesting project.

I want to get it running in Jupyter Notebook but can't make it work yet. I'm new here, still trying...

Under Anaconda, I ran your code and found 2 issues:

1/ set_random_seed() can't be imported. I made a minor update to fix it:
#from tensorflow import set_random_seed
#set_random_seed(4)
import tensorflow
tensorflow.random.set_seed(4)

2/ I ran it on my data, but the final graph did not show the predicted line. Here are some of the outputs. Does it mean something is wrong with my data?
Using TensorFlow backend.
(4535, 50, 5)
(504, 50, 5)
2020-05-17 00:47:18.540599: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
Train on 4081 samples, validate on 454 samples
Epoch 1/50
4081/4081 [==============================] - 3s 626us/step - loss: nan - val_loss: nan
Epoch 2/50
4081/4081 [==============================] - 2s 537us/step - loss: nan - val_loss: nan
Epoch 3/50

Prediction timeframe extend

This is an amazing project. Thanks for the input. I have made several improvements which I will commit in few days.

Meanwhile, I was wondering how to extend the prediction timeframe. I trained the model with historical_inputs = 5. Now, when I predict, the line ohlcv_histories_normalised = np.array([data_normalised[i:i + history_points].copy() for i in range(len(data_normalised) - history_points)]) in the code shortens the prediction timeframe by historical_inputs = 5 days.

For example, if my dataframe runs from Feb 5 to Feb 20, it only gives predictions for Feb 5 until Feb 15. How do I get predictions for the most recent days, i.e. Feb 15 to Feb 20?
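
The shortening follows from how the windows are built: range(len(data) - history_points) produces one window per row that still has a following row to serve as the target, so the final window ending on the last row is never created. The sketch below shows this on a list of integers standing in for normalised rows, and builds the missing latest window separately. The variable names other than history_points are hypothetical.

```python
history_points = 3
data = list(range(10))  # stand-in for 10 days of normalised rows, oldest first

# Same windowing pattern as the quoted line: each window is followed by a known next day.
training_windows = [
    data[i:i + history_points] for i in range(len(data) - history_points)
]

# The most recent window has no next-day target yet, so it is left out above.
# It can be built on its own and used as the input for a prediction about the next, unseen day.
latest_window = data[-history_points:]
```

Here 10 rows give 7 windows, the last of which ends on day 8, while latest_window covers days 7 to 9 and is the input to use for forecasting the day after the data ends.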

A suggestion for the save_data_to_csv.py

I have made a small change that might help anyone who wants to pull more than one stock at a time: simply add a 'Stock symbols.txt' file with one stock symbol on each line. The script also creates a simple {date}.txt file that helps track the number of API calls per day. I also added a time.sleep(13) to make sure the API is not abused. There is a limit of 500 calls per day and 5 API requests per minute. See the attached files for the updates:
08-01-2020.txt
api_data_to_csv.txt (https://github.com/yacoubb/stock-trading-ml/files/5011748/api_data_to_csv.txt)
Stock symbols.txt

Rename api_data_to_csv.txt to save_data_to_csv.py.
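
The attached script is not reproduced here, but the idea it describes can be sketched as follows: read one symbol per line, then call the download function once per symbol with a pause in between. The fetch and sleep parameters are hypothetical injection points so the loop can be tried without touching the real API; 13 seconds between calls stays under 5 requests per minute.

```python
import time


def read_symbols(lines):
    """Return the non-empty, stripped symbols from an iterable of text lines."""
    return [line.strip() for line in lines if line.strip()]


def fetch_all(symbols, fetch, delay_seconds=13.0, sleep=time.sleep):
    """Call fetch(symbol) for each symbol, pausing between consecutive calls."""
    results = {}
    for index, symbol in enumerate(symbols):
        if index > 0:
            sleep(delay_seconds)  # wait before every call except the first
        results[symbol] = fetch(symbol)
    return results


# Demo with stand-ins: no network access and no real waiting.
recorded_pauses = []
symbols = read_symbols(["MSFT\n", "\n", "AAPL\n", "GOOG\n"])
results = fetch_all(
    symbols,
    fetch=lambda symbol: symbol + "_daily.csv",  # hypothetical downloader
    sleep=recorded_pauses.append,                # records the pause instead of sleeping
)
```

In real use the lines would come from open('Stock symbols.txt') and fetch would wrap the actual download call, with the default time.sleep left in place.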

ValueError

I am getting the following error when I run "python trading_algo.py"

2021-02-13 10:43:34.165315: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1
2021-02-13 10:43:36.611898: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-02-13 10:43:36.613079: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2021-02-13 10:43:36.635749: E tensorflow/stream_executor/cuda/cuda_driver.cc:328] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2021-02-13 10:43:36.635814: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (821e3f25af0b): /proc/driver/nvidia/version does not exist
2021-02-13 10:43:36.636356: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-02-13 10:43:37.423855: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:116] None of the MLIR optimization passes are enabled (registered 2)
2021-02-13 10:43:37.424339: I tensorflow/core/platform/profile_utils/cpu_utils.cc:112] CPU Frequency: 2299995000 Hz
Traceback (most recent call last):
  File "trading_algo.py", line 37, in <module>
    predicted_price_tomorrow = np.squeeze(y_normaliser.inverse_transform(model.predict([[ohlcv], [ind]])))
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py", line 1608, in predict
    steps_per_execution=self._steps_per_execution)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/data_adapter.py", line 1112, in __init__
    model=model)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/data_adapter.py", line 274, in __init__
    _check_data_cardinality(inputs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/data_adapter.py", line 1529, in _check_data_cardinality
    raise ValueError(msg)
ValueError: Data cardinality is ambiguous:
x sizes: 50, 1
Make sure all arrays contain the same number of samples.

Stuck on step 3

Hi! I'm really new to Python and stuck on what to do after step 3.
I installed all the requirements; however, I don't need AlphaVantage as I already have the historical data for my stock. Could you help me with what comes after step 3, since I get the following error, and tell me how to skip AlphaVantage and run the project with only a pre-existing csv file?

 File "save_data_to_csv.py", line 22
    data.to_csv(f'./{symbol}_{time_window}.csv')
                                              ^
SyntaxError: invalid syntax
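
The caret points at an f-string, a syntax that Python only accepts from version 3.6 onwards, so this error is what an older interpreter would report. Upgrading Python is the simplest fix. If that is not possible, the same path can be built with str.format, as in this small sketch where the symbol and time_window values are examples.

```python
symbol = "MSFT"
time_window = "daily"

# Equivalent to f'./{symbol}_{time_window}.csv', but valid on Python versions before 3.6.
csv_path = "./{}_{}.csv".format(symbol, time_window)
```

The resulting path would then be passed to data.to_csv in place of the f-string.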

prediction timeframe

Hello there!

I am quite a newbie to ML and wanted to ask if there is any way I can adjust the prediction timeframe to more than just one day in advance. The prediction accuracy is not important in this case; it is just to do some additional research in combination with other tools.

thanks in advance
andreas
