
jankrepl / deepdow
848 stars · 26 watchers · 134 forks · 2.3 MB

Portfolio optimization with deep learning.

Home Page: https://deepdow.readthedocs.io

License: Apache License 2.0

Python 100.00%
deep-learning portfolio-optimization finance machine-learning pytorch timeseries markowitz convex-optimization stock-price-prediction wealth-management

deepdow's Introduction


codecov Documentation Status PyPI version DOI

deepdow (read as "wow") is a Python package connecting portfolio optimization and deep learning. Its goal is to facilitate research of networks that perform weight allocation in one forward pass.

Installation

pip install deepdow

Resources

Description

deepdow attempts to merge two very common steps in portfolio optimization:

  1. Forecasting the future evolution of the market (LSTM, GARCH, ...)
  2. Designing and solving an optimization problem (convex optimization, ...)

It does so by constructing a pipeline of layers. The last layer performs the allocation and all the previous ones serve as feature extractors. The overall network is fully differentiable and one can optimize its parameters by gradient descent algorithms.
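For intuition, below is a minimal sketch of this idea in plain PyTorch (a toy example, not deepdow's actual API): a feature extractor followed by a softmax allocation head, trainable end to end.

import torch
import torch.nn as nn

class TinyAllocator(nn.Module):
    """Toy pipeline: feature extraction followed by a differentiable allocation head."""

    def __init__(self, n_channels, lookback, n_assets, hidden=32):
        super().__init__()
        # Feature extractor: flattens the (channels, lookback, assets) cube.
        self.extractor = nn.Sequential(
            nn.Flatten(),
            nn.Linear(n_channels * lookback * n_assets, hidden),
            nn.ReLU(),
        )
        # Allocation head: one score per asset; softmax keeps the weights
        # positive and summing to one while staying differentiable.
        self.head = nn.Linear(hidden, n_assets)

    def forward(self, x):
        return torch.softmax(self.head(self.extractor(x)), dim=1)

net = TinyAllocator(n_channels=1, lookback=10, n_assets=5)
weights = net(torch.randn(4, 1, 10, 5))  # (batch, n_assets), rows sum to 1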

deepdow is not ...

  • focused on active trading strategies; it only finds allocations to be held over some horizon (buy and hold)
    • one implication is that transaction costs associated with frequent, short-term trades are not a primary concern
  • a reinforcement learning framework; however, one can easily reuse deepdow layers in other deep learning applications
  • a single algorithm; rather, it is a framework that allows for easy experimentation with powerful building blocks

Some features

  • all layers built on torch and fully differentiable
  • integrates differentiable convex optimization (cvxpylayers)
  • implements clustering-based portfolio allocation algorithms
  • multiple dataloading strategies (RigidDataLoader, FlexibleDataLoader)
  • integration with mlflow and tensorboard via callbacks
  • provides a variety of losses such as Sharpe ratio, maximum drawdown, ... (a simplified sketch follows this list)
  • simple to extend and customize
  • CPU and GPU support
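As an illustration of the loss idea, here is a simplified negative Sharpe ratio in plain torch (the tensor shapes are assumptions for this sketch, not deepdow's exact convention):

import torch

def neg_sharpe(weights, y, eps=1e-8):
    """weights: (batch, n_assets) allocations; y: (batch, horizon, n_assets) future returns."""
    port = (y * weights[:, None, :]).sum(-1)     # portfolio return per timestep
    sharpe = port.mean(1) / (port.std(1) + eps)  # per-sample Sharpe ratio
    return -sharpe.mean()                        # minimizing this maximizes Sharpe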

Citing

If you use deepdow (including ideas proposed in the documentation, examples and tests) in your research, please make sure to cite it. To obtain all the necessary citation information, click on the DOI badge at the beginning of this README and you will be automatically redirected to an external website. Note that we are currently using Zenodo.

deepdow's People

Contributors

atk71, guillermop98, jankrepl, louisoutin, mirca, shivanirathod126, turmeric-blend


deepdow's Issues

Dealing with sum(w) != 1

With the convex optimization it can happen that the solver does not find a solution, which results in weights that do not sum to one (sometimes drastically off).

Possible solutions

  • Postprocessing layer that rescales the weights to sum to one (sketched below)
  • Loss punishing incorrect w
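A minimal sketch of the postprocessing idea (a hypothetical layer, assuming long-only weights; negative weights would need rescaling by the sum of absolute values instead):

import torch
import torch.nn as nn

class RescaleWeights(nn.Module):
    """Hypothetical postprocessing layer: force weights to sum to one."""

    def forward(self, w, eps=1e-8):
        # Divide each sample's weights by their sum; eps guards against all-zero rows.
        return w / (w.sum(dim=-1, keepdim=True) + eps)

w = torch.tensor([[0.2, 0.2, 0.2]])
print(RescaleWeights()(w))  # tensor([[0.3333, 0.3333, 0.3333]])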

Implement __sub__

Currently one cannot just do -SomeLoss(). Of course, it can be hacked by doing (-1) * SomeLoss(). We want to support the first syntax directly; note that unary minus dispatches to __neg__ (a binary a - b between losses would additionally need __sub__).
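A minimal sketch of how this could be wired up (a hypothetical Loss base class; deepdow's real one has more machinery):

class Loss:
    def __call__(self, weights, y):
        raise NotImplementedError

    def __mul__(self, const):
        parent = self

        class _Scaled(Loss):
            def __call__(self, weights, y):
                return const * parent(weights, y)

        return _Scaled()

    __rmul__ = __mul__  # support const * loss as well as loss * const

    def __neg__(self):
        return (-1) * self  # unary minus just reuses the scaling machinery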

Additional augmentations

Rather than reinventing the wheel, one could just use torchvision transforms (https://pytorch.org/docs/stable/torchvision/transforms.html); a small composition sketch follows the list.

  • Compose (already recreated in deepdow)
  • RandomApply - apply all with some probability
  • RandomChoice - apply exactly one but at random
  • RandomOrder - apply all but in random order
  • 1D warping - Affine would be a special case; in theory one could use any increasing function (derivative > 0)
  • RandomAffine - scaling and translation along the y axis (lookback) could be a brilliant augmentation for deepdow tensors
  • RandomHorizontalFlip - flipping the time flow; probably super confusing if one wants to pick up mean reversion
  • Normalize - a must together with some helper function that computes means, stds in the training set. However, it still assumes that the time series is stationary.
  • RandomErasing - (similar to the current Dropout however it is contiguous regions)
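The meta-transforms (Compose, RandomApply, RandomChoice) accept arbitrary callables, so they already work on deepdow-style tensors. A sketch with two made-up augmentations:

import torch
from torchvision import transforms

# Toy augmentations on a (n_channels, lookback, n_assets) tensor.
add_noise = lambda x: x + 0.01 * torch.randn_like(x)
rescale = lambda x: x * (1 + 0.05 * torch.randn(1))

augment = transforms.Compose([
    transforms.RandomApply([add_noise], p=0.5),    # apply with some probability
    transforms.RandomChoice([rescale, add_noise]), # apply exactly one at random
])

x_aug = augment(torch.randn(1, 10, 5))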

Additionally, torchvision might also be helpful in other tasks (see #39)

The clear downside is introducing yet another dependency. Additionally, one might argue that it is better to go all the way and use imgaug, albumentations, ...

Other non-vision augmentations:

Make raw_to_Xy more transparent

Currently, deepdow.utils.raw_to_Xy does a lot of magic inside and outputs only the bare minimum for training:

  • X
  • timestamps
  • y
  • asset_names
  • indicators

It would be nice to have some debug mode that returns more.

raw_to_Xy doesn't handle gaps in data

raw_to_Xy appears to handle regular gaps in data (e.g. weekend days) but cannot handle irregular gaps such as holidays.

When fed trading data similar to the example at https://deepdow.readthedocs.io/en/latest/source/data_loading.html but covering an entire trading year, it gets out of sync on every holiday, e.g. a Monday that would typically trade but does not because of a holiday such as Jan 20, 2020.

The result is that the assertion assert timestamps[0] == raw_df.index[lookback] fails.

This, and likely other data formatting issues, causes an error when executing history = run.launch(30), namely RuntimeError: mat1 and mat2 shapes cannot be multiplied.
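One possible workaround (an assumption, not an official fix): reindex the raw frame onto a perfectly regular business-day calendar and forward-fill the holes before calling raw_to_Xy.

import numpy as np
import pandas as pd

# Toy frame: business days with one "holiday" removed to mimic an irregular gap.
idx = pd.bdate_range("2020-01-01", "2020-02-01").drop(pd.Timestamp("2020-01-20"))
raw_df = pd.DataFrame(np.random.rand(len(idx), 2), index=idx, columns=["A", "B"])

# Reindex onto the full business-day calendar and forward-fill the gap so the
# index has a regular frequency before raw_to_Xy sees it.
bdays = pd.bdate_range(raw_df.index.min(), raw_df.index.max())
raw_df = raw_df.reindex(bdays).ffill()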

Numpy metrics

Currently, all metrics accept and return torch.Tensor objects only. Isn't that limiting? NumPy support could be added with a thin adapter.
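A possible shape for such an adapter (numpyfy is a made-up name, shown only as a sketch):

import numpy as np
import torch

def numpyfy(metric):
    """Hypothetical adapter: wrap a torch metric so it accepts and returns numpy."""
    def wrapped(*arrays, **kwargs):
        tensors = [torch.as_tensor(a, dtype=torch.float32) for a in arrays]
        return metric(*tensors, **kwargs).detach().cpu().numpy()
    return wrapped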

Generating synthetic data

Apart from generating iid sequences, one can do a lot of different things; we just need to pay attention to not using too many external dependencies. A minimal sampler for the simplest statistical model is sketched after the lists below.

Statistical models

  • AR
  • ARMA
  • GARCH
  • VAR

Signal processing

State space models (both discrete and continuous latent space)

  • HMM
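As a starting point, a minimal AR(1) sampler in plain numpy (a sketch; the parameter names are made up):

import numpy as np

def ar1(n_timesteps, n_assets, phi=0.3, sigma=0.01, seed=None):
    """Sample AR(1) returns: x_t = phi * x_{t-1} + eps_t with eps_t ~ N(0, sigma^2)."""
    rng = np.random.default_rng(seed)
    x = np.zeros((n_timesteps, n_assets))
    for t in range(1, n_timesteps):
        x[t] = phi * x[t - 1] + sigma * rng.standard_normal(n_assets)
    return x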

Extra linefeed in Epoch reporting

With each epoch an extra linefeed is inserted when reporting metrics in Jupyter, producing a visible gap between the reported lines; the gap grows again in epoch 3 and so on.


Fix benchmarks

  • possibility to fix problem size in constructor
  • returns channel
  • feed batches into cvxpy

Improve visualize module

Some ideas below

  • The visualize module would use some helper function that takes a network and a dataloader and returns a DataFrame
  • weight_image
  • Include in documentation

Cleanup docs

Currently, there are a lot of typos, poorly written or unfinished sentences, etc...

Weight normalization allocator

It would be cool to create an allocator that just learns a single weight per asset. To make sure all the weights sum up to one, one can restrict them to be positive and then divide them by their sum.
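A sketch of the proposal (the class and parameter names are made up):

import torch
import torch.nn as nn

class PerAssetAllocator(nn.Module):
    """One learnable raw weight per asset; softplus keeps them positive,
    dividing by the sum turns them into a valid long-only allocation."""

    def __init__(self, n_assets):
        super().__init__()
        self.raw = nn.Parameter(torch.zeros(n_assets))

    def forward(self, x):
        w = nn.functional.softplus(self.raw)
        w = w / w.sum()
        return w.expand(len(x), -1)  # same allocation for every sample in the batch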

MLflow bumpup metric

For deterministic benchmarks, metrics can just be copied from the previous step rather than recomputed.

Argmax allocator

Probably not possible, since argmax would yield zero gradients almost everywhere.

Add python_requires

Assert the Python version via python_requires. It should correspond to what is tested (.travis.yml).
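A sketch of the change in setup.py (the >=3.6 floor is an assumption and should be read off the .travis.yml test matrix):

from setuptools import setup

setup(
    name="deepdow",
    python_requires=">=3.6",  # assumed floor; keep in sync with the CI test matrix
)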

Clipping in gradient_wrt_input

Currently, we implement one "Explainable" algorithm in deepdow.explain.gradient_wrt_input. The problem is that we do not restrict the values the input can have. One solution would be to implement some projection/clipping logic that takes place after each optimizer step and thus forces the values to be in a given range.

See https://arxiv.org/pdf/1702.04782.pdf
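A minimal sketch of the projection/clipping idea (the objective here is a stand-in for the real gradient_wrt_input loss):

import torch

x = torch.randn(1, 1, 10, 5, requires_grad=True)
optimizer = torch.optim.Adam([x], lr=0.01)

for _ in range(100):
    optimizer.zero_grad()
    loss = x.pow(2).sum()  # stand-in objective
    loss.backward()
    optimizer.step()
    # Projection step: clamp the optimized input back into the allowed range.
    with torch.no_grad():
        x.clamp_(-1.0, 1.0)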

Example: Learning NumericalMarkowitz parameters

It would be nice to show how deepdow is able to directly learn, or predict via a network, any of the input variables of deepdow.layers.NumericalMarkowitz.

It might be a good idea to use real data (e.g. yfinance); however, one needs to be careful about the example running too long (both CI and readthedocs need to run it).

AssertionError: when using BachelierNet

I'm able to get the out-of-the-box examples to execute successfully (getting_started and iid) when using the generated data, but when using different toy datasets I get an AssertionError in the cvxpy module.

/opt/conda/lib/python3.6/site-packages/cvxpy/cvxcore/python/canonInterface.py in nonzero_csc_matrix(A)
    162     # this function returns (rows, cols) corresponding to nonzero entries in
    163     # A; an entry that is explicitly set to zero is treated as nonzero
--> 164     assert not np.isnan(A.data).any()
    165 
    166     # scipy drops rows, cols with explicit zeros; use nan as a sentinel

AssertionError:

Steps to reproduce:

  1. Start with getting_started.ipynb
  2. Replace the data generation logic with loading data as described in this issue (I've used this as well as larger data sets)
  3. Replace the Network Definition with:
from deepdow.nn import BachelierNet

n_channels = X.shape[1]
lookback = X.shape[2]
n_assets = X.shape[3]
max_weight = 0.5
hidden_size = 32

network = BachelierNet(n_channels, n_assets, hidden_size=hidden_size, max_weight=max_weight)

print(network)

The same error occurs even when reducing channels to 1, increasing the number of samples, and keeping lookback, gap, horizon small (5, 0, 1).
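Since the failing cvxpy assertion rejects NaNs, a quick sanity check on the tensors from the reproduction above (X and y) often pinpoints the problem:

import torch

for name, arr in [("X", X), ("y", y)]:
    t = torch.as_tensor(arr)
    print(name, "has NaNs:", bool(torch.isnan(t).any()),
          "| has infs:", bool(torch.isinf(t).any()))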

raw2Xy

Complete preprocessor

Custom portfolio benchmark

It would be nice to have a benchmark that is just some predefined portfolio. One would construct it by passing all the weights.
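A sketch of such a benchmark (a hypothetical class mirroring a network's callable interface, not deepdow's actual Benchmark API):

import torch

class FixedPortfolio:
    """Benchmark that always returns the same predefined allocation."""

    def __init__(self, weights):
        self.weights = torch.as_tensor(weights, dtype=torch.float32)

    def __call__(self, x):
        return self.weights.expand(len(x), -1)  # one copy per sample in the batch

bench = FixedPortfolio([0.25, 0.25, 0.25, 0.25])
print(bench(torch.randn(8, 1, 10, 4)).shape)  # torch.Size([8, 4])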

something about the turnover constraint

These days I have been using your deepdow package to do some experiments with portfolio optimization. Thanks for your great work!
But I have a problem. I want to add a turnover-rate constraint to the optimization. To achieve that, on every training iteration I have to keep the weights that have been calculated, so that in the next iteration I can make sure the newly calculated weights are not too far from the previous ones.
So I want to ask: is there some way to save the weights each time the network computes them during training?
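One possible answer using a standard PyTorch forward hook (network stands for the user's trained model; this is a sketch, not deepdow-specific API):

import torch

weight_history = []  # grows by one (batch, n_assets) tensor per forward pass

def record_weights(module, inputs, output):
    weight_history.append(output.detach().cpu())

handle = network.register_forward_hook(record_weights)
# ... run training; weight_history[-1] holds the previous allocation, which a
# turnover penalty such as (w_new - weight_history[-1]).abs().sum() can use.
# Call handle.remove() when finished.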
