firmai / deltapy

DeltaPy - Tabular Data Augmentation (by @firmai)

Home Page: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3582219

Python 10.84% Jupyter Notebook 89.16%
augmentation data-augmentation data-science feature-engineering feature-extraction finance machine-learning tabular-data time-series

deltapy's Introduction

Hi there 👋

I'm Derek, a professor at NYU teaching Machine Learning in Financial Engineering.

  • My interests are in synthetic data generation, agent-based simulators, and asset management using machine learning.
  • I have worked on projects at HSBC, G-Research, the Alan Turing Institute, the Oxford-MAN Institute and other large quantitative funds.

Since joining GitHub, I have pushed 7931 commits, opened 198 issues, received 20665 stars across 67 personal projects, and contributed to 32 public repositories.

Packages

Everything listed here is available under the Unlicense.

My research has been used by large institutional banks and quantitative hedge funds (see SSRN).

  • DeltaPy (12,409 Downloads) - First tabular data augmentation package in Python (market data) [code][report]
  • PandaPy (21,163 Downloads) - Pandas alternative that mimics 'Structs' in the C Language (market data) [code][report]
  • AtsPy (53,147 Downloads) - First automated time series package in Python (alternative data) [code][report]
  • DataGene (4,246 Downloads) - First package assessing dataset similarity (market data) [code][report]
  • MLAM (35,409 Downloads) - First repository for machine learning in asset management (market data) [code][report]

Other packages under this license include MTSS-GAN [code][report], the first multivariate conditional time series generator, FairPut [code][report], a FAIR package using LightGBM, and PandasVault [code], an advanced Pandas repository.

.. .-- .-. .. - . .- - - .... . .--. .- .-. .-.. --- ..- .-.

deltapy's People

Contributors

0xflotus, finance-781, firmai, volker48

deltapy's Issues

ValueError: x contains a constant. Adding a constant with trend='c' is not allowed.

Hi,

I am getting the following error with this function:

df_out = interact.autoregression(df.copy()); df_out.head()
D:\Anaconda3\envs\tipjar\lib\site-packages\statsmodels\tsa\ar_model.py:691: FutureWarning: 
statsmodels.tsa.AR has been deprecated in favor of statsmodels.tsa.AutoReg and
statsmodels.tsa.SARIMAX.
AutoReg adds the ability to specify exogenous variables, include time trends,
and add seasonal dummies. The AutoReg API differs from AR since the model is
treated as immutable, and so the entire specification including the lag
length must be specified when creating the model. This change is too
substantial to incorporate into the existing AR api. The function
ar_select_order performs lag length selection for AutoReg models.
AutoReg only estimates parameters using conditional MLE (OLS). Use SARIMAX to
estimate ARX and related models using full MLE via the Kalman Filter.
To silence this warning and continue using AR until it is removed, use:
import warnings
warnings.filterwarnings('ignore', 'statsmodels.tsa.ar_model.AR', FutureWarning)
  warnings.warn(AR_DEPRECATION_WARN, FutureWarning)
Traceback (most recent call last):
  File "D:\Anaconda3\envs\tipjar\lib\site-packages\IPython\core\interactiveshell.py", line 3417, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-82-60579f16e2a8>", line 1, in <module>
    df_out = dp.interact.autoregression(x.copy())
  File "D:\Anaconda3\envs\tipjar\lib\site-packages\deltapy\interact.py", line 61, in autoregression
    fitted_model = AR(df.values[i, :]).fit(autoreg_lag)
  File "D:\Anaconda3\envs\tipjar\lib\site-packages\statsmodels\tsa\ar_model.py", line 1223, in fit
    X = self._stackX(k_ar, trend)  # sets self.k_trend
  File "D:\Anaconda3\envs\tipjar\lib\site-packages\statsmodels\tsa\ar_model.py", line 1044, in _stackX
    X = add_trend(X, prepend=True, trend=trend, has_constant="raise")
  File "D:\Anaconda3\envs\tipjar\lib\site-packages\statsmodels\tsa\tsatools.py", line 121, in add_trend
    raise ValueError(msg)
ValueError: x contains a constant. Adding a constant with trend='c' is not allowed
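
The traceback shows that interact.autoregression fits one AR model per row (df.values[i, :]). One plausible trigger for this error, offered here as an assumption rather than a confirmed diagnosis, is a row whose values never vary, since its lagged copies then look like a constant column. A minimal sketch for spotting such rows before calling the function; the helper name constant_rows and the demo frame are hypothetical and not part of deltapy:

```python
import numpy as np
import pandas as pd


def constant_rows(df: pd.DataFrame) -> pd.Index:
    """Return the index labels of rows whose numeric values never vary.

    Hypothetical helper for illustration only; it is not part of deltapy.
    """
    values = df.select_dtypes(include="number").to_numpy(dtype=float)
    # A peak-to-peak range of zero means every value in the row is identical.
    is_constant = np.ptp(values, axis=1) == 0
    return df.index[is_constant]


# Made-up demo data: rows 0 and 2 are flat, row 1 is not.
demo = pd.DataFrame(
    {
        "open": [1.0, 2.0, 3.0],
        "high": [1.0, 2.5, 3.0],
        "low": [1.0, 1.5, 3.0],
    }
)
flagged = constant_rows(demo)
cleaned = demo.drop(index=flagged)
```

If the flagged rows turn out to be the cause, dropping or perturbing them before the call is one possible workaround. Whether that is acceptable depends on the data.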

Some functions have look ahead bias

Hi firmai,

Thanks for creating deltapy! I just wanted to point out that not all functions are look-ahead safe: their past values change when new data is appended to the dataframe, which makes them hard to use on live data. For example:

a = transform.instantaneous_phases(df, ["close"]).iloc[-2]
b = transform.instantaneous_phases(df.iloc[:-1], ["close"]).iloc[-1]
pd.testing.assert_series_equal(a, b)  # -> this fails

I haven't tested all of them, but this applies at least to the transform.bkb and transform.instantaneous_phases functions.
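
The comparison above can be wrapped in a small reusable check. This is only a sketch using plain pandas: the helper is_lookahead_safe and the two toy features are hypothetical stand-ins rather than deltapy functions, and passing this single check does not prove a feature is free of look-ahead bias in general.

```python
import pandas as pd


def is_lookahead_safe(feature_fn, df: pd.DataFrame) -> bool:
    """Check that dropping the last row leaves the previous row's features unchanged."""
    full = feature_fn(df).iloc[-2]
    truncated = feature_fn(df.iloc[:-1]).iloc[-1]
    try:
        pd.testing.assert_series_equal(full, truncated)
    except AssertionError:
        return False
    return True


def causal_mean(df: pd.DataFrame) -> pd.DataFrame:
    # Trailing window: each row only uses itself and earlier rows.
    return df.rolling(window=3, min_periods=1).mean()


def full_sample_demean(df: pd.DataFrame) -> pd.DataFrame:
    # Uses the mean of the whole sample, so future rows leak into past rows.
    return df - df.mean()


# Made-up prices for illustration.
prices = pd.DataFrame({"close": [10.0, 11.0, 9.0, 12.0, 15.0]})
safe = is_lookahead_safe(causal_mean, prices)
leaky = is_lookahead_safe(full_sample_demean, prices)
```

A deltapy transform could be tested the same way by passing something like lambda d: transform.instantaneous_phases(d, ["close"]) as feature_fn.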

transform.modify error

Hi,

This call fails because the function magnify cannot be found. I think it may have something to do with the tsaug package:

df_out = transform.modify(df.copy(),["Close"]); df_out.head()
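
One way to narrow this down is to check whether the installed package exposes the name at all. The sketch below is a generic diagnostic: has_attribute is a hypothetical helper, and whether any tsaug version provides a top-level magnify is an assumption to verify, not something this report confirms.

```python
import importlib


def has_attribute(module_name: str, attribute: str) -> bool:
    """Return True if the module can be imported and exposes the attribute."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The package is not installed in this environment.
        return False
    return hasattr(module, attribute)


# Sanity checks against the standard library.
found_sqrt = has_attribute("math", "sqrt")
found_magnify_in_math = has_attribute("math", "magnify")

# The check relevant to this issue; the result depends on the local install.
tsaug_has_magnify = has_attribute("tsaug", "magnify")
```

If tsaug_has_magnify is False while tsaug itself imports fine, the installed tsaug version probably does not match the one deltapy was written against.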
