Smart, automatic detection and stationarization of non-stationary time series data.

License: MIT License

stationarizer ෴

>>> from stationarizer import simple_auto_stationarize
>>> simple_auto_stationarize(my_dataframe)

Install with pip:

pip install stationarizer

  • Plays nicely with pandas.DataFrame inputs.
  • Pure Python.
  • Supports Python 3.6+.

The only stationarization pipeline implemented is simple_auto_stationarize, which can be called with:

>>> from stationarizer import simple_auto_stationarize
>>> stationarized_df = simple_auto_stationarize(my_dataframe)

The level at which the false discovery rate (FDR) is controlled can be configured with the alpha parameter, and the method used for multiple-testing error control can be configured with the multitest parameter. Depending on the method chosen, alpha may end up controlling the family-wise error rate (FWER) instead of the FDR.
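To make the role of alpha concrete, here is a minimal pure-Python sketch of the Benjamini–Yekutieli adjustment, the procedure named in the flow described in this README. It is only an illustration of the technique, not the package's own code; the function name benjamini_yekutieli and the sample p-values are hypothetical.

```python
def benjamini_yekutieli(pvalues, alpha=0.05):
    """Return (reject, adjusted) lists for the Benjamini-Yekutieli procedure.

    Hypothetical helper for illustration; not part of stationarizer.
    """
    m = len(pvalues)
    if m == 0:
        return [], []
    # Harmonic-sum correction that makes the procedure valid under
    # arbitrary dependence between the tests.
    c_m = sum(1.0 / i for i in range(1, m + 1))
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0  # adjusted p-values are capped at 1
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = pvalues[idx] * m * c_m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    reject = [p <= alpha for p in adjusted]
    return reject, adjusted


# Made-up p-values, one per test.
reject, adjusted = benjamini_yekutieli([0.001, 0.04, 0.03, 0.5], alpha=0.05)
```

A smaller alpha makes rejection harder for every test, so fewer series end up being flagged by the corrected tests.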

Currently only the following simple flow - dealing with unit roots - is implemented:

  • Data validation is performed: all columns are checked to be numeric, and the number of time steps is expected to be larger than the number of series (this is not mandatory, so a violation only triggers a warning).
  • Both the Augmented Dickey-Fuller unit root test and the KPSS test are performed for each of the series.
  • The p-values of all tests are corrected to control the false discovery rate (FDR) at some given level, using the Benjamini–Yekutieli procedure.
  • The joint ADF-KPSS results are interpreted for each series (see image below).
  • Each time series for which the presence of a unit root cannot be rejected is differenced.
  • Each time series for which the presence of a trend cannot be rejected is de-trended.
  • If any series was differenced, all undifferenced series are trimmed by one step to match the resulting series length.
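The last step of the flow above, differencing some series and trimming the rest, can be sketched with pandas as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the helper name difference_and_trim, the column names, and the sample data are all hypothetical.

```python
import pandas as pd


def difference_and_trim(df, columns_to_difference):
    """Difference the given columns, then align all columns in length.

    Hypothetical helper for illustration; not part of stationarizer.
    """
    if not columns_to_difference:
        # Nothing was differenced, so no trimming is needed.
        return df.copy()
    result = df.copy()
    for col in columns_to_difference:
        # First-order differencing: x[t] - x[t-1]; the first value is NaN.
        result[col] = result[col].diff()
    # Drop the first row for every column, so that undifferenced series
    # are trimmed by one step and all series keep the same length.
    return result.iloc[1:]


df = pd.DataFrame({
    "walk": [1.0, 3.0, 6.0, 10.0],   # assumed to have a unit root
    "noise": [0.5, -0.2, 0.1, 0.3],  # assumed to be stationary already
})
out = difference_and_trim(df, ["walk"])
```

Here only "walk" is differenced, while "noise" simply loses its first observation so that both columns cover the same time steps.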

Here is how joint ADF-KPSS results are interpreted, per-series:

joint_kpss_and_adf.jpg
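As a textual companion to the image, here is a minimal sketch of one commonly cited reading of joint ADF/KPSS outcomes, the one given in the statsmodels documentation. It is an assumption that this matches the package's own decision logic; the function name interpret_adf_kpss and the returned labels are hypothetical.

```python
def interpret_adf_kpss(adf_pvalue, kpss_pvalue, alpha=0.05):
    """Classify one series from its (already corrected) test p-values.

    ADF null hypothesis: the series has a unit root.
    KPSS null hypothesis: the series is stationary.
    Hypothetical helper for illustration; not part of stationarizer.
    """
    adf_rejects_unit_root = adf_pvalue <= alpha
    kpss_rejects_stationarity = kpss_pvalue <= alpha

    if adf_rejects_unit_root and not kpss_rejects_stationarity:
        # Both tests agree: no transformation needed.
        return "stationary"
    if not adf_rejects_unit_root and kpss_rejects_stationarity:
        # Both tests agree on non-stationarity: differencing is the usual remedy.
        return "unit root"
    if not adf_rejects_unit_root and not kpss_rejects_stationarity:
        # KPSS sees stationarity, ADF does not: usually read as trend
        # stationarity, remedied by removing the trend.
        return "trend stationary"
    # ADF sees stationarity, KPSS does not: usually read as difference
    # stationarity, remedied by differencing.
    return "difference stationary"


label = interpret_adf_kpss(adf_pvalue=0.30, kpss_pvalue=0.01)
```

In this example the ADF test fails to reject a unit root and the KPSS test rejects stationarity, so the series is labeled "unit root".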

The package author and current maintainer is Shay Palachy ([email protected]); you are more than welcome to approach him for help. Contributions are very welcome.

Clone:

git clone [email protected]:shaypal5/stationarizer.git

Install in development mode, including test dependencies:

cd stationarizer
pip install -e '.[test]'

To also install fasttext, see instructions in the Installation section.

To run the tests use:

cd stationarizer
pytest

The project is documented using the numpy docstring conventions, which were chosen because they are perhaps the most widespread conventions that are both supported by common tools such as Sphinx and result in human-readable docstrings. When documenting code you add to this project, follow these conventions.

Additionally, if you update this README.rst file, use python setup.py checkdocs to validate that it compiles.

Created by Shay Palachy ([email protected]).
