winedarksea / autots

Automated Time Series Forecasting

License: MIT License

Python 100.00%
time-series machine-learning automl autots forecasting deep-learning preprocessing feature-engineering

autots's Introduction

AutoTS

AutoTS is a time series package for Python designed for rapidly deploying high-accuracy forecasts at scale.

In 2023, AutoTS won in the M6 forecasting competition, delivering the highest-performing investment decisions across 12 months of stock market forecasting.

There are dozens of forecasting models usable in the sklearn style of .fit() and .predict(). These include naive, statistical, machine learning, and deep learning models. Additionally, there are over 30 time series specific transforms usable in the sklearn style of .fit(), .transform() and .inverse_transform(). All of these function directly on pandas DataFrames, without the need for conversion to proprietary objects.

All models support forecasting multivariate (multiple time series) outputs and also support probabilistic (upper/lower bound) forecasts. Most models can readily scale to tens and even hundreds of thousands of input series. Many models also support passing in user-defined exogenous regressors.

These models are all designed for integration in an AutoML feature search which automatically finds the best models, preprocessing, and ensembling for a given dataset through genetic algorithms.

Horizontal and mosaic style ensembles are the flagship ensembling types, allowing each series to receive the most accurate possible models while still maintaining scalability.

A combination of metrics and cross-validation options, the ability to apply subsets and weighting, regressor generation tools, simulation forecasting mode, event risk forecasting, live datasets, template import and export, plotting, and a collection of data shaping parameters round out the available feature set.

Installation

pip install autots

This includes dependencies for basic models, but additional packages are required for some models and methods.

Be advised there are several other projects that have chosen similar names, so make sure you are on the right AutoTS code, papers, and documentation.

Basic Use

Input data for AutoTS is expected to come in either a long or a wide format:

  • The wide format is a pandas.DataFrame with a pandas.DatetimeIndex and each column a distinct series.
  • The long format has three columns:
    • Date (ideally already in pandas-recognized datetime format)
    • Series ID. For a single time series, id_col can be None.
    • Value
  • For long data, the column name for each of these is passed to .fit() as date_col, id_col, and value_col. No parameters are needed for wide data.

Lower-level functions are only designed for wide style data.
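
As a rough illustration of the two shapes (this is not AutoTS's internal reshaping code), long data can be pivoted to wide with plain pandas. The column names `datetime`, `series_id`, and `value` match the sample call in the example that follows; the values themselves are made up.

```python
import pandas as pd

# Hypothetical long-format data: one row per (date, series) observation.
long_df = pd.DataFrame({
    "datetime": pd.to_datetime(
        ["2023-01-01", "2023-01-01", "2023-01-02", "2023-01-02"]
    ),
    "series_id": ["a", "b", "a", "b"],
    "value": [1.0, 10.0, 2.0, 20.0],
})

# Wide format: a DatetimeIndex with one column per distinct series.
wide_df = long_df.pivot(index="datetime", columns="series_id", values="value")
print(wide_df)
```

Either shape can be passed to `.fit()`; only the long shape needs the three column-name arguments.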

# also load: _hourly, _monthly, _weekly, _yearly, or _live_daily
from autots import AutoTS, load_daily

# sample datasets can be used in either of the long or wide import shapes
long = False
df = load_daily(long=long)

model = AutoTS(
    forecast_length=21,
    frequency='infer',
    prediction_interval=0.9,
    ensemble='auto',
    model_list="fast",  # "superfast", "default", "fast_parallel"
    transformer_list="fast",  # "superfast",
    drop_most_recent=1,
    max_generations=4,
    num_validations=2,
    validation_method="backwards"
)
model = model.fit(
    df,
    date_col='datetime' if long else None,
    value_col='value' if long else None,
    id_col='series_id' if long else None,
)

prediction = model.predict()
# plot a sample
prediction.plot(model.df_wide_numeric,
                series=model.df_wide_numeric.columns[0],
                start_date="2019-01-01")
# Print the details of the best model
print(model)

# point forecasts dataframe
forecasts_df = prediction.forecast
# upper and lower forecasts
forecasts_up, forecasts_low = prediction.upper_forecast, prediction.lower_forecast

# accuracy of all tried model results
model_results = model.results()
# and aggregated from cross validation
validation_results = model.results("validation")

The lower-level API, in particular the large section of time series transformers in the scikit-learn style, can also be utilized independently from the AutoML framework.

Check out extended_tutorial.md for a more detailed guide to features.

Also take a look at the production_example.py

Tips for Speed and Large Data:

  • Use appropriate model lists, especially the predefined lists:
    • superfast (simple naive models) and fast (more complex but still faster models, optimized for many series)
    • fast_parallel (a combination of fast and parallel) or parallel, given many CPU cores are available
      • n_jobs='auto' usually gets pretty close, but adjust as necessary for the environment
    • 'scalable' is the best list to avoid crashing when many series are present. There is also a transformer_list = 'scalable'
    • see a dict of predefined lists (some defined for internal use) with from autots.models.model_list import model_lists
  • Use the subset parameter when there are many similar series; subset=100 will often generalize well for tens of thousands of similar series.
    • if using subset, passing weights for series will weight subset selection towards higher priority series.
    • if limited by RAM, it can be distributed by running multiple instances of AutoTS on different batches of data, having first imported a template pretrained as a starting point for all.
  • Set model_interrupt=True, which skips the current model when a KeyboardInterrupt (i.e. Ctrl+C) is raised (although if the interrupt falls between generations it will stop the entire training).
  • Use the result_file parameter of .fit(), which saves progress after each generation; this is helpful when a long training run is being done. Use import_results to recover.
  • While transformations are pretty fast, setting transformer_max_depth to a lower number (say, 2) will increase speed. Also use transformer_list='fast' or 'superfast'.
  • Check out this example of using AutoTS with pandas UDF.
  • Ensembles are obviously slower to predict because they run many models: 'distance' models are about 2x slower, and 'simple' models 3x-5x slower.
    • ensemble='horizontal-max' with model_list='no_shared_fast' can scale relatively well given many cpu cores because each model is only run on the series it is needed for.
  • Reducing num_validations and models_to_validate will decrease runtime but may lead to poorer model selections.
  • For datasets with many records, downsampling (for example, aggregating daily data to a monthly frequency) can reduce training time if appropriate.
    • this can be done by adjusting frequency and aggfunc but is probably best done before passing data into AutoTS.
  • It will be faster if NaNs are already filled. If a search for the optimal NaN fill method is not required, then fill any NaN with a satisfactory method before passing the data to the class.
  • Set runtime_weighting in metric_weighting to a higher value. This will guide the search towards faster models, although it may come at the expense of accuracy.
  • Memory shortage is the most common cause of random process/kernel crashes. Try testing a data subset and using a different model list if issues occur. Please also report crashes if they are found to be linked to a specific set of model parameters (not AutoTS parameters but the underlying forecasting model params). Crashes also vary significantly by setup, such as the underlying linpack/blas, so crash differences between environments can be expected.
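
The tips about reducing the data frequency and pre-filling NaN values can be applied with plain pandas before the data reaches AutoTS. A minimal sketch, assuming a wide DataFrame with a daily DatetimeIndex; the data, the forward-fill, and the monthly sum are illustrative choices rather than recommendations from the AutoTS documentation.

```python
import numpy as np
import pandas as pd

# Hypothetical daily data with one missing value.
idx = pd.date_range("2023-01-01", periods=90, freq="D")
daily = pd.DataFrame({"sales": np.arange(90, dtype=float)}, index=idx)
daily.iloc[5, 0] = np.nan

# Fill NaN up front so no fill-method search is needed later.
filled = daily.ffill()

# Aggregate daily rows to month-start buckets (90 rows become 3).
monthly = filled.resample("MS").sum()
print(monthly)
```

Whether a sum, mean, or last value is the right aggregation depends on what the series measures.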

How to Contribute:

  • Give feedback on where you find the documentation confusing
  • Use AutoTS and...
    • Report errors and request features by adding Issues on GitHub
    • Post the top model templates for your data (to help improve the starting templates)
    • Feel free to recommend different search grid parameters for your favorite models
  • And, of course, contributing to the codebase directly on GitHub.

Also known as Project CATS (Catlin's Automated Time Series) hence the logo.

autots's Issues

There is no detail about train data and test data, and no information about forecast_length in the documentation

Hi, I hope you are all doing well. I want to ask some basic questions:
How do I provide the training data?
How do I provide the test data?
How do I specify the target column?
I read the docs, but I never understood what forecast_length is used for.
One last question: I read in the docs that categorical data must be label encoded before being fed in. Must all columns be of int dtype, or can float dtype columns be used as well?
I would really appreciate your help.
Thanks
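
One way to read `forecast_length` is as the number of future periods to predict. Under that reading, a simple holdout check keeps the last `forecast_length` rows as test data and fits on the rest; in the wide format described in the README, every column is a series to be forecast. The sketch below only shows the split with pandas. The data and variable names are hypothetical, and the AutoTS calls are left as comments that mirror the Basic Use example.

```python
import numpy as np
import pandas as pd

forecast_length = 7  # number of future periods to predict

# Hypothetical wide data: a DatetimeIndex and one column per series.
idx = pd.date_range("2023-01-01", periods=60, freq="D")
df = pd.DataFrame({"target": np.arange(60, dtype=float)}, index=idx)

# Hold out the final forecast_length rows for evaluation.
train_df = df.iloc[:-forecast_length]
test_df = df.iloc[-forecast_length:]

# model = AutoTS(forecast_length=forecast_length).fit(train_df)
# forecast = model.predict().forecast  # compare against test_df
```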

Evaluation Metrics are missing and all models have failed, by an error in template or metrics

Describe the bug
I tried to use DNN models presented in the package such as GluonTS, pytorch-forecasting and tensorflowSTS. I have installed all required packages as in the installation guide (My environment can be viewed below). The error I got from the fit() function read as:

"Evaluation Metrics are missing and all models have failed, by an error in template or metrics. There are many possible causes for this, bad parameters, environment, or an unreported bug. Usually this means you are missing required packages for the models like fbprophet or gluonts, or that the models in model_list are inappropriate for your data. A new starting template may also help".

To Reproduce
When I run the following code:

from autots import AutoTS

# metric_weighting and train_df are defined elsewhere in the reporter's session (not shown)
model_list = ['TensorflowSTS']
model = AutoTS(
    forecast_length=7,
    frequency='D',
    prediction_interval=0.9,
    metric_weighting=metric_weighting,
    ensemble='horizontal',
    holiday_country='UK',
    model_list=model_list,
    drop_most_recent=0,
    max_generations=1,
    num_validations=3,
    validation_method="backwards",
    no_negatives=True,
    verbose=-2,
    n_jobs=72,
)

model = model.fit(train_df)

Expected behavior
Stack trace:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
File ~/anaconda3/envs/forecast/lib/python3.10/site-packages/pandas/core/indexes/base.py:3621, in Index.get_loc(self, key, method, tolerance)
   3620 try:
-> 3621     return self._engine.get_loc(casted_key)
   3622 except KeyError as err:

File ~/anaconda3/envs/forecast/lib/python3.10/site-packages/pandas/_libs/index.pyx:136, in pandas._libs.index.IndexEngine.get_loc()

File ~/anaconda3/envs/forecast/lib/python3.10/site-packages/pandas/_libs/index.pyx:163, in pandas._libs.index.IndexEngine.get_loc()

File pandas/_libs/hashtable_class_helper.pxi:5198, in pandas._libs.hashtable.PyObjectHashTable.get_item()

File pandas/_libs/hashtable_class_helper.pxi:5206, in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'smape'

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
File ~/anaconda3/envs/forecast/lib/python3.10/site-packages/autots/evaluator/auto_model.py:2031, in generate_score(model_results, metric_weighting, prediction_interval)
   2030 # not sure why there are negative SMAPE values, but make sure they get dealt with
-> 2031 if model_results['smape'].min() < 0:
   2032     model_results['smape'] = model_results['smape'].where(
   2033         model_results['smape'] >= 0, model_results['smape'].max()
   2034     )

File ~/anaconda3/envs/forecast/lib/python3.10/site-packages/pandas/core/frame.py:3505, in DataFrame.__getitem__(self, key)
   3504     return self._getitem_multilevel(key)
-> 3505 indexer = self.columns.get_loc(key)
   3506 if is_integer(indexer):

File ~/anaconda3/envs/forecast/lib/python3.10/site-packages/pandas/core/indexes/base.py:3623, in Index.get_loc(self, key, method, tolerance)
   3622 except KeyError as err:
-> 3623     raise KeyError(key) from err
   3624 except TypeError:
   3625     # If we have a listlike key, _check_indexing_error will raise
   3626     #  InvalidIndexError. Otherwise we fall through and re-raise
   3627     #  the TypeError.

KeyError: 'smape'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
File <timed exec>:44, in <module>

File ~/anaconda3/envs/forecast/lib/python3.10/site-packages/autots/evaluator/auto_ts.py:791, in AutoTS.fit(self, df, date_col, value_col, id_col, future_regressor, weights, result_file, grouping_ids, validation_indexes)
    789 # capture the data from the lower level results
    790 self.initial_results = self.initial_results.concat(template_result)
--> 791 self.initial_results.model_results['Score'] = generate_score(
    792     self.initial_results.model_results,
    793     metric_weighting=metric_weighting,
    794     prediction_interval=prediction_interval,
    795 )
    796 if result_file is not None:
    797     self.initial_results.save(result_file)

File ~/anaconda3/envs/forecast/lib/python3.10/site-packages/autots/evaluator/auto_model.py:2126, in generate_score(model_results, metric_weighting, prediction_interval)
   2123         overall_score = overall_score + (containment_score * containment_weighting)
   2125 except Exception as e:
-> 2126     raise KeyError(
   2127         f"""Evaluation Metrics are missing and all models have failed, by an error in template or metrics.
   2128         There are many possible causes for this, bad parameters, environment, or an unreported bug.
   2129         Usually this means you are missing required packages for the models like fbprophet or gluonts,
   2130         or that the models in model_list are inappropriate for your data.
   2131         A new starting template may also help. {repr(e)}"""
   2132     )
   2134 return overall_score.astype(float)

Desktop (please complete the following information):

  • OS: Ubuntu 20.04
  • Package Versions 0.4.2

``

Name Version Build Channel

_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 2_gnu conda-forge
abseil-cpp 20210324.2 h9c3ff4c_0 conda-forge
absl-py 1.1.0 pyhd8ed1ab_0 conda-forge
aiohttp 3.8.1 py310h5764c6d_1 conda-forge
aiosignal 1.2.0 pyhd8ed1ab_0 conda-forge
alembic 1.8.0 pyhd8ed1ab_0 conda-forge
appdirs 1.4.4 pypi_0 pypi
arch 5.3.1 pypi_0 pypi
argon2-cffi 21.3.0 pypi_0 pypi
argon2-cffi-bindings 21.2.0 pypi_0 pypi
asttokens 2.0.5 pypi_0 pypi
astunparse 1.6.3 pyhd8ed1ab_0 conda-forge
async-timeout 4.0.2 pyhd8ed1ab_0 conda-forge
attrs 21.4.0 pyhd8ed1ab_0 conda-forge
autopage 0.5.1 pyhd8ed1ab_0 conda-forge
autots 0.4.2 pyhd8ed1ab_0 conda-forge
backcall 0.2.0 pypi_0 pypi
backports 1.0 py_2 conda-forge
backports.functools_lru_cache 1.6.4 pyhd8ed1ab_0 conda-forge
beautifulsoup4 4.11.1 pypi_0 pypi
blas 1.0 mkl conda-forge
bleach 5.0.0 pypi_0 pypi
blinker 1.4 py_1 conda-forge
brotli 1.0.9 h166bdaf_7 conda-forge
brotli-bin 1.0.9 h166bdaf_7 conda-forge
brotlipy 0.7.0 py310h5764c6d_1004 conda-forge
bzip2 1.0.8 h7f98852_4 conda-forge
c-ares 1.18.1 h7f98852_0 conda-forge
ca-certificates 2022.6.15 ha878542_0 conda-forge
cached-property 1.5.2 hd8ed1ab_1 conda-forge
cached_property 1.5.2 pyha770c72_1 conda-forge
cachetools 5.0.0 pyhd8ed1ab_0 conda-forge
certifi 2022.6.15 py310hff52083_0 conda-forge
cffi 1.15.0 py310h0fdd8cc_0 conda-forge
charset-normalizer 2.0.12 pyhd8ed1ab_0 conda-forge
click 8.1.3 py310hff52083_0 conda-forge
cliff 3.10.1 pyhd8ed1ab_0 conda-forge
clikit 0.6.2 pypi_0 pypi
cloudpickle 2.1.0 pyhd8ed1ab_0 conda-forge
cmaes 0.8.2 pyh44b312d_0 conda-forge
cmd2 2.3.3 py310hff52083_1 conda-forge
cmdstanpy 0.9.5 pypi_0 pypi
colorama 0.4.5 pyhd8ed1ab_0 conda-forge
colorlog 6.6.0 py310hff52083_1 conda-forge
convertdate 2.4.0 pypi_0 pypi
crashtest 0.3.1 pypi_0 pypi
cryptography 37.0.1 py310h9ce1e76_0
cudatoolkit 11.3.1 h9edb442_10 conda-forge
cycler 0.11.0 pyhd8ed1ab_0 conda-forge
cython 0.29.30 pypi_0 pypi
dask 2022.6.1 pypi_0 pypi
debugpy 1.6.0 pypi_0 pypi
decorator 5.1.1 pypi_0 pypi
defusedxml 0.7.1 pypi_0 pypi
dill 0.3.5.1 pypi_0 pypi
distributed 2022.6.1 pypi_0 pypi
entrypoints 0.4 pypi_0 pypi
ephem 4.1.3 pypi_0 pypi
executing 0.8.3 pypi_0 pypi
fastjsonschema 2.15.3 pypi_0 pypi
fonttools 4.33.3 pypi_0 pypi
freetype 2.10.4 h0708190_1 conda-forge
frozenlist 1.3.0 py310h5764c6d_1 conda-forge
fsspec 2022.5.0 pyhd8ed1ab_0 conda-forge
future 0.18.2 py310hff52083_5 conda-forge
gast 0.5.3 pyhd8ed1ab_0 conda-forge
giflib 5.2.1 h36c2ea0_2 conda-forge
gluonts 0.10.1 pypi_0 pypi
google-auth 2.9.0 pyh6c4a22f_0 conda-forge
google-auth-oauthlib 0.4.1 py_2 conda-forge
google-pasta 0.2.0 pyh8c360ce_0 conda-forge
greenlet 1.1.2 py310hd8f1fbe_2 conda-forge
greykite 0.3.0 pypi_0 pypi
grpc-cpp 1.45.2 h3b8df00_4 conda-forge
grpcio 1.45.0 py310h44b9e0c_0 conda-forge
h5py 3.7.0 nompi_py310h06dffec_100 conda-forge
hdf5 1.12.1 nompi_h2386368_104 conda-forge
heapdict 1.0.1 pypi_0 pypi
hijri-converter 2.2.4 pypi_0 pypi
holidays 0.14.2 pypi_0 pypi
holidays-ext 0.0.7 pypi_0 pypi
httpstan 4.7.2 pypi_0 pypi
icu 70.1 h27087fc_0 conda-forge
idna 3.3 pyhd8ed1ab_0 conda-forge
importlib-metadata 4.11.4 py310hff52083_0 conda-forge
importlib_resources 5.8.0 pyhd8ed1ab_0 conda-forge
intel-openmp 2022.0.1 h06a4308_3633
ipykernel 6.15.0 pypi_0 pypi
ipython 8.4.0 pypi_0 pypi
ipython-genutils 0.2.0 pypi_0 pypi
jedi 0.18.1 pypi_0 pypi
jinja2 3.1.2 pypi_0 pypi
joblib 1.1.0 pyhd8ed1ab_0 conda-forge
jpeg 9e h166bdaf_2 conda-forge
jsonschema 4.6.0 pypi_0 pypi
jupyter-client 7.3.4 pypi_0 pypi
jupyter-console 6.4.4 pypi_0 pypi
jupyter-core 4.10.0 pypi_0 pypi
jupyter-http-over-ws 0.0.8 pypi_0 pypi
jupyterlab-pygments 0.2.2 pypi_0 pypi
jupyterlab-widgets 1.1.1 pypi_0 pypi
keras 2.8.0 pyhd8ed1ab_0 conda-forge
keras-preprocessing 1.1.2 pyhd8ed1ab_0 conda-forge
keyutils 1.6.1 h166bdaf_0 conda-forge
kiwisolver 1.4.3 py310hbf28c38_0 conda-forge
korean-lunar-calendar 0.2.1 pypi_0 pypi
krb5 1.19.3 h3790be6_0 conda-forge
lcms2 2.12 hddcbb42_0 conda-forge
ld_impl_linux-64 2.36.1 hea4e1c9_2 conda-forge
lerc 3.0 h9c3ff4c_0 conda-forge
libblas 3.9.0 14_linux64_mkl conda-forge
libbrotlicommon 1.0.9 h166bdaf_7 conda-forge
libbrotlidec 1.0.9 h166bdaf_7 conda-forge
libbrotlienc 1.0.9 h166bdaf_7 conda-forge
libcblas 3.9.0 14_linux64_mkl conda-forge
libcurl 7.83.1 h7bff187_0 conda-forge
libdeflate 1.12 h166bdaf_0 conda-forge
libedit 3.1.20191231 he28a2e2_2 conda-forge
libev 4.33 h516909a_1 conda-forge
libffi 3.4.2 h7f98852_5 conda-forge
libgcc-ng 12.1.0 h8d9b700_16 conda-forge
libgfortran-ng 12.1.0 h69a702a_16 conda-forge
libgfortran5 12.1.0 hdcd56e2_16 conda-forge
libgomp 12.1.0 h8d9b700_16 conda-forge
liblapack 3.9.0 14_linux64_mkl conda-forge
libnghttp2 1.47.0 h727a467_0 conda-forge
libnsl 2.0.0 h7f98852_0 conda-forge
libopenblas 0.3.20 pthreads_h78a6416_0 conda-forge
libpng 1.6.37 h753d276_3 conda-forge
libprotobuf 3.20.1 h6239696_0 conda-forge
libssh2 1.10.0 ha56f1ee_2 conda-forge
libstdcxx-ng 12.1.0 ha89aaad_16 conda-forge
libtiff 4.4.0 hc85c160_1 conda-forge
libuuid 2.32.1 h7f98852_1000 conda-forge
libwebp 1.2.2 h3452ae3_0 conda-forge
libwebp-base 1.2.2 h7f98852_1 conda-forge
libxcb 1.13 h7f98852_1004 conda-forge
libzlib 1.2.12 h166bdaf_1 conda-forge
lightgbm 3.3.2 py310h122e73d_0 conda-forge
locket 1.0.0 pypi_0 pypi
lunarcalendar 0.0.9 pypi_0 pypi
lz4-c 1.9.3 h9c3ff4c_1 conda-forge
mako 1.2.1 pyhd8ed1ab_0 conda-forge
markdown 3.3.7 pyhd8ed1ab_0 conda-forge
markupsafe 2.1.1 py310h5764c6d_1 conda-forge
marshmallow 3.16.0 pypi_0 pypi
matplotlib-base 3.5.2 py310h5701ce4_0 conda-forge
matplotlib-inline 0.1.3 pypi_0 pypi
mistune 0.8.4 pypi_0 pypi
mkl 2022.0.1 h06a4308_117
msgpack 1.0.4 pypi_0 pypi
multidict 6.0.2 py310h5764c6d_1 conda-forge
munkres 1.1.4 pyh9f0ad1d_0 conda-forge
mxnet-cu112 1.9.1 pypi_0 pypi
nbclient 0.6.4 pypi_0 pypi
nbconvert 6.5.0 pypi_0 pypi
nbformat 5.4.0 pypi_0 pypi
ncurses 6.3 h27087fc_1 conda-forge
nest-asyncio 1.5.5 pypi_0 pypi
notebook 6.4.12 pypi_0 pypi
numpy 1.23.0 py310h53a5b5f_0 conda-forge
oauthlib 3.2.0 pyhd8ed1ab_0 conda-forge
openjpeg 2.4.0 hb52868f_1 conda-forge
openssl 1.1.1q h166bdaf_0 conda-forge
opt_einsum 3.3.0 pyhd8ed1ab_1 conda-forge
optuna 2.10.1 pyhd8ed1ab_0 conda-forge
packaging 21.3 pyhd8ed1ab_0 conda-forge
pandas 1.4.3 py310h769672d_0 conda-forge
pandocfilters 1.5.0 pypi_0 pypi
parso 0.8.3 pypi_0 pypi
partd 1.2.0 pypi_0 pypi
pastel 0.2.1 pypi_0 pypi
patsy 0.5.2 pyhd8ed1ab_0 conda-forge
pbr 5.9.0 pyhd8ed1ab_0 conda-forge
pexpect 4.8.0 pypi_0 pypi
pickleshare 0.7.5 pypi_0 pypi
pillow 9.1.1 pypi_0 pypi
pip 22.1.2 pyhd8ed1ab_0 conda-forge
pmdarima 1.8.5 pypi_0 pypi
prettytable 3.3.0 pypi_0 pypi
prometheus-client 0.14.1 pypi_0 pypi
prompt-toolkit 3.0.29 pypi_0 pypi
property-cached 1.6.4 pypi_0 pypi
prophet 1.1 pypi_0 pypi
protobuf 3.20.1 py310hd8f1fbe_0 conda-forge
psutil 5.9.1 pypi_0 pypi
pthread-stubs 0.4 h36c2ea0_1001 conda-forge
ptyprocess 0.7.0 pypi_0 pypi
pure-eval 0.2.2 pypi_0 pypi
pyasn1 0.4.8 py_0 conda-forge
pyasn1-modules 0.2.7 py_0 conda-forge
pycparser 2.21 pyhd8ed1ab_0 conda-forge
pydantic 1.9.1 pypi_0 pypi
pydeprecate 0.3.2 pyhd8ed1ab_0 conda-forge
pygments 2.12.0 pypi_0 pypi
pyjwt 2.4.0 pyhd8ed1ab_0 conda-forge
pylev 1.4.0 pypi_0 pypi
pymeeus 0.5.11 pypi_0 pypi
pyopenssl 22.0.0 pyhd8ed1ab_0 conda-forge
pyparsing 3.0.9 pyhd8ed1ab_0 conda-forge
pyperclip 1.8.2 pyhd8ed1ab_2 conda-forge
pyrsistent 0.18.1 pypi_0 pypi
pysimdjson 3.2.0 pypi_0 pypi
pysocks 1.7.1 py310hff52083_5 conda-forge
pystan 3.4.0 pypi_0 pypi
python 3.10.5 h582c2e5_0_cpython conda-forge
python-dateutil 2.8.2 pyhd8ed1ab_0 conda-forge
python-flatbuffers 2.0 pyhd8ed1ab_0 conda-forge
python_abi 3.10 2_cp310 conda-forge
pytorch 1.12.0 py3.10_cuda11.3_cudnn8.3.2_0 pytorch
pytorch-forecasting 0.10.2 pyhd8ed1ab_0 conda-forge
pytorch-lightning 1.6.4 pyhd8ed1ab_0 conda-forge
pytorch-mutex 1.0 cuda pytorch
pytz 2022.1 pyhd8ed1ab_0 conda-forge
pyu2f 0.1.5 pyhd8ed1ab_0 conda-forge
pyyaml 5.4.1 pypi_0 pypi
pyzmq 23.2.0 pypi_0 pypi
qtconsole 5.3.1 pypi_0 pypi
qtpy 2.1.0 pypi_0 pypi
re2 2022.06.01 h27087fc_0 conda-forge
readline 8.1.2 h0f457ee_0 conda-forge
requests 2.28.0 pyhd8ed1ab_0 conda-forge
requests-oauthlib 1.3.1 pyhd8ed1ab_0 conda-forge
rsa 4.8 pyhd8ed1ab_0 conda-forge
scikit-learn 1.1.1 py310hffb9edd_0 conda-forge
scipy 1.8.1 py310h7612f91_0 conda-forge
seaborn 0.11.2 pypi_0 pypi
send2trash 1.8.0 pypi_0 pypi
setuptools 59.5.0 py310hff52083_0 conda-forge
setuptools-git 1.2 pypi_0 pypi
six 1.16.0 pyh6c4a22f_0 conda-forge
snappy 1.1.9 hbd366e4_1 conda-forge
sortedcontainers 2.4.0 pypi_0 pypi
soupsieve 2.3.2.post1 pypi_0 pypi
sqlalchemy 1.4.39 py310h5764c6d_0 conda-forge
sqlite 3.38.5 h4ff8645_0 conda-forge
stack-data 0.3.0 pypi_0 pypi
statsmodels 0.13.2 py310hde88566_0 conda-forge
stevedore 3.5.0 py310hff52083_3 conda-forge
tblib 1.7.0 pypi_0 pypi
tensorboard 2.8.0 pyhd8ed1ab_1 conda-forge
tensorboard-data-server 0.6.0 py310h597c629_2 conda-forge
tensorboard-plugin-wit 1.8.1 pyhd8ed1ab_0 conda-forge
tensorflow 2.8.1 cpu_py310hd1aba9c_0 conda-forge
tensorflow-base 2.8.1 cpu_py310h17449b8_0 conda-forge
tensorflow-estimator 2.8.1 cpu_py310had6d012_0 conda-forge
termcolor 1.1.0 pyhd8ed1ab_3 conda-forge
terminado 0.15.0 pypi_0 pypi
threadpoolctl 3.1.0 pyh8a188c0_0 conda-forge
tinycss2 1.1.1 pypi_0 pypi
tk 8.6.12 h27826a3_0 conda-forge
toolz 0.11.2 pypi_0 pypi
torchmetrics 0.9.2 pyhd8ed1ab_0 conda-forge
tornado 6.1 pypi_0 pypi
tqdm 4.64.0 pyhd8ed1ab_0 conda-forge
traitlets 5.3.0 pypi_0 pypi
typing-extensions 4.3.0 hd8ed1ab_0 conda-forge
typing_extensions 4.3.0 pyha770c72_0 conda-forge
tzdata 2022a h191b570_0 conda-forge
ujson 5.3.0 pypi_0 pypi
unicodedata2 14.0.0 py310h5764c6d_1 conda-forge
urllib3 1.26.9 pyhd8ed1ab_0 conda-forge
wcwidth 0.2.5 pyh9f0ad1d_2 conda-forge
webargs 8.1.0 pypi_0 pypi
webencodings 0.5.1 pypi_0 pypi
werkzeug 2.1.2 pyhd8ed1ab_1 conda-forge
wheel 0.37.1 pyhd8ed1ab_0 conda-forge
widgetsnbextension 3.6.1 pypi_0 pypi
wrapt 1.14.1 py310h5764c6d_0 conda-forge
xgboost 1.6.1 pypi_0 pypi
xlrd 2.0.1 pypi_0 pypi
xorg-libxau 1.0.9 h7f98852_0 conda-forge
xorg-libxdmcp 1.1.3 h7f98852_0 conda-forge
xz 5.2.5 h516909a_1 conda-forge
yaml 0.2.5 h7f98852_2 conda-forge
yarl 1.7.2 py310h5764c6d_2 conda-forge
zict 2.2.0 pypi_0 pypi
zipp 3.8.0 pyhd8ed1ab_0 conda-forge
zlib 1.2.12 h166bdaf_1 conda-forge
zstd 1.5.2 h8a70e8d_2 conda-forge
``

Add a parameter to hide all console messages from AutoTS::fit() method.

When we call the fit() method of the AutoTS class, the console is filled with messages which most users do not need to read. It is advisable to have a straightforward way to disable these messages.

Ideally, providing a boolean parameter called verbose/silent to hide these messages should solve the problem. I would recommend defaulting it to False.

There are three ways to do this:

  1. Create a new variable called silent.
  2. Rename the currently used verbose variable to something else and create a new boolean variable named verbose.
  3. Modify the existing verbose variable to do what we want.
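
Independently of how the `verbose` parameter is handled, printed messages from any Python call can be captured with the standard library. A minimal sketch, in which `noisy_fit` is a hypothetical stand-in for a chatty call such as `model.fit(df)`; it only captures text written to Python's `sys.stdout`, not warnings or logging output.

```python
import contextlib
import io


def noisy_fit():
    """Hypothetical stand-in for a call that prints progress messages."""
    print("Model Number: 1 of 10")
    print("Model Number: 2 of 10")
    return "fitted"


# Redirect everything printed inside the block into an in-memory buffer.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    result = noisy_fit()

# The messages are still available if they are needed for debugging.
captured = buffer.getvalue()
```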

Is SinTrend scalable?

Good question, I have no idea.
Make more scalable (multiprocessing?) if necessary, and adjust which transformer_list it is in.

Dynamic factor, VECM, VAR and FBProphet error

When I try to train one of these models individually or in a group, I get the following error:

KeyError: 'Inconceivable! Evaluation Metrics are missing and all models have failed, by an error in TemplateWizard or metrics. A new template may help, or an adjusted model_list.'

Here is a simplified version of my code, which employs random numbers instead of fixed numbers and has the same problem:

import numpy as np
from autots import AutoTS
import pandas as pd
import datetime
from datetime import timedelta

# Model configuration

model_list = [
    'DynamicFactor',
    'VECM',
    'VAR',
    'FBProphet',
]

metric_weighting = {
    'smape_weighting': 0,
    'mae_weighting': 30,
    'rmse_weighting': 5,
    'containment_weighting': 0,
    'runtime_weighting': 5,
    'spl_weighting': 0,
    'contour_weighting': 0,
}

model = AutoTS(
    forecast_length=1,
    frequency='infer',
    prediction_interval=0.95,
    no_negatives=False,
    ensemble='simple',
    constraint=0,
    max_generations=300,
    max_per_model_class=1,
    na_tolerance=0.25,
    validation_method='seasonal 14',
    model_list=model_list,
    models_to_validate=1,
    num_validations=6,
    drop_data_older_than_periods=14,
    n_jobs='auto',
    metric_weighting=metric_weighting,
)

# Build date list for the dataframe index

InitialDate = datetime.datetime(2013, 1, 6)
DateFinal = datetime.datetime(2018, 12, 22)
DaysNumber = (DateFinal - InitialDate).days + 1
Dates = list()
Date = InitialDate
for dia in range(0, DaysNumber):
    Dates.insert(dia, Date)
    Date = Date + timedelta(days=1)

# Load time series data

DFTimeSeries = pd.DataFrame(np.random.rand(len(Dates), 1))
DFTimeSeries.index = Dates

# Training

model = model.fit(DFTimeSeries, result_file='pickle')

I am working with Python 3.8.5 64-bit with Conda

Question: Multi-Objective Forecasting

Problem: A dataset with multiple data columns that may or may not be temporally coupled, and also multiple output sets.
e.g. market sectors and wellbeing index as columns, see if one index is tied to the rest of the indices.

Fitted Values

Hi,

I would like to obtain the fitted values of the best model, is this possible?

Allow user to specify target and features

I cannot understand how AutoTS knows which column is my target. For example, when passing a wide dataframe, how does it know that y is my target and x contains my features for predicting the target?

Add accessibility to subdivision keyword (states, ...) from the holidays package?

Hi,

Is your feature request related to a problem? Please describe.
It would be very nice to also have access to the holidays package "subdiv" keyword, which allows specifying a country's subdivision. Different states can have different holidays, which can influence some time series.

Describe the solution you'd like
Allow for a "holiday_subdiv" list in "create_regressor", which should not be much effort, as the functionality is offered by an already used package.

Additional context
This is my first time commenting on an issue or adding a feature request. I very much appreciate your work!

Looking forward to hearing from you!

Improve 'Simple' Ensemble model selection

Currently it only chooses the best "overall" models and should be expanded to target particular clusters or types of series within multivariate datasets, making it more valuable for inclusion in horizontal ensembles.

Results are not reproducible

I have tried to use this library for my work, but the results are not reproducible. Can you please tell me why this is happening?

WindowRegression is too slow at scale with some parameters

WindowRegression (located in autots.models.sklearn) is the name here for turning a time series into a regression problem by taking "windows" or snapshots of data immediately preceding the forecast target, and then using them as the X/independent variables in the model.

However, the current implementation tries to do a little too much at once. In particular, this occurs with the input_dim and output_dim parameters. Given a "multivariate" input dim and a "forecast_length" output dim, this results in a regression model with a Y/target size of n_series * forecast_length. Since many regression models in turn rely on MultiOutputRegressor to predict such multivariate Ys (which fits a separate model for each Y), this quickly becomes an impossibly huge model to train.

While I think there is value in most of the current parameters, it would probably make the most sense to split this model into two separate models - basically a FastWindowRegression and a SlowWindowRegression (not called that exactly, of course).
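
The size problem described above can be seen in a small NumPy sketch of the windowing idea. This is not the code in autots.models.sklearn; the sizes, the variable names, and the flattening scheme are assumptions chosen for illustration.

```python
import numpy as np

n_series, n_obs = 3, 30
window_size = 5
forecast_length = 4

rng = np.random.default_rng(0)
data = rng.random((n_obs, n_series))  # rows are time steps, columns are series

X_rows, Y_rows = [], []
for start in range(n_obs - window_size - forecast_length + 1):
    # Window of past values across all series, flattened into one feature row.
    window = data[start:start + window_size]
    # The next forecast_length steps across all series, flattened into targets.
    target = data[start + window_size:start + window_size + forecast_length]
    X_rows.append(window.ravel())
    Y_rows.append(target.ravel())

X = np.array(X_rows)
Y = np.array(Y_rows)
print(X.shape, Y.shape)
```

Here `Y` has `n_series * forecast_length` columns, so a wrapper that fits one model per target column would fit 12 models for this toy case, and the count grows linearly with both the number of series and the horizon.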

Great Work!

Hi man,

I can't believe I discovered the project just now! I thought I was the only one scratching my head as to why something like this does not exist. You seem to have progressed further than me; I have been busy with it for just the last week or so. A few of the things I want to do you already seem to have done. The following is what I want to work on:

AtsPy Future Development

1. Additional in-sample validation steps to stop deep learning models from over- and underfitting.
2. Extra performance metrics like MAPE and MAE.
3. Improved methods to select the window length to use in training and calibrating the model.
4. Add the ability to accept dirty data and clean it up (interpolation, etc.).
5. Add a function to resample to a larger frequency for big datasets.
6. Add the ability to algorithmically select a good enough chunk of a large dataset to balance performance and time to train.
7. More internal model optimisation using AIC, BIC and AICc.
8. Code annotations for other developers to follow and improve on the work being done.
9. Force seasonality stability between in- and out-of-sample training models.
10. Make AtsPy less dependency heavy; currently it draws on tensorflow, pytorch and mxnet.

I am going to take a small break from development for the next month to write a paper, but I look forward to learning from your project and including some tidbits of your work in my own. For what it's worth, I am going to make you a collaborator for AtsPy, so if you see anything out of the ordinary don't be afraid to tinker. We might be up to something slightly different: in my project I want to emphasise the "no effort" factor and give users the ability to retrieve the model and access parameters. If our projects converge, then maybe we should think about collaboration.

Regards,
Derek

edit AtsPy - https://github.com/firmai/atspy

Build out pytest tests

Not necessarily comprehensive yet:

  • for some transformers
  • import and export of templates
  • generating forecasts of right size and within x % of data
  • reshape of long to wide
  • and more...

Can't get high VARMAX orders from AutoTS

No matter what value I set for the max_generations argument, the VARMAX orders that I get are always between 0 and 2.
Is there any way to make AutoTS try higher orders for VARMAX models?
I tried various parameters with AutoTS; here is an example of how I create a model and fit it.

from autots import AutoTS

model = AutoTS(
    forecast_length=9,
    frequency='M',
    prediction_interval=0.9,
    model_list=["VARMAX"],
    ensemble="all",
    max_generations=100,
    num_validations=2,
    validation_method="backwards",
    random_seed=7,
    models_to_validate=0.2,
    transformer_list="all",
    transformer_max_depth=8
)
# VARMA_train_data is a wide data set of 6 columns and 54 monthly observations
model.fit(VARMA_train_data)

Forecast horizon issue

trafficTest JupyterNB html.zip
Hi,

My historical training data ran from 01-04-2020 (start) to 01-10-2020 (end): a univariate daily time series of website visitor traffic.

When I tried forecasting a 30-day horizon by setting forecast_length=30 and frequency='infer', I got a forecast for December 2020 (instead of a forecast horizon starting from 02/10/20). I then changed the frequency setting from 'infer' to '1D', ran the predictions again, and observed the same result.

Then I tried setting forecast_length=5 and frequency='MS'. For this, I got a forecast starting from 1/1/2021, as shown below. November and December 2020 were missing.

2021-01-01	465546.5
2021-02-01	61375.0
2021-03-01	39539.5
2021-04-01	33329.0
2021-05-01	33581.0

--
Attached are the traffic dataset and the Jupyter Notebook run output for reference.

--

traffic.zip
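One possible explanation, which the report above does not confirm, is that the day-first dates were read month-first. The sketch below uses only the standard library, and its helper names are hypothetical. It shows how the same text gives two different dates depending on the assumed order, and how a mid-September date could then look like December.

```python
from datetime import datetime


def parse_day_first(text):
    """Read 'DD-MM-YYYY' text, e.g. '01-10-2020' -> 1 October 2020."""
    return datetime.strptime(text, "%d-%m-%Y")


def parse_month_first(text):
    """Read the same text as 'MM-DD-YYYY' instead."""
    return datetime.strptime(text, "%m-%d-%Y")


last_training_date = "01-10-2020"
print(parse_day_first(last_training_date).date())    # 2020-10-01
print(parse_month_first(last_training_date).date())  # 2020-01-10

# A day-first date inside the training range (12 September 2020)
# becomes 9 December 2020 if the fields are swapped.
print(parse_month_first("12-09-2020").date())        # 2020-12-09
```

If this is what happened, parsing the date column with an explicit day-first format before passing the data to the forecaster would be one way to rule it out.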

Expand statsmodels models

Statsmodels has added some new models, and others were never included, such as:
ARDL
UECM
ThetaModel
MarkovRegression
SVAR

Some of these may make valuable additions.

Wide Format

Hello,

Can I create a multiseries multivariate model using the wide format? This feature is available in the long format but not in the wide format. If no such feature is available, do we need to model the series one by one?

My data looks like this:

Store A - Product A (time series data), Product B (time series data), and so on
Store B - follows a similar pattern

I also need to add an external regressor, and my external regressor has more than one variable. One idea is to apply dimension reduction and use the result. Or can I provide multi-dimensional data for the external regressor?

Please help

Make accuracy more visible to users

  1. Allow users to more easily view ongoing accuracy during metrics
  2. Allow users to easily print accuracy of the chosen best model - what the accuracy was in train and validation segments
  3. (potentially) add easy graphic of forecasts or a graph to quickly show max accuracy for each generation.

inspired by #61

Production mode

Assume I have already trained a model and now want to use it in production.
Say my data frequency is daily: every day I receive new daily data and need to predict the next day.
How do I add the new daily data to my model without training it from the beginning?
Can I use my trained model to make a prediction every day?
Or do I have to retrain my model every time I get new daily data and then predict the next day?
Thanks

Question: Probabilistic model

Hi,

You mention in the documentation that:

"Probabilistic forecasts are available for all models, but in many cases are just data-based estimates in lieu of model estimates, so be careful."
upper_forecasts_df = prediction.upper_forecast
lower_forecasts_df = prediction.lower_forecast

If I pass only 'probabilistic' as the model list and run a forecast, does this statement still hold? If not, how can we get probabilistic forecasts from probabilistic models such as TensorFlow Probability? That is, I want to get the 25th and 75th percentile forecasts from the forecast horizon distribution.

Thanks in advance.
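For reference, a central interval maps to a pair of percentiles by simple arithmetic. The sketch below assumes that an interval setting such as prediction_interval describes a central interval with equal probability in each tail. That is an assumption made here for illustration, not something the quoted documentation states, and `central_interval_quantiles` is a hypothetical helper.

```python
def central_interval_quantiles(prediction_interval):
    """Return the (lower, upper) quantiles bounding a central interval.

    prediction_interval: coverage between 0 and 1, split evenly so that
    half of the remaining probability sits in each tail.
    """
    if not 0 < prediction_interval < 1:
        raise ValueError("prediction_interval must be between 0 and 1")
    tail = (1 - prediction_interval) / 2
    return tail, 1 - tail


lower_q, upper_q = central_interval_quantiles(0.5)
print(lower_q, upper_q)  # 0.25 0.75
```

Under that assumption, a coverage of 0.5 corresponds to the 25th and 75th percentiles asked about above.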

how to save the model

Hello,

Can we pickle the model and use it for inference later?

Please suggest

Better model selection for horizontal ensembling

Horizontal 'small' option utilizing fewer models.
Cleanse similar models out first, before horizontal ensembling
Horizontal ensemble run/chosen on first validation (which is subset) -keep this??

Why is the best result returned by the model not the best according to the resulting dataframe?

Hi,

long = True
# df = load_monthly(long=long)

from autots import AutoTS

metric_weighting = {
    "smape_weighting": 5,
    "mae_weighting": 1,
    "rmse_weighting": 0,
    "containment_weighting": 0,
    "runtime_weighting": 0,
    "spl_weighting": 0,
    "contour_weighting": 0,
}

model = AutoTS(
    forecast_length=30,
    frequency="infer",
    prediction_interval=0.9,
    ensemble=None,
    model_list="superfast",
    transformer_list="fast",
    max_generations=15,
    num_validations=5,
    validation_method="backwards",
    metric_weighting=metric_weighting,
)
model = model.fit(
    df,
    date_col="ds" if long else None,
    value_col="y" if long else None,
    id_col="series" if long else None,
)

prediction = model.predict()

forecasts_df = prediction.forecast
# upper and lower forecasts
forecasts_up, forecasts_low = prediction.upper_forecast, prediction.lower_forecast

# accuracy of all tried model results
model_results = model.results()
validation_results = model.results("validation")

After the run, I get the following message: "Initiated AutoTS object with best model: AverageValueNaive". But if I sort the validation_results dataframe by Score or smape, first place is taken by another model, not this one.

Probably, I don't completely understand the way the best model is reported. Thank you in advance!

cannot import name 'AutoTS'

Traceback (most recent call last):
  File "C:/Zero/Python/时间序列比较/autots.py", line 8, in <module>
    from autots import AutoTS, load_daily
  File "C:\Zero\Python\时间序列比较\autots.py", line 8, in <module>
    from autots import AutoTS, load_daily
ImportError: cannot import name 'AutoTS'
