dataiku / dss-plugin-timeseries-forecast

Dataiku DSS plugin to automate time series forecasting with Deep Learning and statistical models 📈

Home Page: https://www.dataiku.com/product/plugins/timeseries-forecast/

License: Apache License 2.0

Makefile 1.70% Python 98.30%
dataiku dss-plugin forecast forecasting time-series gluonts deep-learning arima deepar transformer

dss-plugin-timeseries-forecast's Introduction

Time Series Forecast Plugin (deprecated)

This Dataiku DSS plugin provides recipes to forecast multivariate time series from year to minute frequency with Deep Learning and statistical models.

Documentation: https://www.dataiku.com/product/plugins/timeseries-forecast/

โš ๏ธ Starting with DSS version 11 this plugin is considered as "deprecated", we recommend using the native time series forecasting features.

Release notes

See the changelog for a history of notable changes to this plugin.

License

This plugin is distributed under the Apache License version 2.0.

dss-plugin-timeseries-forecast's Issues

Time-Series Forecast Plug-In - Deep Learning Models Issue - 'Batch Size'

Hi!

This is perhaps just a minor bug or issue that we are facing. We downloaded the latest Forecast plugin, version 1.2.0, from the following link: https://cdn.downloads.dataiku.com/public/dss-plugins/timeseries-forecast/

However, with the forecasting mode set to 'Auto-ML', and in all other modes, we encounter an error in which the 'batch size' parameter for the Deep Learning models is not recognized, so the plugin run fails.

Is this an inherent plugin issue, or something configurable on our end?

  • Browser: Chrome
  • DSS version 9.0.4
  • Python Version 3.6

Thanks!

Writing training output to HDFS fails with 'Pathname [path] is not a valid DFS filename.'

Describe the bug
Writing training output to HDFS fails with 'Pathname [path] is not a valid DFS filename.'

To Reproduce
Steps to reproduce the behavior:

  1. Select a dataset.
  2. Click the forecast plugin.
  3. Select the training option (number 1).
  4. Select an HDFS connection for the metrics dataset.

Expected behavior
Plugin writes output to specified locations.

Root cause
HDFS doesn't allow colons in paths: https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/filesystem/model.html#Paths_and_Path_Elements

Suggested fix
In dss-plugin-timeseries-forecast/custom-recipes/timeseries-forecast-1-train-evaluate/recipe.py, change line 29 from
session_name = datetime.utcnow().isoformat() + "Z"
to
session_name = datetime.utcnow().isoformat().replace(':', '.') + "Z"
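
The suggested fix can be checked in isolation. Below is a minimal sketch, assuming only what the issue states (the session name is a UTC ISO 8601 timestamp with a trailing "Z"). The helper name `make_session_name` is hypothetical and not part of the plugin, which builds the string inline.

```python
from datetime import datetime, timezone


def make_session_name(now=None):
    """Return a UTC timestamp usable as an HDFS path element (no colons).

    Hypothetical helper for illustration only.
    """
    if now is None:
        # Naive UTC time, equivalent to the datetime.utcnow() call in the issue.
        now = datetime.now(timezone.utc).replace(tzinfo=None)
    # isoformat() yields e.g. 2021-03-04T05:06:07.000008; the colons are
    # what HDFS rejects, so they are swapped for dots.
    return now.isoformat().replace(":", ".") + "Z"


name = make_session_name(datetime(2021, 3, 4, 5, 6, 7, 8))
print(name)  # 2021-03-04T05.06.07.000008Z
```

Replacing the colons keeps the timestamp sortable as a string, so session folders still list in chronological order.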

Feature importance

Is your feature request related to a problem? Please describe.
I am unable to tell which features contribute to my dependent variable.

Describe the solution you'd like
A feature importance chart would help us understand which features to keep in the model and which ones to remove.

Describe alternatives you've considered
No alternatives considered, as I am stuck and don't know how to interpret the results.

Forecast future values > UX enhancements

  • Input / Output
    • Description of external features dataset: "timeseries identifiers..." > "primary key columns if long format was used"
  • Model selection
    • Performance metric should be MASE by default, it's more common than sMAPE, and as robust
    • What does ND mean?
    • Typo in "Mean weighted Qauntile Loss"
    • Model manual selection: model names should be the same as in the UX of the "Train" recipe, not lowercase
  • Prediction
    • Prediction length should be replaced by "Forecasting horizon".
    • I would rather use -1 as a default, 0 looks like an error
    • Default quantiles should be 0.1 0.5 0.9 so the user gets "bounds" by default
    • Description of "Include history" could be shortened to "Keep historical data in addition to future values"
  • Misc
    • Recipe should be selectable from dataset as well as folder
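
For readers comparing the metrics named under "Model selection", here is a minimal sketch of their usual textbook definitions in plain Python. It assumes that ND stands for Normalized Deviation, as in common forecasting evaluators. The function names and the toy numbers are illustrative, and the plugin's own implementation may differ in details such as seasonality handling.

```python
def nd(actual, forecast):
    """Normalized Deviation: sum of absolute errors / sum of absolute actuals."""
    abs_error = sum(abs(a - f) for a, f in zip(actual, forecast))
    return abs_error / sum(abs(a) for a in actual)


def smape(actual, forecast):
    """Symmetric MAPE: mean of 2|a - f| / (|a| + |f|)."""
    terms = [2 * abs(a - f) / (abs(a) + abs(f)) for a, f in zip(actual, forecast)]
    return sum(terms) / len(terms)


def mase(actual, forecast, history, season=1):
    """MASE: forecast MAE scaled by the in-sample seasonal-naive MAE."""
    naive_errors = [
        abs(history[i] - history[i - season]) for i in range(season, len(history))
    ]
    scale = sum(naive_errors) / len(naive_errors)
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    return mae / scale


history = [10, 12, 14, 16]  # training values (toy data)
actual = [18, 20]           # held-out values
forecast = [17, 22]         # model predictions

print(mase(actual, forecast, history))  # 0.75
print(nd(actual, forecast))
print(smape(actual, forecast))
```

A MASE below 1 means the forecast beats the naive baseline on the training scale, which is one reason it is easy to interpret as a default.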

mae metric

I think it would be good if the plugin also provided the mae metric.
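
As a point of reference for this request, here is a minimal sketch of the mean absolute error, assuming the standard definition (average of absolute differences between actual and forecast values). The function name `mean_absolute_error` is illustrative and not taken from the plugin.

```python
def mean_absolute_error(actual, forecast):
    """Average absolute difference, expressed in the units of the target."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and equally long")
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


print(mean_absolute_error([18, 20], [17, 22]))  # 1.5
```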

Train and evaluate recipe > UX enhancements

  • Input / Output
    • it would be more natural to order outputs like this: (i) Trained model folder (ii) Evaluation results (iii) Evaluation forecasts
    • it should be clear which outputs are optional in their label e.g. "Evaluation forecasts (optional)"
    • In the "Evaluation results" output, it should be guaranteed to have an "AGGREGATED" line even if there is only one time series. That way the user will be able to build charts / recipes just to look at AGGREGATED performance.
    • The "AGGREGATED" values could be put on top of the dataset
    • The "Evaluation results" and "Evaluation forecasts (optional)" outputs should have an additional column with the session timestamp
  • Input Parameters
    • Should we get rid of "Time granularity step"? It's hard to explain, and it's almost always 1.
    • Should we put Time granularity unit right after Time column?
    • Could we rename "Additional columns" into "Long format" and have it display "Primary key columns" if activated?
    • Could we put "External features" right after Target columns, without a visibility condition?
  • Modeling
    • "Prediction length" sounds too technical, I would rather keep the original formulation "Forecasting horizon". It's been proven to work well in the R Forecast plugin.
    • After rethinking about it, I think "Forecasting style" should be "AutoML" / "Expert" to match Visual ML....
    • ... But this time AutoML will hide all models buttons. Why? Because analysts don't care and don't know how to choose between Baseline, DeepAR, or Transformer. Let's just activate Baseline, FeedForward and DeepAR by default.
    • If the user switches to expert, it shows the "activate models" buttons, the kwargs MAP parameters and the Training options
  • Advanced can now be removed
    • "Only evaluation" is too complex. Personally I would remove it. The recipe is quite fast as it is, so there's no issue in always retraining by default.
    • Number of epochs: I would rather use 10 as default to ensure the default options give reasonably good results.
  • Misc
    • Let's remove the gluonts.model.naive_2 option - it's impossible to understand how it's different from seasonal_naive and the documentation is broken
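
The "AGGREGATED" row and session timestamp requests in the Input / Output items can be sketched as follows. This is an illustration under stated assumptions, not the plugin's code: the column names `series_id` and `session`, the helper name `add_aggregated_row`, and the choice of a plain mean as the aggregation are all hypothetical.

```python
from statistics import mean


def add_aggregated_row(rows, metrics, session, id_column="series_id"):
    """Put an AGGREGATED row on top and tag every row with the session.

    The AGGREGATED row is always produced, even for a single time series.
    Averaging per-series metrics is an assumption made for this sketch.
    """
    aggregated = {id_column: "AGGREGATED"}
    for metric in metrics:
        aggregated[metric] = mean(row[metric] for row in rows)
    # AGGREGATED goes first so it appears at the top of the dataset.
    return [dict(row, session=session) for row in [aggregated] + rows]


per_series = [{"series_id": "store_1", "mase": 0.75, "smape": 0.08}]
result = add_aggregated_row(per_series, ["mase", "smape"], "2021-03-04T05.06.07Z")
for row in result:
    print(row)
```

With this shape, a chart or downstream recipe can filter on the AGGREGATED value regardless of how many series were trained.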
