dataiku / dss-plugin-timeseries-forecast

Dataiku DSS plugin to automate time series forecasting with Deep Learning and statistical models 📈

Home Page: https://www.dataiku.com/product/plugins/timeseries-forecast/

License: Apache License 2.0

Makefile 1.70% Python 98.30%

dataiku dss-plugin forecast forecasting time-series gluonts deep-learning arima deepar transformer

dss-plugin-timeseries-forecast's Introduction

Time Series Forecast Plugin (deprecated)

This Dataiku DSS plugin provides recipes to forecast multivariate time series from year to minute frequency with Deep Learning and statistical models.

Documentation: https://www.dataiku.com/product/plugins/timeseries-forecast/

⚠️ Starting with DSS version 11, this plugin is considered deprecated. We recommend using the native time series forecasting features instead.

Release notes

See the changelog for a history of notable changes to this plugin.

License

This plugin is distributed under the Apache License version 2.0.

dss-plugin-timeseries-forecast's Issues

Time-Series Forecast Plug-In - Deep Learning Models Issue - 'Batch Size'

Hi!

This may be a minor bug, or an issue specific to our setup. We downloaded the latest Forecast plugin, version 1.2.0, from the following link: https://cdn.downloads.dataiku.com/public/dss-plugins/timeseries-forecast/

However, with the forecasting mode set to 'Auto-ML', and with all other modes, we get an error saying that the 'batch size' parameter for the Deep Learning models is not recognized, so the plugin run fails.

Is this an inherent plugin issue, or something configurable on our end?

Browser: Chrome
DSS version 9.0.4
Python Version 3.6

Thanks!

Writing training output to HDFS fails with 'Pathname [path] is not a valid DFS filename.'

Describe the bug
Writing training output to HDFS fails with 'Pathname [path] is not a valid DFS filename.'

To Reproduce
Steps to reproduce the behavior:

Select a dataset.
Click the forecast plugin.
Select the training option (number 1).
Select an HDFS connection for the metrics dataset.

Expected behavior
Plugin writes output to specified locations.

Root cause
HDFS doesn't allow colons in paths: https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/filesystem/model.html#Paths_and_Path_Elements

Suggested fix
In dss-plugin-timeseries-forecast/custom-recipes/timeseries-forecast-1-train-evaluate/recipe.py, change line 29 from
session_name = datetime.utcnow().isoformat() + "Z"
to
session_name = datetime.utcnow().isoformat().replace(':', '.') + "Z"
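
The suggested fix can be sketched as a small standalone helper. This is only an illustration, not the plugin's actual code: the function name `hdfs_safe_session_name` is hypothetical, and it assumes the caller passes a timezone-aware datetime (or nothing, in which case the current UTC time is used).

```python
from datetime import datetime, timezone


def hdfs_safe_session_name(now=None):
    """Build a UTC session name that contains no colons (hypothetical helper).

    `now` is assumed to be a timezone-aware datetime; defaults to the current UTC time.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    # Convert to UTC and drop tzinfo so isoformat() does not append "+00:00",
    # which would reintroduce a colon.
    naive_utc = now.astimezone(timezone.utc).replace(tzinfo=None)
    # HDFS rejects colons in path elements, so replace them with dots.
    return naive_utc.isoformat().replace(":", ".") + "Z"


if __name__ == "__main__":
    print(hdfs_safe_session_name(datetime(2021, 3, 4, 5, 6, 7, tzinfo=timezone.utc)))
    # prints 2021-03-04T05.06.07Z
```

The resulting name still sorts chronologically, since only the separator inside the time part changes.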

Feature importance

Is your feature request related to a problem? Please describe.
I am unable to tell which features contribute to my dependent variable.

Describe the solution you'd like
A feature importance chart would help us understand which features to keep in the model and which ones to remove.

Describe alternatives you've considered
No alternatives considered, as I am stuck and don't know how to interpret the results.


Input / Output
- Description of external features dataset: "timeseries identifiers..." > "primary key columns if long format was used"
Model selection
- Performance metric should be MASE by default; it's more common than sMAPE and just as robust
- What does ND mean?
- Typo in "Mean weighted Qauntile Loss"
- Model manual selection: model names should be the same as in the UX of the "Train" recipe, not lowercase
Prediction
- Prediction length should be replaced by "Forecasting horizon".
- I would rather use -1 as a default, 0 looks like an error
- Default quantiles should be 0.1 0.5 0.9 so the user gets "bounds" by default
- Description of "Include history" could be shortened to "Keep historical data in addition to future values"
Misc
- Recipe should be selectable from dataset as well as folder
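
For readers comparing the metrics mentioned in the "Model selection" notes above, here is a minimal sketch of their usual textbook definitions. It is not taken from the plugin's code. In particular, reading "ND" as "normalized deviation" (sum of absolute errors divided by sum of absolute actuals) is an assumption based on how that abbreviation is commonly used in forecasting libraries. The function names and sample values are illustrative only.

```python
def smape(actuals, forecasts):
    """Symmetric MAPE: mean of 2*|y - f| / (|y| + |f|). Undefined if a pair is (0, 0)."""
    terms = [2 * abs(y - f) / (abs(y) + abs(f)) for y, f in zip(actuals, forecasts)]
    return sum(terms) / len(terms)


def mase(actuals, forecasts, history, season=1):
    """Mean absolute scaled error.

    Scales the forecast MAE by the in-sample MAE of a seasonal naive forecast
    computed on `history` (the training part of the series).
    """
    forecast_mae = sum(abs(y - f) for y, f in zip(actuals, forecasts)) / len(actuals)
    naive_errors = [abs(history[t] - history[t - season]) for t in range(season, len(history))]
    return forecast_mae / (sum(naive_errors) / len(naive_errors))


def nd(actuals, forecasts):
    """Normalized deviation (assumed meaning of ND): sum|y - f| / sum|y|."""
    return sum(abs(y - f) for y, f in zip(actuals, forecasts)) / sum(abs(y) for y in actuals)


if __name__ == "__main__":
    history = [8, 9, 10, 12]    # illustrative training values
    actuals = [10, 20, 30]      # illustrative held-out values
    forecasts = [12, 18, 33]
    print("sMAPE:", smape(actuals, forecasts))
    print("MASE: ", mase(actuals, forecasts, history))
    print("ND:   ", nd(actuals, forecasts))
```

MASE is scale-free because it divides by the error of a naive forecast on the same series, which is one reason it is often preferred as a default when series have different magnitudes.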

mae metric

I think it would be good if the plugin also provided the mae metric.
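
Until such a metric is available in the plugin, MAE can be computed from any dataset that holds actual and forecast values side by side. The sketch below assumes two equal-length lists of numbers; the function name is hypothetical and not part of the plugin.

```python
def mean_absolute_error(actuals, forecasts):
    """Mean absolute error: average of |actual - forecast| over all time steps."""
    if not actuals or len(actuals) != len(forecasts):
        raise ValueError("actuals and forecasts must be non-empty and of equal length")
    return sum(abs(a - f) for a, f in zip(actuals, forecasts)) / len(actuals)


if __name__ == "__main__":
    # Illustrative values only.
    print(mean_absolute_error([10, 20, 30], [12, 18, 33]))
```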

Train and evaluate recipe > UX enhancements

Input / Output
- it would be more natural to order outputs like this: (i) Trained model folder (ii) Evaluation results (iii) Evaluation forecasts
- it should be clear which outputs are optional in their label e.g. "Evaluation forecasts (optional)"
- In the "Evaluation results" output, it should be guaranteed to have an "AGGREGATED" line even if there is only one time series. That way the user will be able to build charts / recipes just to look at AGGREGATED performance.
- The "AGGREGATED" values could be put on top of the dataset
- The "Evaluation results" and "Evaluation forecasts (optional)" outputs should have an additional column with the session timestamp
Input Parameters
- Should we get rid of "Time granularity step"? It's hard to explain, and it's almost always 1.
- Should we put Time granularity unit right after Time column?
- Could we rename "Additional columns" into "Long format" and have it display "Primary key columns" if activated?
- Could we put "External features" right after Target columns, without visibility condition
Modeling
- "Prediction length" sounds too technical, I would rather keep the original formulation "Forecasting horizon". It's been proven to work well in the R Forecast plugin.
- After rethinking about it, I think "Forecasting style" should be "AutoML" / "Expert" to match Visual ML....
- ... But this time AutoML will hide all models buttons. Why? Because analysts don't care and don't know how to choose between Baseline, DeepAR, or Transformer. Let's just activate Baseline, FeedForward and DeepAR by default.
- If the user switches to expert, it shows the "activate models" buttons, the keyword arguments MAP parameters, and the Training options
Advanced can now be removed
- "Only evaluation" is too complex - personally I would remove it. The recipe is quite fast as it is, so there's no issue in always retraining by default.
- Number of epochs: I would rather use 10 as default to ensure the default options give reasonably good results.
Misc
- Let's remove the gluonts.model.naive_2 option - it's impossible to understand how it's different from seasonal_naive and the documentation is broken
