mcompetitions / m5-methods

Data, Benchmarks, and methods submitted to the M5 forecasting competition

Languages: Jupyter Notebook 96.23%, Python 3.77%, R 0.01%


m5-methods's Issues

Took a long time to run the benchmarking code

Just curious, how long did it take to run the benchmarking code (assuming it's running in parallel)?

I have been trying to run the code on my Mac desktop (12 cores, 30 GB of memory), but it has been running for almost a day.

iMAPA does not return a forecast

iMAPA does not return a forecast. Should we add a return statement at the end of the iMAPA function, as below?

iMAPA <- function(x, h){
  # mean inter-demand interval, used as the maximum temporal aggregation level
  mal <- round(mean(intervals(x)), 0)
  frc <- NULL
  for (al in 1:mal){
    # aggregate the series in non-overlapping blocks of length al,
    # forecast one step ahead with SES, then disaggregate back to the original frequency
    agg <- as.numeric(na.omit(as.numeric(rollapply(tail(x, (length(x) %/% al) * al), al, FUN = sum, by = al))))
    frc <- rbind(frc, rep(SexpS(agg, 1) / al, h))
  }
  # average the forecasts across aggregation levels and return them
  forecast <- colMeans(frc)
  return(forecast)
}
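For reference, a minimal usage sketch, assuming the helper functions intervals() and SexpS() from the benchmark script are already loaded and that zoo provides rollapply():

library(zoo)   # for rollapply()
# hypothetical intermittent demand history (many zeros)
x  <- c(0, 0, 3, 0, 1, 0, 0, 0, 2, 0, 4, 0, 0, 1, 0, 0, 2, 0, 0, 3)
fc <- iMAPA(x, h = 28)   # 28-step-ahead point forecasts
length(fc)               # 28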

auto.arima model with external variables failed

I used the dataset from Kaggle and tried to reproduce the benchmark results. It took quite a while to run through the forecasts. But it seems to fail when fitting the "auto.arima" model with external variables. Any ideas?

arimax_f <- forecast(auto.arima(insample_top, xreg=as.matrix(head(x_var, length(insample_top)))), h=28, xreg=as.matrix(tail(x_var, 28)))$mean #ARIMA with external variables
Error in auto.arima(insample_top, xreg = as.matrix(head(x_var, length(insample_top)))) :
No suitable ARIMA model found
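One possible workaround, sketched under the assumption that the failure comes from auto.arima() not finding any model that satisfies its constraints with these regressors (for instance because some columns of x_var are constant or collinear over this sample), is to fall back to a model without external variables:

# Minimal sketch, not the repository's code: try the xreg model first, fall back to plain auto.arima()
xreg_in  <- as.matrix(head(x_var, length(insample_top)))
xreg_out <- as.matrix(tail(x_var, 28))
arimax_f <- tryCatch(
  forecast(auto.arima(insample_top, xreg = xreg_in), h = 28, xreg = xreg_out)$mean,
  error = function(e) forecast(auto.arima(insample_top), h = 28)$mean
)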

Rounding of benchmark forecasts?

Hi,

I have a question about the rounding of the statistical benchmarks.
For example, I see that the forecast for the first series under ES_bu is:

[1] 0.9829610 1.1039467 0.9006081 0.9184879 1.0185274 1.3266206
[7] 1.0978010 0.9829610 1.1039467 0.9006081 0.9184879 1.0185274
[13] 1.3266206 1.0978010 0.9829610 1.1039467 0.9006081 0.9184879
[19] 1.0185274 1.3266206 1.0978010 0.9829610 1.1039467 0.9006081
[25] 0.9184879 1.0185274 1.3266206 1.0978010

However, I am not able to find any place in the Point Forecasts - Benchmarks.R file where the forecasts are rounded.
I am assuming that in a business context (since it is Walmart data) we would have to round the numbers, as we cannot use fractional quantities for planning purposes?
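If integer quantities are needed downstream, a simple post-processing step (an assumption on my part; the benchmark script itself does not appear to do this) would be to clip at zero and round:

fc <- c(0.9829610, 1.1039467, 0.9006081, 0.9184879, 1.0185274, 1.3266206, 1.0978010)
round(pmax(fc, 0))   # 1 1 1 1 1 1 1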

Could you tell us when the winning code will be published, in this repo or elsewhere?

As it is written at https://mofc.unic.ac.cy/m5-competition/:
Reproducibility
The prerequisite for winning any prize will be that the code used for generating the forecasts, with the exception of companies providing forecasting services and those claiming proprietary software, will be put on GitHub, not later than 14 days after the end of the competition (i.e., the 14th of July, 2020).

“NAs introduced by coercion”

Hi, when I run the code, the following error appears:
Error in starting_period:nrow(sales_train): NA/NaN argument
Warning message in eval(expr, envir, enclos):
“NAs introduced by coercion”

I think this is the line that causes the problem:
sales_train <- as.numeric(ex_sales[,6:ncol(ex_sales)])

I changed it to:
sales_train <- as.numeric(ex_sales[,7:ncol(ex_sales)])

But I'm not sure if it's ok to make this change.
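A way to avoid hard-coding the column index at all, sketched under the assumption that ex_sales keeps the original column names of the M5 sales file (where the day columns are named d_1, d_2, ...), is to locate the first day column and slice from there; any id columns caught inside the slice are what trigger the "NAs introduced by coercion" warning:

head(colnames(ex_sales), 10)                       # inspect the leading id columns
first_day   <- which(colnames(ex_sales) == "d_1")  # position of the first day column
sales_train <- as.numeric(ex_sales[, first_day:ncol(ex_sales)])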

Thanks and Regards,
John Taco

computation and time needed for benchmarks (and winners' submissions)

Hi there, first of all, many thanks for all the effort and insights resulting from this competition
(I am now deep-diving into the findings paper). Amazing work and contribution!

There is one thing I have been looking for but could not find so far: would it be possible to know, or at least get an idea of, the compute and time needed by the benchmarks and the winning submissions? In practice, it is a relevant dimension for evaluating different approaches.

Example: if I understood correctly, for the exponential smoothing bottom-up benchmark the model was fitted roughly 30k times (the number of time series at the most disaggregated level)? From the code it appears to be done in parallel, but it probably still takes some time.

Would be great to get any info on this.

Thanks!
(The benchmark code I am referring to is https://github.com/Mcompetitions/M5-methods/blob/60829cf13c8688b164a7a2fc8c4832cc216bdbec/validation/Point%20Forecasts%20-%20Benchmarks.R)
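For rough self-measurement, a minimal timing sketch of fitting an exponential smoothing model to every bottom-level series in parallel (this is not the repository's exact code; series_list, the weekly seasonality and the core count are placeholder assumptions):

library(forecast)
library(parallel)

# series_list: a hypothetical list of ~30k bottom-level demand vectors
fit_one <- function(y) forecast(ets(ts(y, frequency = 7)), h = 28)$mean

t0 <- Sys.time()
fcsts <- mclapply(series_list, fit_one, mc.cores = detectCores() - 1)
Sys.time() - t0   # wall-clock time for all bottom-level fits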

Question about the uncertainty prediction part - U3

Hello,

I'm working on the U3 submission and I have some questions about the uncertainty prediction part.

In the documentation, the author says:

"To get predictions for all 9 quantiles (median + 8 other ) we can simply multiply every median to a coefficient ; this coefficient was calculated per level by minimizing loss over the last 28 known days (public LB)."

But I didn't find any function related to minimisation in the scripts; there are only the raw coefficients.
Does anyone know how the minimisation was done? Was it tuned against the public LB evaluation metric?
I also wonder why, in some of the get_ratio() functions, the author goes through log(qs/(1-qs)) with qs = [0.005, 0.025, 0.165, 0.25, 0.5, 0.75, 0.835, 0.975, 0.995].
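For intuition, a minimal sketch of the mechanism described in the quote above. The coefficient values below are made up for illustration and are not the ones in the submission; the only grounded piece is that log(qs/(1-qs)) is the logit of the quantile levels, which is symmetric around the median (0.5 maps to 0):

qs        <- c(0.005, 0.025, 0.165, 0.25, 0.5, 0.75, 0.835, 0.975, 0.995)
log(qs / (1 - qs))                       # logit of the quantile levels, symmetric around 0
median_fc <- c(1.02, 0.98, 1.11)         # hypothetical 3-day median forecasts
coefs     <- c(0.2, 0.4, 0.7, 0.85, 1.0, 1.2, 1.4, 2.0, 2.6)   # made-up per-quantile coefficients
outer(coefs, median_fc)                  # 9 x 3 matrix: one row of quantile forecasts per level of qs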

Thanks in advance :)

Empty benchmark files?

Hi All,

I was just looking to run some of the benchmarks outlined in the competitor guide.

But the 'Point Forecasts - Benchmarks.R' and 'Probabilistic Forecasts - Benchmarks.R' files in the validation folder appear to be empty?

Will these be uploaded anytime soon? It would be really helpful if so!

Best,

Shaheen

GitHub LFS data quota exceeded, can you please resolve this?

Cloning into 'down_Mcompetitions_M5-methods__1616416066'...
remote: Enumerating objects: 537, done.
remote: Counting objects: 100% (537/537), done.
remote: Compressing objects: 100% (422/422), done.
remote: Total 537 (delta 104), reused 527 (delta 104), pack-reused 0
Receiving objects: 100% (537/537), 337.90 MiB | 12.20 MiB/s, done.
Resolving deltas: 100% (104/104), done.
Checking out files: 100% (491/491), done.
Downloading validation/calendar.csv (112 KB)
Error downloading object: validation/calendar.csv (568d0fe): Smudge error: Error downloading validation/calendar.csv (568d0fe5f41790142379698732908e4e57432c1c6396f3f59fb880a9c2b54231): batch response: This repository is over its data quota. Account responsible for LFS bandwidth should purchase more data packs to restore access.

Errors logged to /data/github-download/temp/down_Mcompetitions_M5-methods__1616416066/.git/lfs/objects/logs/20210322T202828.45866023.log
Use `git lfs logs last` to view the log.
error: external filter 'git-lfs filter-process' failed
fatal: validation/calendar.csv: smudge filter lfs failed
warning: Clone succeeded, but checkout failed.
You can inspect what was checked out with 'git status'
and retry the checkout with 'git checkout -f HEAD'
