Comments (18)

dashaub commented on August 21, 2024

@mhnierhoff Good news: "forecast" v7.3 was released today and includes an argument in forecast.nnetar() to create prediction intervals for nnetar objects. I've modified the code so these are now allowed in the ensemble. Could you grab the development version (0.3.0) of the package from GitHub and see if this fixes your problem?
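
For reference, installing the development version might look roughly like this (a sketch; the pkg/ subdirectory path within the ellisp/forecastHybrid repository is my assumption, so adjust it if the repo layout differs):

# Install the development version of forecastHybrid from GitHub
# (the "pkg" subdirectory path is an assumption about the repo layout)
install.packages("devtools")                          # if not already installed
devtools::install_github("ellisp/forecastHybrid/pkg")
packageVersion("forecastHybrid")                      # should be 0.3.0 or later
packageVersion("forecast")                            # should be 7.3 or later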

ellisp commented on August 21, 2024

That does look odd. Can you provide a reproducible example?

dashaub commented on August 21, 2024

nnetar models currently do not produce prediction intervals (this is, however, on the 2016 roadmap for the "forecast" package; see http://robjhyndman.com/hyndsight/forecast7-part-2/). For now the ensemble's upper prediction interval is constructed by taking the highest component model upper interval, and the lower prediction interval is formed from the lowest component model lower interval. If the forecast from the nnetar component model falls outside the range of these intervals, the problem you describe can occur. It's not exactly a bug, but it certainly isn't a feature. I'm simply accepting that behavior for now until we get proper prediction intervals.

In general you're probably better off leaving weights = "equal" instead of using weights = "insample.errors". Neural net models tend to overfit the training set and produce small in-sample errors, so they will get higher weights than they deserve in the ensemble, and this type of problem will manifest itself more often.
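
For illustration, the max/min combination described above can be sketched outside the package with two ordinary "forecast" models (a minimal example of the idea, not the package's internal code):

library(forecast)

# Two component forecasts with 95% intervals on an arbitrary example series
fc_ets   <- forecast(ets(AirPassengers), h = 12, level = 95)
fc_arima <- forecast(auto.arima(AirPassengers), h = 12, level = 95)

# Combined interval: pointwise max of the upper bounds, min of the lower bounds
upper_combined <- pmax(fc_ets$upper[, 1], fc_arima$upper[, 1])
lower_combined <- pmin(fc_ets$lower[, 1], fc_arima$lower[, 1])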

dashaub commented on August 21, 2024

Also, in issue #19 we're considering ways of adjusting the prediction intervals. While that wouldn't exactly solve this problem, since the nnetar models don't produce prediction intervals, there is probably utility in allowing alternative methods beyond the simple max/min for each prediction interval.

ellisp commented on August 21, 2024

For now, perhaps we should make the upper prediction interval max(prediction interval 1, prediction interval 2, ..., point estimate 1, point estimate 2, ...) and the lower prediction interval the equivalent. That way the interval is at least forced to include the highest and lowest point estimates when a contributing model doesn't have a prediction interval.
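
In terms of the sketch in the previous comment, that would amount to including the component point forecasts in the max/min as well (again only an illustration of the proposal, not package code):

# Force the combined interval to cover every component point forecast too
upper_combined <- pmax(fc_ets$upper[, 1], fc_arima$upper[, 1],
                       fc_ets$mean, fc_arima$mean)
lower_combined <- pmin(fc_ets$lower[, 1], fc_arima$lower[, 1],
                       fc_ets$mean, fc_arima$mean)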

mhnierhoff commented on August 21, 2024

Hi @ellisp & @dashaub, thanks for your quick replies. Although it has now been clarified that this is more or less a bug, here is a reproducible example:

suppressPackageStartupMessages(c(
  library(data.table),
  library(lubridate),
  library(xts),
  library(zoo),
  library(magrittr),
  library(plyr),
  library(dplyr),
  library(forecast),
  library(forecastHybrid),
  library(fma)))

####-1-####
beer_Hybrid_Model_1 <- hybridModel( y = beer,
  models = "ns",
  n.args = list(repeats = 5000, size = 25),
  s.args = list(robust = TRUE, biasadj = TRUE),
  weights = "insample.errors", 
  errorMethod = "MASE")

beer_Hybrid_FC_1 <- forecast(beer_Hybrid_Model_1, h = 18)
beer_Hybrid_Acc_1 <- t(accuracy(beer_Hybrid_FC_1))
beer_Hybrid_Acc_1
        Training set
ME        0.21129899
RMSE      3.28243652
MAE       2.39216998
MPE       0.04648739
MAPE      1.69597399
MASE      0.26053336
ACF1     -0.17526242

Plot V1: [forecast plot image]

####-2-####
beer_Hybrid_Model_2 <- hybridModel(y = beer,
  models = "ns",
  n.args = list(repeats = 5000, size = 25),
  s.args = list(robust = TRUE, biasadj = TRUE))

beer_Hybrid_FC_2 <- forecast(beer_Hybrid_Model_2, h = 18)
beer_Hybrid_Acc_2 <- t(accuracy(beer_Hybrid_FC_2))
beer_Hybrid_Acc_2
        Training set
ME         0.4324798
RMSE       4.1162260
MAE        3.2629172
MPE        0.1657647
MAPE       2.2770645
MASE       0.3553672
ACF1      -0.2103333

Plot V2: [forecast plot image]

####-3-####
beer_Hybrid_Model_3 <- hybridModel(y = beer,
  models = "ns",
  n.args = list(repeats = 5000, size = 25),
  s.args = list(robust = TRUE, biasadj = TRUE),
  weights = "equal")

beer_Hybrid_FC_3 <- forecast(beer_Hybrid_Model_3, h = 18)
beer_Hybrid_Acc_3 <- t(accuracy(beer_Hybrid_FC_3))
beer_Hybrid_Acc_3
        Training set
ME         0.4319815
RMSE       4.1376981
MAE        3.2780442
MPE        0.1645838
MAPE       2.2877603
MASE       0.3570147
ACF1      -0.2111934

Plot V3: [forecast plot image]

As you can see, the point forecast isn't centered between the prediction interval bounds in all three cases.

Thanks again for your work and your commitment. 👍

dashaub commented on August 21, 2024

@ellisp Nice idea

@mhnierhoff I didn't run the first two examples, but the point estimates in the third example look fine to me. Am I missing something?

# All true
beer_Hybrid_FC_3$mean == (beer_Hybrid_Model_3$weights["nnetar"]  * beer_Hybrid_FC_3$nnetar$mean + beer_Hybrid_Model_3$weights["stlm"]  * beer_Hybrid_FC_3$stlm$mean)

dashaub commented on August 21, 2024

@ellisp On second thought, there is a fairly large potential downside to this: package stability and user transparency. In cases such as the models above where the bug occurs, the offending model probably isn't the best component model (since its forecast strongly disagrees with the others), so pulling the overall point estimate closer to the others would probably benefit out-of-sample accuracy. However, I don't like the idea of doing this without documenting the unexpected behavior and at least throwing a warning. Furthermore, once prediction intervals are finally implemented in the "forecast" package, we could remove this temporary measure, but then our point forecasts between the old and new versions would not match, which would be very bad.

It might just be better to accept this idiosyncratic behavior for now. The point forecasts are more important than the prediction intervals, so tampering with the point forecasts seems a little wrong. This scenario also shouldn't arise too often (it's more likely when the nnetar model gets a large weight and few other models are used, as here). Finally, which upper/lower prediction interval do we pull the point forecast to? If we're using the default level = c(80, 95), the offending point forecast would be pulled to a different number than if we use level = 70, level = 80, or fan = TRUE, which is another undesirable characteristic of this approach.

@mhnierhoff Here are the other two models. They look fine too.

# All true
beer_Hybrid_FC_1$mean == (beer_Hybrid_Model_1$weights["nnetar"]  * beer_Hybrid_FC_1$nnetar$mean + beer_Hybrid_Model_1$weights["stlm"]  * beer_Hybrid_FC_1$stlm$mean)

# All true
beer_Hybrid_FC_2$mean == (beer_Hybrid_Model_2$weights["nnetar"]  * beer_Hybrid_FC_2$nnetar$mean + beer_Hybrid_Model_2$weights["stlm"]  * beer_Hybrid_FC_2$stlm$mean)

mhnierhoff commented on August 21, 2024

@dashaub Running the same check for the hybrid model from the first posted screenshot, I get this error message:

Error in NextMethod(.Generic) : cannot assign 'tsp' to zero-length vector

dashaub commented on August 21, 2024

Do the individual left-hand and right-hand sides of the comparison run fine on their own? If the error comes from the right-hand side, which part?
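
For example (a debugging sketch based on the comparison above), you could evaluate each piece on its own to see which one triggers the error:

# Left-hand side: the ensemble point forecast
beer_Hybrid_FC_1$mean

# Right-hand side, piece by piece: each component forecast scaled by its weight
beer_Hybrid_Model_1$weights["nnetar"] * beer_Hybrid_FC_1$nnetar$mean
beer_Hybrid_Model_1$weights["stlm"]   * beer_Hybrid_FC_1$stlm$mean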

mhnierhoff commented on August 21, 2024

@dashaub Sorry, I don't know what you mean by the right and left sides.
Independent of that, I've recalculated the very first forecast after your package update. Again, the point forecast isn't really the mean in my opinion, so I calculated the mean of the 95% upper and lower bounds and added it as a red line to the attached plot.

[Image: forecast plot with the mean of the 95% bounds drawn as a red line]

dashaub commented on August 21, 2024

@mhnierhoff To be clear, as far as I've seen it is the prediction intervals that are at fault here. The mean forecast at each forecast horizon is constructed by taking the individual models' point forecasts and multiplying them by the models' weights. The prediction intervals will have this bug if you use an nnetar model because nnetar currently does not produce a prediction interval. Try running your forecasts again with other component models, e.g. models = "aest", and the intervals should look normal.
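
For example, a variant of the model above without the nnetar component might look like this (a sketch; the object name and the "aest" model set are just illustrative choices):

# "a" = auto.arima, "e" = ets, "s" = stlm, "t" = tbats; no nnetar component
beer_Hybrid_Model_4 <- hybridModel(y = beer,
  models = "aest",
  s.args = list(robust = TRUE, biasadj = TRUE),
  weights = "equal")
beer_Hybrid_FC_4 <- forecast(beer_Hybrid_Model_4, h = 18)
plot(beer_Hybrid_FC_4)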

mhnierhoff commented on August 21, 2024

@dashaub Okay, I got it. But unfortunately I get the best accuracy values when I use a "ns" model. And I have compared the point forecast of a plain stlf model from the "forecast" package with the manually calculated hybrid point forecast, and they are really close.

dashaub commented on August 21, 2024

@mhnierhoff Keep in mind that in-sample errors are usually a poor measure of how the model will forecast in the future, so using withheld data or cross-validation will give a better performance metric; I'd personally never use weights = "insample.errors". If you're already using CV or a withheld test set to select the best model and the prediction interval/point forecast discrepancy still bothers you, you can make the prediction intervals wider by setting level to a larger number, e.g. 99, 99.9, etc., until the point forecast is eventually captured in the intervals. The point forecasts still won't be the "mean" in the sense of sitting in the middle of the prediction intervals, but they will asymptotically converge to that as level approaches 100, and as we know in statistics, asymptotic convergence indicates everything is ok ;)
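
A minimal sketch of such a holdout comparison (the 18-observation split and the "aes" model set are arbitrary choices for illustration):

# Withhold the last 18 observations as a test set
h <- 18
beer_train <- window(beer, end   = time(beer)[length(beer) - h])
beer_test  <- window(beer, start = time(beer)[length(beer) - h + 1])

# Fit on the training data only and score on the withheld observations
mod_holdout <- hybridModel(beer_train, models = "aes", weights = "equal")
fc_holdout  <- forecast(mod_holdout, h = h)
accuracy(fc_holdout, beer_test)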

mhnierhoff commented on August 21, 2024

@dashaub Thanks a lot for your helpful replies. 👍

mhnierhoff commented on August 21, 2024

@dashaub Hi, thanks for the reminder. I tested it with the code above and unfortunately the forecast() call won't finish. I don't get any error message, but it runs seemingly forever.
I've updated to forecast v7.3 and used your dev version.

dashaub commented on August 21, 2024

Rerunning the example above works for me. I changed repeats to 50 since it would be very slow otherwise. Keep in mind that running forecast.nnetar() to create prediction intervals will be slow since it has to run simulations; on my i7 processor this takes about one minute with only 50 repeats. The time should scale roughly linearly with repeats, so repeats = 5000 would take on the order of 100 minutes. Try it again with a smaller repeats value. The results look much better than above.

beer_Hybrid_Model_1 <- hybridModel( y = beer,
  models = "ns",
  n.args = list(repeats = 50, size = 25),
  s.args = list(robust = TRUE, biasadj = TRUE),
  weights = "insample.errors", 
  errorMethod = "MASE")
beer_Hybrid_FC_1 <- forecast(beer_Hybrid_Model_1, h = 18)
plot(beer_Hybrid_FC_1)

[Image: forecast plot of beer_Hybrid_FC_1 with repeats = 50]

mhnierhoff commented on August 21, 2024

Changing from 5000 to 50 repeats also works for me, so everything works now. Thanks a lot for your efforts.
