Comments (41)
@laresbernardo that works, thanks for the fix!
from robyn.
I am also getting the same error while refreshing the model:
I think the issue is with the robyn_chain function, as the "json_new$InputCollect$refreshSourceID" part returns NULL.
Maybe it is not able to read the correct file.
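A quick way to verify this is to inspect the exported JSON directly. A minimal sketch, assuming the jsonlite package is available and that json_file points at your exported model JSON (the file name below is a placeholder):

```r
library(jsonlite)

# Placeholder path: point this at the model JSON you pass to robyn_refresh()
json_file <- "RobynModel-2_69_2.json"
json_new <- read_json(json_file)

# If this prints NULL, robyn_chain() has no source model to link back to,
# which matches the behavior described above
print(json_new$InputCollect$refreshSourceID)
```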
There are a couple of bugs I've found in the refresh functionality, and I'm working on fixes on my end. Can you try updating Robyn with Robyn::robyn_update(ref = "bl01") and see how it goes? If it goes well, I can merge to main afterwards.
Your second screenshot might be caused by too few iterations for the refresh. Can you increase refresh_iters and retry? For the other errors, I'm waiting for the results of Bernardo's fixes first.
@shuvayan @bart-vanvlerken we are planning on landing a CRAN stable version this week. Please let us know ASAP if the issue persists with the version on the branch I shared, so we can move forward or hold it.
@laresbernardo @gufengzhou I'm going to try Bernardo's latest version today and will let you know! Thanks for all the help.
Update. I updated to Bernardo's latest version, and the refresh functionality works again.
However, I'm running into the same error: 'Provided train_size but ts_validation = FALSE. Time series validation inactive.' when I set ts_validation = TRUE in my base model. In addition, I'm getting two new peculiar warning messages (3 & 4, see screenshot below).
In addition, the output tells me it selected 2_69_2 as optimal model and exported it as JSON, but I'm not seeing this model as a onepager in the output folder. When trying to produce the onepager with robyn_csv(), I'm getting the following error:
Thanks for confirming, @bart-vanvlerken. So these are all warnings, not errors, which is good because you can still refresh the model. About the warnings:
- Intended: The first warning refers to the ts_validation parameter, which uses the train_size hyperparameter. If the original model had ts_validation = FALSE, then the refreshed model does as well.
- Intended: The second warning is based on the calibration constraints and recommends more iterations.
- Intended: We haven't been able to replicate this, but when the structure changes, as when you refresh a model and store it somewhere other than the original model's folder, you can get these warnings. It will still work but won't have the origin information. I can see in the logs that the rf1 folder was created correctly, but it's not inside an original model's folder.
- May or may not be intended: probably a consequence of the previous point (no chain, so that data.frame can't be created). We could get rid of this one with a quick fix. Checking the origin in:
Robyn:::refresh_plots_json()
By default, Robyn creates one-pagers for the models with the lowest combined errors per cluster. The default selected model on refresh is the one with the lowest DECOMP.RSSD error (NRMSE is not used in the criteria), which will not necessarily match. We could improve this behavior in future versions, but it's not exactly an error. You can recreate any one-pager with robyn_onepagers() ;)
About the file, did you check for the CSV file created and shown in the log in that specific folder? Wasn't it created?
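The workaround mentioned above can be sketched as follows, assuming InputCollect and OutputCollect come from the refresh run and that the model ID matches the one reported in your log (both are assumptions here):

```r
# Recreate the one-pager for the selected refresh model manually.
# "2_69_2" is the model ID from the log above; replace it with yours.
robyn_onepagers(
  InputCollect = InputCollect,
  OutputCollect = OutputCollect,
  select_model = "2_69_2",
  export = TRUE
)
```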
Hi @laresbernardo, you're right that they are warnings and not errors. However, I did specify ts_validation = TRUE in my base model (and the exported JSON). So I think it's strange that robyn_refresh() does not recognize this, when in the current CRAN version this is not an issue.
Regarding the use of robyn_csv() instead of robyn_onepagers(), that was my bad! Thank you for clearing that up :)
Regarding the exported one-pagers in the refresh: it exported four one-pagers, and the optimal one (which was also exported as CSV) was not among them.
Alright @bart-vanvlerken, would you mind updating to my branch once again and retrying? I've just deployed a fix for the ts_validation issue.
On the one-pagers, as I mentioned before:
By default, Robyn creates one-pagers for the models with the lowest combined errors per cluster. The default selected model on refresh is the one with the lowest DECOMP.RSSD error (NRMSE is not used in the criteria), which will not necessarily match. We could improve this behavior in future versions, but it's not exactly an error. You can recreate any one-pager with robyn_onepagers() ;)
I'll try to add the selected refresh model to the list of exported one-pagers in addition to the clustering soon, and will let you know in this thread.
Hi @laresbernardo, I updated and tested. The ts_validation issue is now resolved! However, the 'optimal' model chosen by the refresh function is still not exported as a onepager. I wanted to recreate this myself since the JSON file was exported correctly. However, when running robyn_recreate() I'm running into the following error:
Alright @bart-vanvlerken, would you mind updating Robyn to my branch (bl01) and retrying? You should get the one-pager for the winning refresh model created by default.
Also, I've probably fixed the "penalty" hyperparameters check issue. Please confirm.
Hello @laresbernardo, I am getting the error below after updating to your branch:
Recreating model
Imported JSON file successfully: C:/Users/SD/Documents/Robyn_Modular/Robyn_202405021221_init/RobynModel-models.json
>> Running feature engineering...
NOTE: potential improvement on splitting channels for better exposure fitting. Threshold (Minimum R2) = 0.8
Check: InputCollect$modNLS$plots outputs
Weak relationship for: "Audio_i", "Cable_i", "CTV_Hulu_i", "Display_i", "META_i", "Radio_i", "SA360_i", "Snapchat_i", "TikTok_i", "TV_i", "YouTube_i" and their spend
Error in UseMethod("select") :
no applicable method for 'select' applied to an object of class "NULL"
I am using the code below for the model refresh:
RobynRefresh <- robyn_refresh(
json_file = json_file,
dt_input = mmm_input_ufpf,
dt_holidays = dt_prophet_holidays,
refresh_steps = 7,
refresh_iters = 1000, # 1k is an estimation
refresh_trials = 1
)
@shuvayan Can't replicate your issue. Maybe it's related to the impressions variables. Would you mind sharing your mmm_input_ufpf CSV and the json file via email to me? It's my user @ gmail.com.
@bart-vanvlerken is it running OK for you?
@laresbernardo , I have shared the email, please check!
Hi @laresbernardo, I used your latest version and the one-pager is exported correctly. However, I'm still not able to reproduce the model with the exported JSON. It gives me the following error:
In addition, I question the decision to optimize refreshed models based on DECOMP.RSSD, since it comes at a clear cost to model accuracy, as you can see below.
Finally, I see quite a discrepancy in ROIs between the base model...
... and the refreshed model, despite only adding 1 new observation to the data.
This is quite difficult to communicate to stakeholders, I hope you can have a look at this as well. Thanks for all your help so far!
Imported JSON file successfully: C:/Users/SD/Documents/Robyn_Modular/Robyn_202405021221_init/RobynModel-models.json
@shuvayan it seems you're trying to use a JSON file that is NOT a model. RobynModel-models.json is not a valid JSON file to recreate a single model; it contains the whole iterations process. Let me check, though, whether I can replicate the issue.
In addition, I question the decision to optimize refreshed models based on DECOMP.RSSD, since it comes at a clear cost to model accuracy, as you can see below.
Finally, I see quite a discrepancy in ROIs between the base model ... and the refreshed model, despite only adding 1 new observation to the data.
@bart-vanvlerken would you mind opening new threads to discuss these? I may agree with you, but I'd like for @gufengzhou to explain why he decided to change from the combined minimum error to only DECOMP.RSSD and provide some more context.
Also, if you'd like to share with me the model's JSON file and CSV to replicate the issues, please send it so I can try and replicate your issue with robyn_recreate().
Hi @bart-vanvlerken , I've just tested refresh with 4 new datapoints. It looks better than what you showed. You only used 200 refresh_iters, right? That's probably not enough. I used 1k. Regarding decomposition, please look at report_decomposition.png, not the default onepager.
Regarding objectives, refresh still uses both NRMSE and DECOMP.RSSD to optimise. Only the final automated winner selection will rely on decomp, because the challenge of refresh is often the too big changes in decomp.
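The selection rule described above could be sketched like this, as a non-authoritative illustration; it assumes OutputCollect$resultHypParam exposes solID and decomp.rssd columns for the Pareto candidates, which may differ by version:

```r
library(dplyr)

# Candidates are optimized on both NRMSE and DECOMP.RSSD during refresh;
# the automated winner is then the candidate with the lowest DECOMP.RSSD,
# per the explanation above.
winner <- OutputCollect$resultHypParam %>%
  arrange(decomp.rssd) %>%
  slice(1) %>%
  pull(solID)
```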
Hi @gufengzhou, that is odd. In the screenshots above I actually used 3 x 2000 iterations, so convergence was not an issue. However, it still did not converge, perhaps because I based my refresh on a calibrated model (so MAPE was active as well). Could that also be the reason we're seeing such big differences in model fit?
I only see the left [0] visualization in my report_decomposition.png file for some reason.
@laresbernardo Here is my JSON file, it's basically a configuration that aims to test most functionalities that Robyn has to offer.
RobynModel-1_1020_3.json
And the corresponding CSVs, basically the data that comes with the Robyn package:
clean_data.csv
clean_prophet.csv
(...) However, I'm still not able to reproduce the model with the exported JSON. It gives me the following error:
@bart-vanvlerken I can't seem to reproduce your issue recreating a model. As you can see, with your JSON and CSV I was able to recreate the model with the latest version in branch "bl01":
csv <- read.csv("~/Desktop/clean_data.csv")
json_file <- "~/Desktop/RobynModel-1_1020_3.json"
temp <- Robyn::robyn_recreate(json_file, dt_input = csv)
>>> Recreating 1_1020_3
Imported JSON file successfully: ~/Desktop/RobynModel-1_1020_3.json
>> Running feature engineering...
Input data has 208 weeks in total: 2015-11-23 to 2019-11-11
Initial model is built on rolling window of 104 week: 2016-11-21 to 2018-11-12
>>> Calculating response curves for all models' media variables (5)...
Successfully recreated model ID: 1_1020_3
Warning messages:
1: In check_calibration(dt_input, date_var, calibration_input, dayInterval, :
Your calibration's spend (42,148) for facebook_S between 2018-05-01 and 2018-06-10 does not match your dt_input spend (~14.05K). Please, check again your dates or split your media inputs into separate media channels.
2: In check_calibration(dt_input, date_var, calibration_input, dayInterval, :
Your calibration's spend (2,841) for tv_S between 2018-04-03 and 2018-06-03 does not match your dt_input spend (~947). Please, check again your dates or split your media inputs into separate media channels.
3: In check_calibration(dt_input, date_var, calibration_input, dayInterval, :
Your calibration's spend (67,039) for facebook_S+search_S between 2018-07-01 and 2018-07-20 does not match your dt_input spend (~22.35K). Please, check again your dates or split your media inputs into separate media channels.
@laresbernardo I'm sorry, I shared the base model with you (which works fine for me as well). Could you try to reproduce the refreshed model instead? That's what gave me the error. Here is the JSON:
RobynModel-1_168_7.json
Alright, now I was able to find the problem and fix it. Thanks.
Would you mind updating to "bl01" again and retrying? It was an issue when recreating a model that used the penalties parameter, @bart-vanvlerken.
I guess the only pending issue in this thread is:
I only see the left [0] visualization in my report_decomposition.png file for some reason.
@gufengzhou would you mind checking this one? Are you able to reproduce it? Is it because it doesn't find the original models, maybe? Perhaps we should include all past models' information in the refreshed models (JSON) instead of only the model IDs / chain? I'd leave this improvement as a backlog task for now.
Changes are ready to land in the main branch. For the report_decomposition.png issue, I'd suggest we open a new, clean thread. FYI: if an error occurs during the creation of this file, the pipeline won't crash the results, given it's wrapped with try().
Pull Ref: #969 @gufengzhou -> pending review
Landed in main: v3.10.7. Please update and retest. If no issues are reported in a couple of weeks, we will release this version as the latest stable version on CRAN. Thanks for the feedback @bart-vanvlerken @shuvayan
Hi @laresbernardo, I just tested refreshing a model on 3.10.7 and I'm getting the following error:
Here is my data so you can reproduce:
RobynModel-3_185_8.json
clean_data_refresh.csv
clean_prophet.csv
Hi @bart-vanvlerken, thanks for sharing your files. I was able to replicate the error and have fixed the issue on my end. Would you mind testing with branch "bl02" now and confirming whether it's running as expected for you? Update with robyn_update(ref = "bl02"), refresh your R session, and retry. After that, we can merge into the main branch again as v3.10.7.9000 in a new PR. Thanks!
Hi @laresbernardo the refresh functionality works, but even with one new observation of data it's generating completely different ROAS figures than the base model:
Interestingly, the report_decomposition.png paints a different picture, but the ROAS metrics in this visual do not correspond with the base model that was built.
Hi, regarding the interpretation of refresh: the second plot is related to the file report_aggregated.csv. The concept behind refresh is that it should maintain a stable baseline compared to the initial model while reflecting the changes in the new data. You should use report_aggregated.csv and report_decomposition.png to report the refresh results.
Let's say you add 4 weeks: refresh will then try to find the best fit and the smallest decomp error for the added 4 weeks, while keeping the refresh baseline similar to the initial model. That means, in the second plot above, 2_128_7 [0] is from the initial model (initial modeling window) and 1_450_5 [1] is from the 4 refresh weeks.
The "normal" one-pagers always refer to the entire modeling window, which is not exactly relevant for refresh. Hope that makes sense.
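For reporting along these lines, the aggregated results can be read straight from the CSV. A hedged sketch, assuming report_aggregated.csv sits in the refresh output folder and carries solID and variable columns (the exact path and column names below are assumptions and may differ by version):

```r
library(dplyr)

# Placeholder path: point this at your refresh output folder
report <- read.csv("Robyn_refresh_output/report_aggregated.csv")

# Compare the initial model [0] against the refresh window [1] per channel;
# any_of() keeps whichever of these metric columns your version exports
report %>%
  select(solID, variable, any_of(c("roi_total", "xDecompPerc"))) %>%
  arrange(variable, solID)
```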
Hi @gufengzhou, thanks for your swift reply! There are two things confusing to me that I hope you can clear up:
- You mention the plot below is from the modeling window of the initial model; then why are the ROAS numbers different from the initial model's one-pager (which also reports results over the modeling window)?
- You mention the plot below is from the 4 refresh weeks of the refreshed model; then why are the ROAS numbers identical to the one-pager of the refresh (which reports results over the entire modeling window)?
Hi, I just spent some time looking through the code base and checked a refresh case. I'm using the latest GitHub version. So far, the ROAS of each model is identical in the one-pagers and the report decomposition PNG.
Your result definitely doesn't look right. What caught my eye is that in this comment of yours, the report_decomposition.png plot shows the initial model 2_128_7 [0] on the right side and the 1st refresh on the left side. This shouldn't be the case; I'm not sure what caused it for you. You can see in my example below that the initial model is always on the left side. Maybe you can run a quick new model with the latest version (no need for full iterations), then refresh it to see if this is still the case?
However, you're right about one thing. The ROAS of the 1st refresh (1_71_1 in my case) in report_decomposition.png shouldn't be identical to its one-pager. Currently, both numbers are the ROAS of the media across the entire refresh modeling window, while we actually want the ROAS for the new periods only in report_decomposition.png. We'll see that this gets fixed. Also, FYI: unlike the ROAS, the effect share (or decomposition) of 1_71_1 in report_decomposition.png is indeed reporting the new period, NOT the entire refresh modeling window.
Hi @gufengzhou, great that you will work on a fix! How strange that your ROAS figures correspond and mine don't! I tried again using the latest GitHub main branch, and the visualization issue also persists, unfortunately.
Here is the model and (demo) data used to come to my findings (note that clean_data_refresh is the full dt_simulated_weekly dataset, where I used a trimmed version to build the base model).
RobynModel-1_119_3.json
clean_data_refresh.csv
clean_prophet.csv
Here is some more additional information:
In addition, I'm getting the following output at the end of the refresh that might have something to do with it:
I found the issue. You're using both ts_validation = TRUE and add_penalty = TRUE, and there's a bug that doesn't pick up the train_size and the penalties when recreating the model.
I've rerun a job with ts_validation and penalty both TRUE, then exported the JSON and recreated it using the following code. Now the results are identical, which was not the case before.
library(dplyr)
library(jsonlite)

json_path <- "/Users/gufengzhou/Desktop/Robyn_202405141527_init/RobynModel-1_136_7.json"
RobynRecreated <- robyn_recreate(
json_file = json_path,
dt_input = dt_simulated_weekly,
dt_holidays = dt_prophet_holidays,
quiet = FALSE)
InputCollectX <- RobynRecreated$InputCollect
OutputCollectX <- RobynRecreated$OutputCollect
get_json <- read_json(json_path)
get_json_tab <- bind_rows(sapply(get_json$ExportedModel$summary, function(x) as.data.frame(t(as.matrix(unlist(x))))))
OutputCollectX$xDecompAgg %>%
select(solID, rn, xDecompAgg) %>%
left_join(
select(get_json_tab, rn = variable, xDecompAgg_json = decompAgg) %>%
mutate(xDecompAgg_json = as.numeric(xDecompAgg_json)),
by = "rn")
Can you please check? It's on the branch fix_recreate_with_penalty.
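Given the rounding differences discussed later in the thread, it may be safer to compare with an explicit tolerance rather than eyeballing exact equality. A sketch reusing the objects built in the script above (OutputCollectX and get_json_tab); the 1e-4 relative tolerance is an arbitrary choice, not a package default:

```r
compared <- OutputCollectX$xDecompAgg %>%
  select(solID, rn, xDecompAgg) %>%
  left_join(
    select(get_json_tab, rn = variable, xDecompAgg_json = decompAgg) %>%
      mutate(xDecompAgg_json = as.numeric(xDecompAgg_json)),
    by = "rn")

# TRUE if every recreated decomposition value matches the JSON within tolerance
all(abs(compared$xDecompAgg - compared$xDecompAgg_json) /
      pmax(abs(compared$xDecompAgg), 1e-9) < 1e-4, na.rm = TRUE)
```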
Hi @gufengzhou, I've updated to the branch you mentioned, and I noticed that the refresh function does not work on recreated models: when I use the JSON file generated by recreating the base model (as done in the demo script), it gives me the following error:
The JSON of the base model itself works, which implies that these files, representing the same model, are in some way different.
Unfortunately, the main issue is not solved for me either: in report_decomposition.png the visualizations are still swapped, and the ROAS figures do not correspond with the one-pager of the base model. Could it be something else?
This is very strange. I've just tested robyn_refresh again, and it works; see my screenshot. The report_decomposition.png also looks correct. Are you 100% certain you've got the right branch?
Also, can you verify if this script gives you the identical result?
I also see that you're getting the "Must provide 'hyperparameters' in robyn_inputs..." error again; I'm not sure why I don't have it. I do notice that whenever I use your JSON to test robyn_recreate, I don't get identical decomp between the JSON and the recreated model. But when I use JSONs exported from the latest package version, I get identical results. I think if you can get identical results from the script mentioned above, then we're one step closer.
@gufengzhou I will try your solution. In the meantime, I'd like to send you my R code so you can try to reproduce the issue; can I email it to you?
sure. [email protected]
@gufengzhou I'm positive I've used your version, as you can see below:
Your script gives me identical results, but I'm afraid we are comparing different things: you are comparing the base model JSON with the recreated model, while I was comparing the base model JSON with the recreated base model's JSON. To produce the recreated base model JSON, I used the alternative approach mentioned in the demo script (since robyn_recreate() does not do so automatically):
The model JSON generated by this approach does not match, as you can see below, and generates an error when used in robyn_refresh():
Have you had any luck reproducing my errors with the code I sent you over email? I hope we can resolve this!
Alright... you're obviously a very thorough person :) You're looking at differences in the 5th digit; I'm going to count this one as matched. There are several rounding steps that I don't even remember which might explain the discrepancy, but I'm quite certain this is acceptable for the vast majority of use cases. I did another comparison across the original object, original JSON, recreated object, and recreated JSON; I'd count them as identical. I'll check your refresh error a bit later.
That would be great, because a client is currently unable to refresh their model (on either the CRAN version or the dev version). I've sent you the data + JSON over email so you can inspect them at your convenience!
After a quick check, I can't recreate the same model result from the JSON in your email (before we even get to the refresh issue). I noticed the JSON was created with a previous version, so I suspect that is the cause. Would you please remodel it with the latest version, select a candidate comparable to your old one, and then use the new JSON to check recreate & refresh? Please understand that this package is still under constant development, so backwards compatibility is not always possible, even though we try.