
facebookexperimental / robyn


Robyn is an experimental, AI/ML-powered, open-source Marketing Mix Modeling (MMM) package from Meta Marketing Science. Our mission is to democratize modeling knowledge, inspire the industry through innovation, reduce human bias in the modeling process, and build a strong open-source marketing science community.

Home Page: https://facebookexperimental.github.io/Robyn/

License: MIT License

R 9.99% Dockerfile 0.02% JavaScript 0.16% CSS 0.04% MDX 1.94% Python 0.19% Jupyter Notebook 87.67%
marketing-mix-modeling marketing-mix-modelling mmm marketing-science econometrics adstocking cost-response-curve budget-allocation hyperparameter-optimization evolutionary-algorithm

robyn's Introduction

Robyn: Continuous & Semi-Automated MMM

The Open Source Marketing Mix Model Package from Meta Marketing Science


Introduction

  • What is Robyn?: Robyn is an experimental, semi-automated and open-sourced Marketing Mix Modeling (MMM) package from Meta Marketing Science. It uses various machine learning techniques (Ridge regression, a multi-objective evolutionary algorithm for hyperparameter optimization, time-series decomposition for trend & season, gradient-based optimization for budget allocation, clustering, etc.) to define media channel efficiency and effectiveness and to explore adstock rates and saturation curves. It's built for granular datasets with many independent variables and is therefore especially suitable for digital and direct-response advertisers with rich data sources.

  • Why are we doing this?: MMM used to be a resource-intensive technique that was only affordable for "big players". As the privacy needs of the measurement landscape evolve, there's a clear trend of increasing demand for modern MMM as a privacy-safe solution. At Meta Marketing Science, our mission is to help all businesses grow by transforming marketing practices grounded in data and science. Democratizing MMM and making it accessible to advertisers of all sizes is highly aligned with that mission. With Project Robyn, we want to contribute to the measurement landscape, inspire the industry and build a community for exchange and innovation around the future of MMM and Marketing Science in general.

Quick start for R

1. Installing the package

  • Install the latest Robyn package version:
## CRAN VERSION
install.packages("Robyn")

## DEV VERSION
# If you don't have remotes installed yet, first run: install.packages("remotes")
remotes::install_github("facebookexperimental/Robyn/R")
  • If the download is taking too long because of a slow or unstable internet connection and the installation fails, try setting options(timeout=400) before installing.

  • Robyn requires the Python library Nevergrad. If you encounter a Python-related error during installation, please check out the step-by-step guide as well as this issue for more information; a minimal setup sketch follows this list.

  • For Windows, if you get an OpenSSL error, please see the instructions here and here to install and update OpenSSL.
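
A minimal setup sketch for the Nevergrad dependency, assuming a reticulate-managed Python environment (the environment name r-reticulate is illustrative; the official step-by-step guide remains the authoritative reference):

## Install Nevergrad into a Python environment that reticulate can use
install.packages("reticulate")          # skip if already installed
library(reticulate)
virtualenv_create("r-reticulate")       # or conda_create("r-reticulate") for conda users
use_virtualenv("r-reticulate", required = TRUE)
py_install("nevergrad", pip = TRUE)     # afterwards, import("nevergrad") should succeed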

2. Getting started

  • Use the demo.R script as a step-by-step guide intended to cover the most common use cases, and test the package using the simulated dataset provided in the package; a minimal workflow sketch follows this list.

  • Visit our website to explore more details about Project Robyn.

  • Join our public group to exchange ideas with other users and interact with the Robyn team.

  • Take Meta's official Robyn blueprint course online.
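
For orientation, here is a minimal sketch of the workflow that demo.R walks through, using the simulated dataset shipped with the package. Function and argument names follow demo.R but may differ between Robyn versions, and all values are illustrative only:

library(Robyn)

## Simulated dataset and holiday table shipped with the package
data("dt_simulated_weekly")
data("dt_prophet_holidays")

## Collect inputs: data, dependent variable, media variables and adstock type
InputCollect <- robyn_inputs(
  dt_input = dt_simulated_weekly,
  dt_holidays = dt_prophet_holidays,
  date_var = "DATE", dep_var = "revenue", dep_var_type = "revenue",
  prophet_vars = c("trend", "season", "holiday"), prophet_country = "DE",
  paid_media_spends = c("tv_S", "ooh_S", "print_S", "facebook_S", "search_S"),
  paid_media_vars = c("tv_S", "ooh_S", "print_S", "facebook_I", "search_clicks_P"),
  context_vars = "competitor_sales_B",
  adstock = "geometric"
)

## demo.R then continues with: hyper_names() to list the required hyperparameters,
## robyn_inputs() again to add their bounds, robyn_run() to train the models,
## robyn_outputs() to export the one-pagers, and robyn_allocator() for budget allocation.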

Quick start Python (Robyn API for Python beta)

The Robyn API for Python (beta), first released on November 22, 2023, is a plumber-based solution that requires installing the Robyn R package first. Please see the usage guide here.

License

Meta's Robyn is MIT licensed, as found in the LICENSE file.

Contact


robyn's Issues

Unable to use the model_id from previous models

Issue

As mentioned in previous issues' comments, I was trying to use one of the model_ids from previous models since it was providing better results. However, it says "not the best model" and "error: model can't be found". Furthermore, is there any way to recreate the results of one of the previous model runs?

Steps to Reproduce

Tried to set the local bounds to that model's values (fixing all hyperparameters), although I have some confusion about this part. An example would help.

Expected Behavior

Should be able to use that model result and apply it to the budget allocator.

Actual Results

image

Environment

R version (R --version)

Technical paper/documentation

Contributing to FB NextGen MMM R script

Issue

I wanted to ask whether there is a technical paper about Robyn, like the one for Facebook Prophet (here).

What's the main model? How is it created? What are the parameters, etc.?

Or is the only way to understand the modelling process simply to sit down, open the code, and run through it step by step?

I know there's the features tab about e.g. ridge regression, but I struggle to find anything about how each feature, transformation, etc. work together to create the final Robyn model.

Reach equation (Michaelis-Menten)

Hi team, have been reviewing the Robyn methodology behind the reach calculation and I have one suggestion and one question you might find useful:

  1. How is the model taking into account the penetration (max reach) that is inherent to the media channel? e.g. OOH penetration is much higher than Social.
    I think you might need a penetration parameter (%) that varies by channel and multiplies the adstocked media to get the reach, rather than using the 'MediaCostFactor', which tells you nothing about the size of your target audience.

  2. In this line of code:
    nlsStartVal <- list(Vmax = dt_spendModInput[, max(reach)/2], Km = dt_spendModInput[, max(reach)])

Shouldn't it be the other way around, i.e.
Vmax = max(reach) and Km = max(reach)/2,
given how the Michaelis-Menten equation is defined? https://en.wikipedia.org/wiki/Michaelis%E2%80%93Menten_kinetics
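(For reference, the standard Michaelis-Menten form the question refers to is v(S) = Vmax * S / (Km + S), where Vmax is the asymptotic maximum of v as S grows large and Km, expressed in the same units as S, is the value of S at which v reaches Vmax / 2.)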

Thank you!

Make predictions

Question about prediction

Hi guys, is there a predict function or a way to make predictions given the media spend? I know that the f.budgetAllocator function returns optmResponseUnitTotal as a total for each media channel, but what about the intercept, season, holidays, and other variables?

Actionable insights

The graphs provided and the optimisation of spend across channels are great. How would you use this to iterate your media spend over a period of time? Perhaps your creatives change one month and one channel's effectiveness changes (decay and saturation may be more global)? Should you just use a rolling time period for the model dataset?

Search spend

Hi,

Media spend for Search is endogenous as it’s impacted by all other advertising and control variables. How should we estimate the impact of Paid search via this model?

thanks for any help!

Higher intercept value for the outcome

[screenshot]

Hi Team,

Sorry for having too many questions. I used my own data to run the Robyn code, but in the result the intercept is very high (almost 100%) and each variable has a 0% coefficient. I tried 3 different data sets and got almost the same result. Could you please tell me how to fix it? For example, should I change the parameters or the lambda in the function R file?

Best,

in selectChildren(ac[!fin], -1) : error 'No child processes' in select

Hi,

Issue

I have encountered some unexpected errors.

Steps to Reproduce

Please provide instructions that can be used to reproduce your issue. Usually
this will include a test case that produces the wrong output.

Expected Behavior

For the graphs etc., to plot.

Actual Results

Though there were some errors after the 1st trial, Robyn did complete all 40 trials, at which point more 'errors' occurred. Both are pictured below. It does mention NaN values in the data - could this be because one of my variables (the % increase/decrease in retail sales, m-o-m, y-o-y, etc.) occasionally has a negative value? We don't have competitor sales, like in your example, so I've been improvising.

Screenshot 2021-03-27 at 16 10 12

Screenshot 2021-03-27 at 16 10 22

Environment

R Studio

Thanks again,

Lee

Decomposition vs. Prediction

Hi Robyn team,

Thanks for the great package! I have been experimenting with it and seeing positive results. My question is somewhat related to #79. For that issue's resolution, you mentioned that Robyn is supposed to be used as a decomposition tool rather than a prediction tool. I think it would be useful to have some predictive functionality in the model. My questions are the following:

  1. How do we ensure that the model is reliable (i.e. validate the model) and that we can trust its recommendations? In classic ML approaches, we answer this question based on prediction error on hold-out data. In the absence of this predictive functionality in Robyn, what approaches do you recommend? P.S. - This is a critical issue to get buy-in when requesting increased budgets :)

  2. In #79, you mentioned that it is controversial how best to provide a future dataframe for the intercept/trend/season/other baselines. Could you shed some light on that? What are the issues?

Again, thanks for this wonderful package and looking forward to future releases.

Unexpected error on model_output_collect <- f.robyn()

Issue

When we try to execute the model, at the end of the first iteration we find the following error:

Screenshot 2021-04-13 at 17 07 04

Apparently the issue is with the class of the data we are trying to run, because the error says "x is not a data.table or data.frame", but we are not sure about that.

Looking forward to reading your comments.

Environment

R 4.0.3

f.mmmRobyn() not working

Hi,

While running the fb_nextgen_mmm_v20.0.exec.r code:

model_output <- f.mmmRobyn(set_hyperBoundGlobal
                          ,set_iter = set_iter
                          ,set_cores = set_cores
                          ,epochN = Inf # set to Inf to auto-optimise until no optimum found
                          ,optim.sensitivity = 0 # must be from -1 to 1. Higher sensitivity means finding optimum easier
                          ,temp.csv.path = getwd() # output optimisation result for each epoch. Use getwd() to find path
                          )

I'm receiving the following error with traceback. This error shows up after the iteration process finishes.

 Error in { : task 6 failed - "'from' must be a finite number" 
7.
stop(simpleError(msg, call = expr)) 
6.
e$fun(obj, substitute(ex), parent.frame(), e$data) 
5.
foreach(i = 1:iterRS, .export = c("f.adstockGeometric", "f.adstockWeibull", 
    "f.transformation", "f.rsq", "f.decomp", "f.calibrateLift", 
    "f.lambdaRidge", "f.refit"), .packages = c("glmnet", "stringr", 
    "data.table"), .options.snow = opts) %dopar% { ... at fb_nextgen_mmm_v20.0.func.R#755
4.
system.time({
    doparCollect <- foreach(i = 1:iterRS, .export = c("f.adstockGeometric", 
        "f.adstockWeibull", "f.transformation", "f.rsq", "f.decomp", 
        "f.calibrateLift", "f.lambdaRidge", "f.refit"), .packages = c("glmnet",  ... at fb_nextgen_mmm_v20.0.func.R#754
3.
f.mmm(hyper_bound_global, iterRS = set_iter, set_cores = set_cores, 
    out = out) at fb_nextgen_mmm_v20.0.func.R#1003
2.
system.time({
    model_output <- f.mmm(hyper_bound_global, iterRS = set_iter, 
        set_cores = set_cores, out = out)
}) at fb_nextgen_mmm_v20.0.func.R#1002
1.
f.mmmRobyn(set_hyperBoundGlobal, set_iter = set_iter, set_cores = set_cores, 
    epochN = Inf, optim.sensitivity = 0, temp.csv.path = getwd()) 

Can you please help?

f.robyn() syntax error

Hi,

When running the exec code, in the modelling part:

model_output_collect <- f.robyn(set_hyperBoundLocal
                                ,optimizer_name = set_hyperOptimAlgo
                                ,set_trial = set_trial
                                ,set_cores = set_cores
                                ,plot_folder = getwd()) # please set your folder path to save plots. It ends without "/".

I'm receiving this error with traceback right away:

Error in py_module_import(module, convert = convert) : 
  SyntaxError: invalid syntax (typing.py, line 254) 
5.
stop(structure(list(message = "SyntaxError: invalid syntax (typing.py, line 254)", 
    call = py_module_import(module, convert = convert), cppstack = NULL), class = c("Rcpp::exception", 
"C++Error", "error", "condition"))) 
4.
py_module_import(module, convert = convert) 
3.
import("nevergrad") at fb_robyn.func.R#708
2.
f.mmm(set_hyperBoundLocal, set_iter = set_iter, set_cores = set_cores, 
    optimizer_name = optmz) at fb_robyn.func.R#1156
1.
f.robyn(set_hyperBoundLocal, optimizer_name = set_hyperOptimAlgo, 
    set_trial = set_trial, set_cores = set_cores, plot_folder = getwd()) 

Cannot run optimizer

Trying to run the budget allocator function and getting this error:

Error in names(channel_constr_low) <- set_mediaVarName :
'names' attribute [6] must be the same length as the vector [1]

I am giving the function the following:
f.budgetAllocator(modID = "2_16_3",
,optim_algo = "MMA_AUGLAG"
,expected_spend = NULL
,expected_spend_days = NULL
,channel_constr_low = 0.5
,channel_constr_up = 2
,scenario = "max_historical_response"
,maxeval = 100000
,constr_mode = "eq")

If channel_constr_low is a single value, why is the function applying names() to it?

Thanks!

Federico
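
A hedged workaround sketch, assuming (as the allocator's error text in a later report states) that channel_constr_low and channel_constr_up must contain either a single value or one value per entry of set_mediaVarName; the channel names and bounds below are placeholders:

## Supply one lower/upper constraint per media variable to avoid the names() length mismatch
set_mediaVarName <- c("tv_S", "ooh_S", "print_S", "facebook_S", "search_S", "newsletter")
optim_result <- f.budgetAllocator(modID = "2_16_3"
  ,optim_algo = "MMA_AUGLAG"
  ,channel_constr_low = rep(0.5, length(set_mediaVarName)) # same length as set_mediaVarName
  ,channel_constr_up = rep(2, length(set_mediaVarName))
  ,scenario = "max_historical_response"
  ,maxeval = 100000
  ,constr_mode = "eq")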

Using conversions

Is there a way to project (forecast) the conversions based on the budget allocator recommendations?

A summary of what the issue is about.

Is there a way to project (forecast) the conversions based on the budget allocator recommendations? As the budget allocator tells you the initial conversions one can expect, can we replicate that for the next few months? I am looking at the files 1_1_1 reallocated and pareto_aggregated, but I'm having a hard time understanding how mean spend/mean response, initSpendUnitTotal and initResponseUnit are calculated; once I understand that, maybe I can do it myself. I looked at the function code and was able to find some of it, but any input or recommendation is deeply appreciated.

Steps to Reproduce

Please provide instructions that can be used to reproduce your issue. Usually this will include a test case that produces the wrong output.

Expected Behavior

What you expected to happen. For example the error message you expected to see.

Actual Results

What actually happened. For example, an error message you did not expect to see.

Environment

R 4.03

There is a bug when I try Quick start.

Issue

There is a bug when I try Quick start.
I run fb_nextgen_mmm_v20.0.exec.R.
In line 139,
dt_mod <- f.inputWrangling()
I got
Error in f.inputWrangling() : week start has to be Monday or Sunday
In addition: Warning messages:
1: In value[[3L]](cond) : default start value for nls out of range. using c(1,1) instead
2: In value[[3L]](cond) : default start value for nls out of range. using c(1,1) instead

Then I run
dt_mod
I got
Error: object 'dt_mod' not found
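
A hedged workaround sketch (not taken from the Robyn docs): if the dates in the input data fall midweek, aligning each one to the Monday of its week before calling f.inputWrangling() should satisfy the week-start check; dt_input and DATE are placeholders for your data object and date column:

library(lubridate)
## Shift every date to the Monday of its week (week_start = 1 means Monday)
dt_input$DATE <- floor_date(dt_input$DATE, unit = "week", week_start = 1)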

Steps to Reproduce

  • run fb_nextgen_mmm_v20.0.exec.R

Expected Behavior

  • dt_mod is made

Actual Results

  • dt_mod is not made

Environment

R version 3.6.0 (2019-04-26) "Planting of a Tree", platform x86_64-w64-mingw32 (Windows)

Error on AWS EC2

All Socket Connections are in use

Issue

I've created an EC2 instance with 8 cores and 32 GB RAM, and installed all the required packages, Anaconda3, etc. I've assigned 6 cores via set_cores. When I run the model (below), it throws the error below. I've tried a lot of things from every angle.

model_output_collect <- f.robyn(set_hyperBoundLocal
,optimizer_name = set_hyperOptimAlgo
,set_trial = set_trial
,set_cores = set_cores
,plot_folder = "")

Error:

Error in socketAccept(socket = socket, blocking = TRUE, open = "a+b", :
all connections are in use

Can't run f.inputWrangling()

I am trying to run fb_robyn.exec.R and I get this error message in this line of code:

Prepare input data

dt_mod <- f.inputWrangling()

The error message says:

Error in value[3L] : unused argument (cond)

And the traceback:

  1. tryCatchOne(expr, names, parentenv, handlers[[1L]])
  2. tryCatchList(expr, classes, parentenv, handlers)
  3. tryCatch({
    dateCheck <- as.matrix(dt_transform[, set_dateVarName, with = F])
    dateCheck <- as.Date(dateCheck)
    }, error = function() { ... at fb_robyn.func.R#79
  4. f.inputWrangling()

I saw the same error in another issue, but with no solution so I am re-asking. Thanks very much

Any suggestion on synergy analysis?

Hi, sorry to bother you again.

Sometimes we want to analyse cross-channel effects. Can I just add the total spend of multiple channels as another x variable?
Do you have any suggestions on this? Many thanks.

Bias (ie non zero average residuals) for pareto-optimal models with example data

Contributing to FB NextGen MMM R script

Issue

The residual plots in the documentation, and those produced when I run the example locally, show a non-zero average residual for all output models.

Steps to Reproduce

Follow the example in the documentation

Expected Behavior

Ideally, 'optimal models' would have near-zero average error in prediction. This is particularly important for business users that will test the results of a recommended budget allocation against the model's prediction.

With that in mind, the question becomes whether it is necessary, and if so how, to adjust the response coming out of the biased model prior to running the optimization.

Can't run f.inputWrangling()

Issue

I'm trying to run fb_nextgen_mmm_v20.0.exec.R and get this error message:

Error in value[3L] : unused argument (cond)

When showing the Traceback I get:

  1. tryCatchOne(expr, names, parentenv, handlers[[1L]])
  2. tryCatchList(expr, classes, parentenv, handlers)
  3. tryCatch({
    dateCheck <- as.matrix(dt_transform[, set_dateVarName, with = F])
    dateCheck <- as.Date(dateCheck)
    }, error = function() { ...
  4. f.inputWrangling()

Environment

R version (R --version)

Nevergrad

Hi,

Could you please explain the ask and tell commands in the Nevergrad package? I saw you said "Nevergrad allows us to optimize the explore/exploit balance through the ask and tell commands, in order to perform a multi-objective optimization that balances out the Normalized Root Mean Square Error (NRMSE) and the decomp.RSSD ratio (relationship between spend share and channel coefficient decomposition share), providing a set of Pareto-optimal model solutions." What do the ask and tell commands stand for?
Thanks

Best,
Regina
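
For illustration, a hedged sketch of Nevergrad's ask/tell pattern as it can be driven from R via reticulate; the search space and the loss below are toy placeholders, not Robyn's actual hyperparameters or its NRMSE/decomp.RSSD objectives:

library(reticulate)
ng <- import("nevergrad")

## Toy two-dimensional search space bounded to [0, 1]
instrumentation <- ng$p$Array(shape = tuple(2L), lower = 0, upper = 1)
optimizer <- ng$optimizers$registry["DiscreteOnePlusOne"](instrumentation, budget = 100L)

for (i in 1:100) {
  candidate <- optimizer$ask()              # "ask": the optimizer proposes a candidate
  loss <- sum((candidate$value - 0.5)^2)    # evaluate it (toy loss; Robyn reports NRMSE and decomp.RSSD here)
  optimizer$tell(candidate, loss)           # "tell": feed the loss back to guide the next proposals
}
best <- optimizer$provide_recommendation()$value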

Error in qunif(hyppar_for_qunif, min(channelBound), max(channelBound))

Hello,

Thanks for a nice implementation of MMM. I installed it successfully, but when I'm running the provided example, I'm getting this error (R 4.0.4; Win 10)

Running trial nr. 1 out of 40 ...

Running 500 iterations with evolutionary algorithm on geometric adstocking, 15 hyperparameters, 100 -fold ridge x-validation using 6 cores...

Working with: DiscreteOnePlusOne
| | 0%
Error in qunif(hyppar_for_qunif, min(channelBound), max(channelBound)) :
Non-numeric argument to mathematical function
Called from: qunif(hyppar_for_qunif, min(channelBound), max(channelBound))

how to add media channel without spend variable

Issue

I'd like to be able to add a media channel for which only "exposure" data is available, as the channel does not have easily quantifiable variable costs (e.g. acquisition emails / other direct marketing comms to an owned prospect list).

My workaround would be to use the channel exposure as a baseline variable --> downside: not able to apply adstock and diminishing-returns transformations to this channel.

Is there any best-practice/solution you would recommend for this use-case?

Steps to Reproduce

NA

Expected Behavior

NA

Actual Results

NA

Environment

R version (R --version)

Error running f.robyn()

Issue

Error when running model_output_collect: " Error in { : task 1 failed - "'from' must be a finite number"
image

Steps to Reproduce

  • custom dataset
  • str(set_hyperBoundLocal)= list of 33

Expected Behavior

I expect the model to run

Actual Results

see error message above

Environment

R version (R 4.0.3)

Error in f.plotTrainSize(f)

Hi,

Great to see there has been a big update. Hopefully the below is a quick fix? :-)

Issue

There seems to be a bug with the new code.

Steps to Reproduce

I have just tried to run the set code and example data. No changes were made.

Expected Behavior

I expected the model to run.

Actual Results

Unfortunately I got this error;

Error in f.plotTrainSize(f) :
could not find function "f.plotTrainSize"

Environment

R Studio

Thanks,

Lee

Holiday as dummy?

Hi guys, even after including the holidays file you supplied, I am still having big underpredictions in xmas week (xmas peaks are quite noticeable).

I tried adding a dummy variable in xmas week in the baseline group of variables but the model drops it (coef=0).
What is the correct way to add a dummy variable? Or is there a way to make the prophet holidays capture these peaks?

Thanks very much,

F

Convert Single Plot to Two

Issue

Is there a way to split the model one-pager and/or the budget allocator output into multiple plots? I want to look at 10 spending variables, and all of them in a single plot is cluttered. If not, do you have suggestions on how to start working on additional code that could do that?

Unable to run the budget allocator as it gives me warning about coefficients. How can we handle this? Thanks!

Issue

Unable to run the budget allocator as it gives me warning "3 of my coefficients are 0". How can we handle this? Thanks!

A summary of what the issue is about.

Running the last step to create a budget for the marketing channels after choosing the best modID that represents our business. However, even after formatting the data, changing variables and tweaking the code, it's giving me an error that it can't run because it can't find the object:
Error in chnName %in% modNLSCollect$channel :
object 'modNLSCollect' not found

Steps to Reproduce

modNLSCollect$channel
optim_result <- f.budgetAllocator(modID = "1_78_4"
,scenario = "max_historical_response"
,channel_constr_low = c(0.7, 0.75, 0.60, 0.8, 0.75, 0.77, 0.87, 0.9, 0.88,
                        0.78, 0.87, 0.65, 0.88, 0.87)
,channel_constr_up = c(1.5, 1.5, 1.5, 2, 1.5, 1.5, 1.5, 1.5, 2,
                       1.5, 1.5, 1.5, 1.5, 1.5)) # not recommended to

Please provide instructions that can be used to reproduce your issue. Usually
this will include a test case that produces the wrong output.

NA

Expected Behavior

Should run the optimizer

Actual Results

Please see image.
image

What actually happened. For example, an error message you did not expect to see.
Please see image.
image

Environment

R version (R 4.04)

how to calculate avg effect and spend share?

Hi, thanks for your great work.
Here is my question:
It seems that the avg spend share is not the real spend share (spend by channel / total spend).
Can you share the formula used to calculate avg effect and spend share? Thanks.

Model can't find some baseline variables in the data

Issue

I have a few variables which are exposure/impact data, such as whether a person was affected by Covid or not, which is a factor (0 = No, 1 = Yes). The same applies to another variable indicating whether a website launch had exposure or not. At present I am injecting them as baseline variables and setting the training data split to 0.60, where the Bhattacharyya coefficient is the highest in my data. However, in the last step, while running the model, it seems that it can't find these columns. I even tried changing their names to match the data naming convention, but it still gets stuck on columns it can't find. Please see the screenshot attached.

Steps to Reproduce

image

Expected Behavior

Should run the model as expected

Actual Results

image

Environment

R version (R 4.04)

Please tell me the definition of the columns of de_simulated_data.csv

This is a question, not an issue. I'm sorry if I shouldn't ask a question here; if so, please point me to the right place.

background

I want to know what data is needed for an analysis like the quick start.

question

Please tell me the definition of the columns of de_simulated_data.csv.
I think it is the following; is it correct?

column_name description
DATE The day the row was observed
revenue revenue of the day
tv_S Investment in that channel
ooh_S Investment in that channel
print_S Investment in that channel
facebook_I Investment in that channel
search_clicks_P what is this?
competitor_sales_B what is this?
facebook_S  what is this?

I would like to do this analysis on the data of the company I belong to. :)

Deprecating normalization of decayed independent variables

Hello! I was checking out this code (and have been very impressed!). I noticed that for the f.transformation function, "step 2: normalize decayed independent variable" has been deprecated. Would you be able to shed some light on this? Thank you!

ROI scale difference in Conversion and Revenue

Issue

Why are the ROI values very different for revenue vs. conversion? I assumed it is a calculation comparing effect share and spend share. I see similar values for spend vs. effect share, but the mean ROI is 0.005 for conversion, while it's 1.2 for revenue, with the same spend variables.

How to resume learning?

Issue

I accidentally stopped running in the middle of learning. Is there a way to resume learning?

Steps to Reproduce

I kept learning for about 16 hours and accidentally stopped it ;(
image

Expected Behavior

I don't want to waste these 16 hours. Is there a way to resume learning?

In other words, is there a way to restart training from parameters that have already been partially learned?
It would be like resuming the training of a neural network from the middle.

Issue

Hi,

Hope you are doing well,

When I run the code to "dt_mod <- f.inputWrangling()", there is an error that "Error in nlsModel(formula, mf, start, wts) :
singular gradient matrix at initial parameter estimates". Could you please tell me how to fix it and reason? Thank you so much

Best,

Media excluded in optimiser because their coefficients are 0

Issue

I would like to know how I can avoid zero coefficients in the training process.

Steps to Reproduce

My hyperparameters are:

image

Expected Behavior

I would like to have non-zero coefficients for Google search brand or other media channels.

Actual Results

image

Environment

R version (R --version)

Issue with building wheel and installing nevergrad and dependencies

Issue

Error due to reticulate not finding the right Python version and therefore not installing the bayesian-optimization dependency and nevergrad into the proper Python version.

Steps to Reproduce

When running the following:

model_output_collect <- f.robyn(set_hyperBoundLocal
                                ,optimizer_name = set_hyperOptimAlgo
                                ,set_trial = set_trial
                                ,set_cores = set_cores
                                ,plot_folder = paste(script_path,"chart", sep = ''))

Expected Behavior

import("nevergrad") succeeds, with the bayesian-optimization dependency also installed.

Actual Results

WARNING: Building wheel for bayesian-optimization failed: [Errno 13] Permission denied: '/Users/stefanoalli/Library/Caches/pip/wheels/14'
Installing collected packages: numpy, threadpoolctl, scipy, joblib, scikit-learn, typing-extensions, cma, bayesian-optimization, nevergrad
   Running setup.py install for bayesian-optimization: started
   Running setup.py install for bayesian-optimization: finished with status 'done'
 DEPRECATION: bayesian-optimization was installed using the legacy 'setup.py install' method, because a wheel could not be built for it. A possible replacement is to fix the wheel build issue reported above. You can find discussion regarding this at https://github.com/pypa/pip/issues/8368.

Environment

R version 4.0.3 (2020-10-10)
Macbook pro

Error "all connections are in use" during training of quickstart example

Issue

When trying the quickstart example with the latest version, I get an error during the processing.
I've seen something similar once and it was related to parallel computing. Hence, I tried setting numCores to 1. That didn't help here.

Steps to Reproduce

Running the Quickstart example without changes.

Expected Behavior

No error.

Actual Results

Running trial nr. 1 out of 40 ...

Running 500 iterations with evolutionary algorithm on geometric adstocking, 15 hyperparameters, 100 -fold ridge x-validation using 1 cores...
C:\Users\a1474c5\Software\ANACON1\envs\R-RETI1\lib\site-packages\nevergrad\parametrization\data.py:257: UserWarning: Bounds are 1.0 sigma away from each other at the closest, you should aim for at least 3 for better quality.
warnings.warn(

Working with: DiscreteOnePlusOne
|===================================== | 25%

Error in file(con, "w") : all connections are in use

file(con, "w")
8.
writeLines(input, f)
7.
system(cmd, wait = FALSE, input = "")
6.
makePSOCKcluster(names = spec, ...)
5.
makeCluster(cores)
4.
registerDoParallel(cores = set_cores) at fb_robyn.func.R#972
3.
system.time({
for (lng in 1:iterNG) {
nevergrad_hp <- list()
nevergrad_hp_val <- list() ... at fb_robyn.func.R#925
2.
f.mmm(set_hyperBoundLocal, set_iter = set_iter, set_cores = set_cores,
optimizer_name = optmz) at fb_robyn.func.R#1314
1.
f.robyn(set_hyperBoundLocal, optimizer_name = set_hyperOptimAlgo,
set_trial = set_trial, set_cores = set_cores, plot_folder = "../reports/figures")

Environment

sessionInfo()
R version 4.0.4 (2021-02-15)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] grid parallel stats graphics grDevices utils datasets methods base

other attached packages:
[1] rstudioapi_0.13 reticulate_1.18 rPref_1.3 minpack.lm_1.2-1 nloptr_1.2.2.2
[6] PerformanceAnalytics_2.0.4 xts_0.12.1 zoo_1.8-8 see_0.6.2 ggpubr_0.4.0
[11] gridExtra_2.3 ggplot2_3.3.3 prophet_0.6.1 rlang_0.4.10 Rcpp_1.0.6
[16] StanHeaders_2.21.0-7 car_3.0-10 carData_3.0-4 glmnet_4.1-1 Matrix_1.3-2
[21] doParallel_1.0.16 iterators_1.0.13 foreach_1.5.1 lubridate_1.7.10 stringr_1.4.0
[26] data.table_1.14.0

loaded via a namespace (and not attached):
[1] matrixStats_0.58.0 insight_0.13.1 rstan_2.21.2 tools_4.0.4 backports_1.2.1 utf8_1.1.4 R6_2.5.0 DBI_1.1.1
[9] lazyeval_0.2.2 colorspace_2.0-0 withr_2.4.1 prettyunits_1.1.1 tidyselect_1.1.0 processx_3.4.5 curl_4.3 compiler_4.0.4
[17] cli_2.3.1 bayestestR_0.8.2 scales_1.1.1 quadprog_1.5-8 ggridges_0.5.3 callr_3.5.1 rappdirs_0.3.3 foreign_0.8-81
[25] rio_0.5.26 extraDistr_1.9.1 pkgconfig_2.0.3 readxl_1.3.1 shape_1.4.5 generics_0.1.0 jsonlite_1.7.2 dplyr_1.0.4
[33] zip_2.1.1 inline_0.3.17 magrittr_2.0.1 loo_2.4.1 parameters_0.12.0 munsell_0.5.0 fansi_0.4.2 abind_1.4-5
[41] lifecycle_1.0.0 stringi_1.5.3 pkgbuild_1.2.0 plyr_1.8.6 forcats_0.5.1 crayon_1.4.1 lattice_0.20-41 haven_2.3.1
[49] splines_4.0.4 hms_1.0.0 ps_1.6.0 pillar_1.5.0 igraph_1.2.6 ggsignif_0.6.1 effectsize_0.4.3 codetools_0.2-18
[57] stats4_4.0.4 glue_1.4.2 V8_3.4.0 RcppParallel_5.0.3 vctrs_0.3.6 cellranger_1.1.0 gtable_0.3.0 purrr_0.3.4
[65] tidyr_1.1.2 assertthat_0.2.1 openxlsx_4.2.3 broom_0.7.5 rstatix_0.7.0 survival_3.2-7 tibble_3.0.6 ellipsis_0.3.1

Error in f.budgetAllocator(modID = "3_14_4", scenario = "max_historical_response", :

Hi again :-)

Issue

I have tested Robyn with one of my own clients' data (admittedly it's lacking important variables). Robyn completes the 40 trials with no problem, but then struggles with the budget allocator step.

Steps to Reproduce

Same setup as Github code, just with my own data (columns for fb spend and imp, search spend and clicks, and content spend, as well as a variable column for the Covid infection rate in Norway, which I had to hand).

Expected Behavior

I expected the budget optimiser to push out some slides.

Actual Results

I got the below:

Running budget allocator for model ID 3_14_4 ...
Error in f.budgetAllocator(modID = "3_14_4", scenario = "max_historical_response", :
channel_constr_low & channel_constr_up have to contain either only 1 value or have same length as set_mediaVarName

Environment

R Studio

Thanks again,

lee

Ridge Regression - where to apply priors?

Hi Team, regarding ridge regression, I was wondering if it's possible to choose what prior (weight) to apply to each of the variables? It would be useful to force in variables that the model dropped (coef = 0). Thanks
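
Not a Robyn feature, but for illustration, a hedged sketch of the glmnet-level knobs that play this role in a ridge fit: penalty.factor down-weights or removes the penalty for chosen variables, and lower.limits enforces the non-negative sign constraint. The data below are toy placeholders:

library(glmnet)
set.seed(1)
## Toy design matrix with three media columns (placeholder names) and a response
x <- matrix(abs(rnorm(300)), ncol = 3, dimnames = list(NULL, c("tv_S", "search_S", "facebook_S")))
y <- rnorm(100, mean = 2 + 0.5 * x[, 1])

## alpha = 0 -> ridge; lower.limits = 0 -> non-negative coefficients;
## penalty.factor = 0 removes the penalty on tv_S, so it is far less likely
## to be shrunk towards zero than the other two columns
fit <- glmnet(x, y, alpha = 0, lower.limits = 0, penalty.factor = c(0, 1, 1))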

'f.mmmCollect' error

Hi,

Hope you're well.

Issue

I'm running Robyn with my own data. This is the 'same' as the example data, i.e. a CSV with date, revenue and data from 3 digital channels (with spend/impression numbers), plus a variable column for unemployment rates. Unfortunately I am getting this error message after no further hyperparameter optimisations can be found:

Error in f.mmmCollect(set_hyperBoundLocal) : object 'epochN' not found

Steps to Reproduce

To confirm, I have not altered the code in any way, but the below is turned on (though I get the same error regardless):

  • activate_hyperBoundLocalTuning <- T # change setChannelBounds = T when setting bounds for each media individually

However, the below is turned off for now, as a single epoch was taking 3 hours. With the example data, this can be turned on and all the graphs and workings were plotted, after multiple epochs, after a few hours:

  • activate_calibration <- F # Switch to TRUE to calibrate model. This takes longer as extra validation is required

Expected Behavior

I expected it to plot all the chosen graphs, as well as the optimiser, like when I tested the example data.

Environment

RStudio

Hope you can help!

Thanks,
Lee

Is there a way to make the charts interactive?

Issue

Due to variation in the data values, some charts are hard to read. Is there a way to make them interactive, or to format the charts so that some values are easier on the eyes?

Steps to Reproduce

Please see the screenshot below.

Expected Behavior

Should be able to interact with the chart or focus on a specific variable.

Actual Results

What actually happened. For example, an error message you did not expect to see.
image

Environment

R version (R --version)

Questions

Question:

Could you please explain why the model includes both spend and non-spend variables together? I can understand putting all the spend variables or all the non-spend variables in the model, but I cannot understand using both of them. Can you please explain this for me? Thanks

Unable to run f.inputWrangling()

Within fb_robyn.exec.R I am getting this error message while trying to run

dt_mod <- f.inputWrangling()

The error message says:

Error in loadNamespace(name) : there is no package called ‘rstan’

Error traceback:

13.stop(cond)
12.doWithOneRestart(return(expr), restart)
11.withOneRestart(expr, restarts[[1L]])
10.withRestarts(stop(cond), retry_loadNamespace = function() NULL)
9.loadNamespace(name)
8.getNamespace(ns)
7.asNamespace(ns)
6.getExportedValue(pkg, name)
5.rstan::optimizing
4.do.call(rstan::optimizing, args)
3.fit.prophet(m, df, ...)
2.prophet(recurrance, holidays = if (use_holiday) {
holidays[country == set_country]
} else {
NULL ... at fb_robyn.func.R#471
1.f.inputWrangling()

Is the rstan library required? I didn't see it in the library list.
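
A hedged note based on the traceback above: the prophet step calls rstan::optimizing, so rstan does need to be present even though Robyn does not list it directly; installing it explicitly may resolve the error:

install.packages("rstan")   # prophet's fitting step calls rstan::optimizing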

Feature Idea: log-link models

Contributing to FB NextGen MMM R script

Issue

The current decomp function only allows for a linear GLM. Log-link models would be particularly relevant for modelling count data (i.e. with granular enough data on conversion events, we might get enough zeros to want to use a Poisson regression).

Steps to Reproduce

Follow the documentation

Likely steps to add the function

At a minimum, the decomp function needs to be rewritten to account for logged models. I'm unsure how deep into the optimization functions the changes would have to reach. It seems like the sign-constrained, regularized estimation can still be done with the glmnet() function.
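
Not current Robyn behaviour, but as a hedged sketch of the estimation piece described above, glmnet can fit a sign-constrained, regularized Poisson (log-link) model directly; the data below are toy placeholders:

library(glmnet)
set.seed(1)
## Toy media design matrix and a count outcome generated on the log scale
x <- matrix(abs(rnorm(500)), ncol = 5)
y <- rpois(100, lambda = exp(0.2 + x %*% c(0.3, 0.1, 0, 0.2, 0)))

## alpha = 0 -> ridge penalty; lower.limits = 0 -> non-negative media coefficients;
## family = "poisson" gives the log link, so coefficients act multiplicatively on the response
fit <- cv.glmnet(x, y, family = "poisson", alpha = 0, lower.limits = 0)
coef(fit, s = "lambda.min")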

R version (R --version)
