Comments (6)
Hi @julioasotodv, thanks for the detailed example. Somehow the code ran without error on my machine (after I changed cv_horizon to 5). I see you are using an Anaconda environment. Which versions of Python, pandas, and sklearn are you using?
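One way to collect the versions asked about here is a short script built on the standard library's `importlib.metadata` (Python 3.8+). This is only a minimal sketch: the helper name `installed_versions` is made up for illustration, and it assumes the packages were installed under the distribution names `pandas`, `scikit-learn`, and `greykite`.

```python
import platform
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    print("python", platform.python_version())
    # Distribution names below are assumptions about how the packages were installed.
    found = installed_versions(["pandas", "scikit-learn", "greykite"])
    for name, version in found.items():
        print(name, version or "not installed")
```

Pasting the printed lines into the issue makes it easier to compare environments across machines.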
from greykite.
Hi @KaixuYang, after some more testing (also on Mac, as I was using Windows before), it seems that my pandas version was the issue. I was using pandas=1.3.3. After downgrading to pandas 1.2 (for instance, pandas=1.2.5), the issue goes away and the CV results are no longer NaN, so it looks like some pandas operation in greykite\algo\changepoint\adalasso\changepoint_detector.py does not play well with pandas>=1.3.
So downgrading to pandas 1.2 solves the issue, which is great to know.
Shall I keep this issue open, or perhaps open a PR in the meantime to modify setup.py accordingly?
Thanks a lot
Hi @julioasotodv, thanks for the investigation. Actually, the root cause of this issue is that this is a monthly data set: since there are too many potential changepoints, duplicates end up in the columns. Instead of forcing the pandas version to be 1.2, we would like to fix this issue so that it works with pandas 1.3 as well. Could you help us by submitting a PR to fix this? I think a reasonable fix would be the following: at line 230 of greykite.algo.changepoint.adalasso, the changepoints contain duplicates. We want to eliminate the duplicates with the util function greykite.common.python_utils.unique_elements_in_list. Could you test if this resolves the problem? Thanks!
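The deduplication step proposed above can be sketched as follows. This is not greykite's actual `unique_elements_in_list` implementation; `unique_in_order` is a hypothetical stand-in that assumes the intended behavior is to drop repeated entries while keeping first-seen order, and the sample dates are invented for illustration.

```python
def unique_in_order(items):
    """Return the items with duplicates removed, keeping first-seen order."""
    seen = set()
    unique = []
    for item in items:
        if item not in seen:
            seen.add(item)
            unique.append(item)
    return unique


# Invented example: candidate changepoints that landed on the same month.
changepoints = ["2020-01-01", "2020-02-01", "2020-02-01", "2020-03-01", "2020-01-01"]
deduped = unique_in_order(changepoints)
print(deduped)  # ['2020-01-01', '2020-02-01', '2020-03-01']
```

Keeping the original order matters here because the changepoints are later used as column labels, and a plain `set()` would not guarantee a stable ordering.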
Hi @KaixuYang, thank you for reaching out.
I understand the issue. However, as far as I know, there is no change in pandas 1.3 that should produce these two different behaviors.
I will try to debug and search pandas' changelogs before "hardcoding" a greykite.common.python_utils.unique_elements_in_list patch, just to make sure there are no side effects.
Thanks!
Thanks @julioasotodv! Yeah, actually unique_elements_in_list should be a safeguard for the function even for earlier versions of pandas, but we failed to put it there. It would be great to add it so that no duplicated columns are generated.
Fixed in the next release.