Comments (12)
+1 on this. I've recently run into the problem.
from scikit-learn.
Then I'll try to play the "Gael Manipulation Trick" (tm):
"If you write the patch, I will do the code review and the merge."
I should have kept my mouth shut... However, I don't need it now, so you might need to find another volunteer.
+1 for class parameter, -1 for 'auto' as default, and +1 for 'Gael manipulation tricks'.
I'm not sure but wouldn't this be inconsistent?
The estimator does not know how many classes there are, right?
So if you initialize with (.2, .8) class weights and give the estimator a 3-class problem, it can only raise an error.
I think this is weird. As far as I understood the sklearn API, constructor parameters don't limit the kind of dataset one can use afterward. Is that not correct?
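To make the concern above concrete, here is a minimal sketch of a check that can only run once `y` is available. The helper `resolve_class_weight` is hypothetical and written for illustration only; it is not scikit-learn's implementation, and it assumes the weights are given as a plain sequence ordered by sorted class label.

```python
def resolve_class_weight(class_weight, y):
    """Map each class label found in y to a weight (hypothetical helper)."""
    classes = sorted(set(y))
    if class_weight is None:
        # No weights given: every class counts equally.
        return {c: 1.0 for c in classes}
    if len(class_weight) != len(classes):
        # The constructor could not have caught this: it never sees y.
        raise ValueError(
            f"got {len(class_weight)} class weights but y has "
            f"{len(classes)} classes"
        )
    return dict(zip(classes, class_weight))


# Two weights, two classes: accepted.
print(resolve_class_weight((0.2, 0.8), [0, 1, 1, 0]))

# Two weights, three classes: the mismatch only shows up at fit time.
try:
    resolve_class_weight((0.2, 0.8), [0, 1, 2, 1])
except ValueError as exc:
    print("error:", exc)
```

Keying the weights by class label (a dict) instead of by position would soften the problem, since labels missing from the dict could fall back to a default weight.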
On Wed, Jan 04, 2012 at 11:41:21AM -0800, Andreas Mueller wrote:
> So if you initialize with (.2, .8) class weights and give the estimator a 3-class problem, it can only raise an error.
> I think this is weird. As far as I understood the sklearn API,
> constructor parameters don't limit the kind of dataset one can use
> afterward. Is that not correct?
Yes. However, 90% of the use cases of class weights are either to set it
to 'auto', or for people who know how many classes they are working with.
The best rule of thumb that I have for deciding if something needs to be
a fit parameter is: do I need to set it to different values at each
fold in a cross-validation. So, applying this rule, a sample weight can
only be a fit parameter, whereas a class weight should be a class
parameter.
The reason for this rule of thumb is practical: it makes it possible to put
models in something like 'GridSearchCV'.
Gael
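The rule of thumb above can be sketched with a toy example. Everything here is hypothetical and for illustration only: `ToyClassifier` and `grid` are not scikit-learn objects, and the loop merely mimics what a grid search does, namely rebuilding the model from constructor arguments for each candidate setting.

```python
from itertools import product


class ToyClassifier:
    """Hypothetical weighted majority-vote classifier; not part of scikit-learn."""

    def __init__(self, class_weight=None):
        # Data-independent setting: the same value is valid for every fold.
        self.class_weight = class_weight

    def fit(self, X, y, sample_weight=None):
        # sample_weight has one entry per row of X, so it changes with each
        # fold and can only be passed at fit time.
        if sample_weight is None:
            sample_weight = [1.0] * len(y)
        cw = self.class_weight or {}
        totals = {}
        for label, w in zip(y, sample_weight):
            totals[label] = totals.get(label, 0.0) + w * cw.get(label, 1.0)
        # Predict the class with the largest total weight.
        self.prediction_ = max(sorted(totals), key=totals.get)
        return self

    def predict(self, X):
        return [self.prediction_] * len(X)


def grid(param_grid):
    """Yield one constructor-kwargs dict per combination in param_grid."""
    keys = sorted(param_grid)
    for values in product(*(param_grid[k] for k in keys)):
        yield dict(zip(keys, values))


X = [[0], [1], [2], [3], [4]]
y = [0, 0, 0, 1, 1]

# A search only needs constructor arguments to build each candidate.
for params in grid({"class_weight": [None, {0: 0.2, 1: 0.8}]}):
    model = ToyClassifier(**params).fit(X, y)
    print(params, "->", model.predict([[5]]))
```

Because `class_weight` lives in the constructor, it can be listed in a parameter grid like any other hyperparameter, while `sample_weight` has to be sliced to match whichever rows end up in each training fold.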
Most often this parameter is called "class_weight". I think it should be "class_weights".
More API deprecation warnings?
> Most often this parameter is called "class_weight". I think it should be "class_weights".
Is it worth breaking the API?
I thought it was sometimes called class_weights. But that was only in a test. If it is consistently 'class_weight' then it's not worth breaking the API.
My bad.
I would stick to class_weight (no s). We usually avoid plurals (e.g.
sklearn.linear_model rather than linear_models).
In my current PR, I moved the parameter from fit to __init__. Do you think it should be present in both?
This is currently the case for SGDClassifier (and Perceptron?). Shouldn't there be only one way ;)
This still has to be done for some linear_models.
Fixed :)