
Comments (12)

agramfort avatar agramfort commented on May 4, 2024

+1 on this. I've recently faced the problem.

from scikit-learn.

ogrisel avatar ogrisel commented on May 4, 2024

Then I'll try to play the "Gael Manipulation Trick" (tm):

"If you write the patch, I will do the code review and the merge."


agramfort avatar agramfort commented on May 4, 2024

I should have kept my mouth shut ... however, I don't need it now. So you might need to find another volunteer.


GaelVaroquaux avatar GaelVaroquaux commented on May 4, 2024

+1 for class parameter, -1 for 'auto' as default, and +1 for 'Gael manipulation tricks'.


amueller avatar amueller commented on May 4, 2024

I'm not sure but wouldn't this be inconsistent?
The estimator does not know how many classes there are, right?
So if you initialize with (.2, .8) class weights and give the estimator a 3-class problem, it can only raise an error.
I think this is weird. As far as I understood the sklearn API, constructor parameters don't limit the kind of dataset one can use afterward. Is that not correct?
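
The concern above can be made concrete with a minimal sketch. `ToyClassifier` is a hypothetical class, not scikit-learn code, and the length check in `fit` is only an assumption about how such an estimator might react to a mismatch:

```python
class ToyClassifier:
    """Hypothetical estimator used only to illustrate the question above."""

    def __init__(self, class_weight=None):
        # Stored as given; the constructor cannot know the number of classes.
        self.class_weight = class_weight

    def fit(self, X, y):
        classes = sorted(set(y))
        # The mismatch can only be detected once the labels are seen.
        if self.class_weight is not None and len(self.class_weight) != len(classes):
            raise ValueError(
                f"class_weight has {len(self.class_weight)} entries "
                f"but y contains {len(classes)} classes"
            )
        self.classes_ = classes
        return self


clf = ToyClassifier(class_weight=(0.2, 0.8))
clf.fit([[0.0], [1.0]], [0, 1])  # two classes: accepted

try:
    clf.fit([[0.0], [1.0], [2.0]], [0, 1, 2])  # three classes: rejected
except ValueError as exc:
    print(exc)
```

In this sketch the constructor argument does restrict which datasets `fit` will accept, which is exactly the inconsistency being asked about.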


GaelVaroquaux avatar GaelVaroquaux commented on May 4, 2024

On Wed, Jan 04, 2012 at 11:41:21AM -0800, Andreas Mueller wrote:

> So if you initialize with (.2, .8) class weights and give the estimator a 3-class problem, it can only raise an error.
> I think this is weird. As far as I understood the sklearn API,
> constructor parameters don't limit the kind of dataset one can use
> afterward. Is that not correct?

Yes. However, 90% of the use cases of class weights are either to set it
to 'auto', or for people who know how many classes they are working with.

The best rule of thumb that I have for deciding whether something needs to
be a fit parameter is: do I need to set it to different values at each
fold in a cross-validation? Applying this rule, a sample weight can
only be a fit parameter, whereas a class weight should be a class
(constructor) parameter.

The reason for this rule of thumb is practical: it makes it possible to put
models in something like 'GridSearchCV'.

Gael
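
The rule of thumb above can be illustrated with a small, self-contained sketch. Everything here is hypothetical: `WeightedMajority` is a toy estimator and `grid_search` is a hand-rolled stand-in written for this example, not scikit-learn's `GridSearchCV`. It only shows why a per-sample value has to be re-sliced for each fold while a class weight can be fixed at construction time:

```python
from itertools import product


class WeightedMajority:
    """Hypothetical toy estimator: predicts the class with the largest total weight."""

    def __init__(self, class_weight=None):
        # Same value in every fold, so it can live in the constructor.
        self.class_weight = class_weight

    def fit(self, X, y, sample_weight=None):
        # One weight per training row, so it has to arrive with the data.
        if sample_weight is None:
            sample_weight = [1.0] * len(y)
        totals = {}
        for label, w in zip(y, sample_weight):
            cw = 1.0 if self.class_weight is None else self.class_weight.get(label, 1.0)
            totals[label] = totals.get(label, 0.0) + w * cw
        self.prediction_ = max(sorted(totals), key=totals.get)
        return self

    def predict(self, X):
        return [self.prediction_ for _ in X]


def grid_search(estimator_cls, param_grid, X, y, sample_weight, n_folds=2):
    """Tiny stand-in for a grid search; returns (best_params, best_mean_accuracy)."""
    names = sorted(param_grid)
    best = (None, -1.0)
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        scores = []
        for k in range(n_folds):
            test = [i for i in range(len(y)) if i % n_folds == k]
            train = [i for i in range(len(y)) if i % n_folds != k]
            # Constructor parameters are identical in every fold.
            model = estimator_cls(**params)
            # The fit parameter is re-sliced to match each fold's training rows.
            model.fit(
                [X[i] for i in train],
                [y[i] for i in train],
                sample_weight=[sample_weight[i] for i in train],
            )
            pred = model.predict([X[i] for i in test])
            scores.append(sum(p == y[i] for p, i in zip(pred, test)) / len(test))
        mean = sum(scores) / len(scores)
        if mean > best[1]:
            best = (params, mean)
    return best


X = [[0.0]] * 6
y = [0, 0, 0, 0, 1, 1]
grid = {"class_weight": [None, {0: 1.0, 1: 10.0}]}
best_params, best_score = grid_search(WeightedMajority, grid, X, y, [1.0] * len(y))
print(best_params, best_score)
```

Because `class_weight` is a constructor argument, the search loop can treat it like any other hyperparameter; `sample_weight` cannot be handled that way since its length changes with every training split.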


amueller avatar amueller commented on May 4, 2024

Most often this parameter is called "class_weight". I think it should be "class_weights".
More API deprecation warnings?


GaelVaroquaux avatar GaelVaroquaux commented on May 4, 2024

> Most often this parameter is called "class_weight". I think it should be "class_weights".

Is it worth breaking the API?


amueller avatar amueller commented on May 4, 2024

I thought it was sometimes called class_weights. But that was only in a test. If it is consistently 'class_weight' then it's not worth breaking the API.
My bad.


agramfort avatar agramfort commented on May 4, 2024

I would stick to class_weight (no s). We usually avoid plurals (e.g.
sklearn.linear_model rather than linear_models).


amueller avatar amueller commented on May 4, 2024

In my current PR, I moved the parameter from fit to __init__. Do you think it should be present in both?
This is currently the case for SGDClassifier (and Perceptron?). Shouldn't there be only one way ;)

This still has to be done for some linear_models.
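
For readers wondering what "present in both" could look like during a transition, here is one possible sketch. It is an assumption for illustration only, not what the PR does: `ToyEstimator` is hypothetical, and the choice to let a `fit`-time value win while emitting a `DeprecationWarning` is just one option.

```python
import warnings


class ToyEstimator:
    """Hypothetical estimator showing class_weight moved from fit to __init__."""

    def __init__(self, class_weight=None):
        # Preferred location: set once, reused in every call to fit.
        self.class_weight = class_weight

    def fit(self, X, y, class_weight=None):
        if class_weight is not None:
            # Old location kept temporarily; warn so callers can migrate.
            warnings.warn(
                "Passing class_weight to fit is deprecated; "
                "pass it to the constructor instead.",
                DeprecationWarning,
                stacklevel=2,
            )
            effective = class_weight
        else:
            effective = self.class_weight
        # Trailing underscore marks a value resolved during fit.
        self.class_weight_ = effective
        return self


est = ToyEstimator(class_weight={0: 1.0, 1: 5.0}).fit([[0.0], [1.0]], [0, 1])
print(est.class_weight_)
```

Keeping a single location, as the comment above suggests, avoids having to define which value takes precedence at all.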


amueller avatar amueller commented on May 4, 2024

Fixed :)

