Comments (6)

rebeccabilbro commented on May 16, 2024

https://github.com/DistrictDataLabs/yellowbrick/blob/master/yellowbrick/base.py

from yellowbrick.

rebeccabilbro commented on May 16, 2024

@bbengfort per your question, in Scikit-learn, TransformerMixin does not appear to extend BaseEstimator

rebeccabilbro commented on May 16, 2024

@bbengfort ok, preliminary implementation here: https://github.com/DistrictDataLabs/yellowbrick/blob/develop/yellowbrick/base.py#L27

bbengfort commented on May 16, 2024

@rebeccabilbro I think that we can close this issue based on your implementation and evolve it as needed after the following suggestions/discussion points:

  • create test cases so that we can add tests in the future to this class
  • discuss changing _draw to draw (the latter being my preference)
  • discuss hooks to the figure/axes in __init__ (See pcoords.py#L81)

For point number three: can we allow the user to pass an axes hook called ax, or else use plt.gca() to get the current axes? I don't think yb needs a hook to the Figure backend.
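
One way the axes hook could look, as a rough sketch rather than the actual yellowbrick code: the class name `Visualizer` and the private `_ax` attribute are assumptions made for illustration. The caller's `ax` is used when given; otherwise `plt.gca()` is consulted lazily, the first time the axes are needed.

```python
class Visualizer:
    """Hypothetical stand-in for a base visualizer (not yellowbrick's own class)."""

    def __init__(self, ax=None):
        # Keep whatever axes the caller passed; None means "decide later".
        self._ax = ax

    @property
    def ax(self):
        # Fall back to matplotlib's current axes only if none was supplied.
        if self._ax is None:
            import matplotlib.pyplot as plt  # imported lazily, on first use
            self._ax = plt.gca()
        return self._ax
```

Deferring the `plt.gca()` call means that constructing a visualizer does not create a figure as a side effect, and no Figure hook is needed at all.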

rebeccabilbro commented on May 16, 2024

Ok @bbengfort -
For _draw, that syntax was initially your suggestion, and so far that method is weakly internal, following PEP 8. Are you thinking it will change?

I think having the fig/ax hooks in __init__ will work and is already implemented for the regressors; I'll just need to do a bit of refactoring for the classifiers. We can make that a new issue for technical debt and tag me.

For the test cases, can you explain what you mean and need from me?

bbengfort commented on May 16, 2024

draw

In the original comment, it was mostly just a smell; something felt more right about draw() than _draw(). After thinking about it some more, I think I've homed in on an answer. First, you're absolutely right: the reason I suggested _draw in the first place was that this seemed like an internal method that shouldn't be called by the user.

However, I'm thinking that draw might be the most critical part of the Visualizer, as that's what Visualizers do: they draw things. And even though the user will call poof() to see what's drawn, there is an implicit contract that something will be drawn by the Scikit-Learn integration. This led me to thinking about users creating or customizing their own Visualizers; just as we might extend a Transformer and modify its transform method, I'm beginning to think that users should extend a Visualizer and modify its draw (and possibly poof) methods. Therefore, because this is part of the extensible API, it should be draw, not _draw.

As an example, my feature transformers will be calling draw either on transform or fit, depending on what data they need. This allows the user to conduct some transformations (normalization is required before Parallel Coordinates and RadViz) but also plot the data correctly. Subclasses that do different things can isolate draw from these semantics, making it easier and more flexible to add smaller changes.
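
To make the extension idea concrete, here is a minimal sketch of draw as a public override point. All class names below (`Visualizer`, `FeatureVisualizer`, `RowCountVisualizer`) are hypothetical and kept free of plotting dependencies; the toy subclass "draws" by recording what it was asked to plot.

```python
class Visualizer:
    """Hypothetical base class; not the actual yellowbrick.base code."""

    def fit(self, X, y=None):
        return self

    def draw(self, X):
        # Public on purpose: subclasses are expected to override this.
        raise NotImplementedError("subclasses must implement draw()")

    def poof(self):
        # A real visualizer would render or show the figure here.
        raise NotImplementedError("subclasses must implement poof()")


class FeatureVisualizer(Visualizer):
    """Hypothetical transformer-style visualizer that draws during transform."""

    def transform(self, X):
        self.draw(X)  # drawing is hooked into the scikit-learn style call
        return X      # the data itself passes through unchanged


class RowCountVisualizer(FeatureVisualizer):
    """Toy subclass: overrides draw (and poof) without touching transform."""

    def __init__(self):
        self.drawn = []

    def draw(self, X):
        self.drawn.append(len(X))

    def poof(self):
        return "drew {} batch(es)".format(len(self.drawn))


viz = RowCountVisualizer()
out = viz.fit([[1, 2], [3, 4]]).transform([[1, 2], [3, 4]])
```

Because the subclass only replaces draw and poof, the fit/transform semantics stay in one place, which is the separation described above.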

fig/ax hooks

Ok, let's make a new issue, integrate the change, and then I'll refactor the FeatureVisualizers as well! I think we're heading toward a good API!

tests

Just a few simple things here (there isn't a whole lot to test when it comes to base classes):

  • create a test module (tests/test_base.py)
  • import everything from yellowbrick.base
  • create a BaseTests that extends unittest.TestCase (with docstring)
  • potentially add a few simple tests, like assertRaises for NotImplementedErrors
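
The bullets above might translate into something like the following sketch of tests/test_base.py. So that it runs on its own, the `Visualizer` class here is a stand-in for what `from yellowbrick.base import *` would provide, and the assumption that its draw method raises NotImplementedError is mine, not confirmed by the thread.

```python
import unittest


# Stand-in for `from yellowbrick.base import *` (hypothetical base class).
class Visualizer:
    def draw(self):
        raise NotImplementedError("subclasses must implement draw()")


class BaseTests(unittest.TestCase):
    """Smoke tests for the base visualizer classes."""

    def test_draw_not_implemented(self):
        # The base class should refuse to draw until a subclass overrides it.
        with self.assertRaises(NotImplementedError):
            Visualizer().draw()


# Run the test case programmatically so the sketch works as a plain script.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(BaseTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

In the real module, the stand-in class would be replaced by the import, which is what catches import errors by default.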

My goal for testing is that we can be confident when we make changes the tests will catch all the hiccups. This has two parts:

  1. Every part of our code has to be imported from our tests; this will catch import errors (as in me changing the module names when you were importing ddlheatmap) and will also give us coverage statistics so we know what parts of our code base we're calling.
  2. Executing code that calls as much other code as possible, so that exceptions will be raised if bad stuff is happening.

By having the base test case in place and ready to rock (even if it has no tests, or just a few), we're getting the import stuff by default, and it also makes it easier for us to add a few tests here and there as we're coding, which will lead to number 2!
