tophat / codewatch

[deprecated] Monitor and manage deeply customizable metrics about your python code using ASTs

Home Page: https://codewatch.io

License: Apache License 2.0

Python 92.14% JavaScript 5.40% CSS 1.67% HTML 0.79%
python opensource code-metrics abstract-syntax-tree

codewatch's Introduction

This project is currently deprecated and may be archived. If you're looking for something similar, you could try bellybutton or writing a custom checker in pylint instead.

Overview

Monitor and manage deeply customizable metrics about your python code using ASTs.

Codewatch lets you write simple python code to track statistics about the state of your codebase and write lint-like assertions on those statistics. Use this to incrementally improve and evolve the quality of your code base, increase the visibility of problematic code, to encourage use of new patterns while discouraging old ones, to enforce coding style guides, or to prevent certain kinds of regression errors.

What codewatch does:

  1. Traverses your project directory
  2. Parses your code into AST nodes and calls your visitor functions
  3. Your visitor functions run and populate a stats dictionary
  4. After all visitor functions are called, your assertion functions are called
  5. Your assertion functions can assert on data in the stats dictionary, save metrics to a dashboard, or anything you can think of
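The five steps above can be sketched with the standard library alone. This is a minimal illustration, not codewatch's implementation: it assumes Python 3, uses ast instead of astroid, and the names run_pipeline, count_import and imports_not_too_high are hypothetical.

```python
import ast
import os
import tempfile
from collections import Counter


def run_pipeline(root, visitors, assertions):
    """Walk root, feed AST nodes to visitors, then run assertions on the stats."""
    stats = Counter()
    # Steps 1 and 2: traverse the project directory and parse each Python file.
    for dir_path, _dir_names, file_names in os.walk(root):
        for file_name in sorted(file_names):
            if not file_name.endswith(".py"):
                continue
            full_path = os.path.join(dir_path, file_name)
            rel_path = os.path.relpath(full_path, root)
            with open(full_path) as source_file:
                tree = ast.parse(source_file.read(), filename=full_path)
            # Step 3: call every visitor registered for the node's type.
            for node in ast.walk(tree):
                for node_type, visitor in visitors:
                    if isinstance(node, node_type):
                        visitor(node, stats, rel_path)
    # Steps 4 and 5: run the assertions once all visitors have finished.
    failures = []
    for check in assertions:
        try:
            check(stats)
        except AssertionError as err:
            failures.append(str(err))
    return stats, failures


def count_import(node, stats, _rel_file_path):
    stats["total_imports_num"] += 1


def imports_not_too_high(stats):
    assert stats["total_imports_num"] <= 1, "too many imports"


# Demo on a throwaway project: one file containing two import statements.
with tempfile.TemporaryDirectory() as project_dir:
    with open(os.path.join(project_dir, "example.py"), "w") as handle:
        handle.write("import os\nfrom sys import path\n")
    demo_stats, demo_failures = run_pipeline(
        project_dir,
        visitors=[(ast.Import, count_import), (ast.ImportFrom, count_import)],
        assertions=[imports_not_too_high],
    )
```

In this demo the stats end up with two imports counted, so the single assertion fails and its message is collected in demo_failures.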

Installation

Python: 2.7, 3.6, 3.7

Execute the following in your terminal:

pip install codewatch

Usage

codewatch codewatch_config_module

codewatch_config_module is the Python module path (not a filename, so omit the .py extension) of the module that contains your visitors, assertions and, if required, filters.

Visitors

Register visitors with the @visit decorator. The node passed in is an astroid node, whose API is similar to that of the standard library's ast nodes.

from codewatch import visit


def _count_import(stats):
    stats.increment('total_imports_num')

@visit('import')
def count_import(node, stats, _rel_file_path):
    _count_import(stats)

@visit('importFrom')
def count_import_from(node, stats, _rel_file_path):
    _count_import(stats)

This will build a stats dictionary that contains something like the following:

{
    "total_imports_num": 763
}

Assertions

In the same codewatch_config_module, you can add assertions against this stats dictionary using the @assertion decorator:

from codewatch import assertion


@assertion()
def number_of_imports_not_too_high(stats):
    threshold = 700
    actual = stats.get('total_imports_num')
    err = 'There were {} total imports detected which exceeds threshold of {}'.format(actual, threshold)
    assert actual <= threshold, err

In this case the assertion would fail, since total_imports_num is 763 and the threshold is 700. The following message would be printed:

There were 763 total imports detected which exceeds threshold of 700

Filters

You can add the following optional filters:

  1. directory_filter (defaults to skip test and migration directories)
# visit all directories
def directory_filter(_dir_name):
    return True
  2. file_filter (defaults to only include Python files, and skips test files)
# visit all files
def file_filter(_file_name):
    return True

Tune these filters to suit your needs.

Contributing

See the Contributing docs

Contributors

Thanks goes to these wonderful people (emoji key):

  • Josh Doncaster Marsiglio 💻
  • Rohit Jain 💻
  • Chris Abiad 💻
  • Francois Campbell 💻
  • Monica Moore 🎨
  • Jay Crumb 📖
  • Jake Bolam 🚇
  • Shouvik D'Costa 📖
  • Siavash Bidgoly 🚇
  • Noah Negin-Ulster 💻
  • Vardan Nadkarni 💻
  • greenkeeper[bot] 🚇
  • Kazushige Tominaga 💻

We welcome contributions from the community, Top Hatters and non-Top Hatters alike. Check out our contributing guidelines for more details.

Credits

Special thanks to Carol Skelly for donating the 'tophat' GitHub organization.

codewatch's People

Contributors

allcontributors[bot], cabiad, chrono, dependabot[bot], francoiscampbell, greenkeeper[bot], jakebolam, jcrumb, lime-green, noahnu, rohitjain, sdcosta, syavash, tooooooooomy, vardan10

codewatch's Issues

An in-range update of docusaurus is breaking the build 🚨

The devDependency docusaurus was updated from 1.6.2 to 1.7.0.

🚨 View failing branch.

This version is covered by your current version range and after updating it in your project the build failed.

docusaurus is a devDependency of this project. It might not break your production code or affect downstream projects, but probably breaks your build or test tools, which may prevent deploying or publishing.

Status Details
  • ci/circleci: website: Your tests failed on CircleCI (Details).
  • ci/circleci: python-27: Your tests passed on CircleCI! (Details).
  • ci/circleci: python-36: Your tests passed on CircleCI! (Details).
  • ci/circleci: python-37: Your tests passed on CircleCI! (Details).

FAQ and help

There is a collection of frequently asked questions. If those don’t help, you can always ask the humans behind Greenkeeper.


Your Greenkeeper Bot 🌴

bulk assertions generator

In a large codebase, we saw that we were writing lots of similar assertions that compared stats in a certain namespace to a dict representing the expected results. Seemed like it might be unnecessary boilerplate. Maybe there are easy ways we can help kill it.
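One possible shape for such a generator is a factory that turns a namespace and a dict of expected values into a single assertion function. This is only a sketch: make_bulk_assertion is a hypothetical name, and stats is modelled here as a plain nested dict, which may not match codewatch's actual stats object.

```python
def make_bulk_assertion(namespace, expected):
    """Return one assertion function comparing every expected stat in a namespace."""
    def bulk_assertion(stats):
        actual = stats.get(namespace, {})
        mismatches = [
            "{}.{}: expected {}, got {}".format(namespace, key, want, actual.get(key))
            for key, want in sorted(expected.items())
            if actual.get(key) != want
        ]
        assert not mismatches, "; ".join(mismatches)
    # Give each generated function a distinct, readable name for reporting.
    bulk_assertion.__name__ = "assert_{}_matches_expected".format(namespace)
    return bulk_assertion


check_imports = make_bulk_assertion("imports", {"total": 3, "relative": 0})

# Assumed stats layout: one nested dict per namespace.
demo_stats = {"imports": {"total": 3, "relative": 2}}
try:
    check_imports(demo_stats)
    message = ""
except AssertionError as err:
    message = str(err)
```

In a real config module the generated function would still need to be registered, for example by passing it through the @assertion decorator.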

Project deprecated?

@lime-green / @rohitjain

I've been focusing on things other than this project for a while now and haven't seen many changes coming through from other contributors, including for the auto-detected security issues.

I'm curious about your thoughts on whether it's time to mark this project as deprecated.

Support explicit ordering of visitor functions rather than purely relying on declaration order

Is your feature request related to a problem? Please describe.

Ensuring a strict ordering on the execution of "visitors" is necessary if we want to guarantee deterministic assertions. If visitor_a increments counter_a for some node of type X, and visitor_b increments counter_b for a node of type X only if counter_a is greater than 0, then changing the order in which visitor_a and visitor_b are called will change the final value of the computed statistics, meaning assertion results may differ from a run that used a different ordering.

While typing out this issue I was trying to come up with a good use case for having one visitor function depend on another. I could not. Perhaps someone else can think of an example?

Even if we were to consider this an anti-pattern, we should attempt to minimize possibility of error / flakiness.

Changing to a reducer-style approach for visitors may encourage inter-visitor dependencies as per #10.

Currently the order of execution of visitors is defined by declaration order, purely because we use a central registry with a global array that is appended to whenever the @visit decorator is executed. The use of a global registry seems like an anti-pattern; however, in order to remove this registry, we'd need some way to guarantee visitor order (dir(module) sorts alphabetically, while module.__dict__.keys() is only guaranteed to be ordered in py3.6+).

Describe the solution you'd like

A pattern similar to https://pytest-dependency.readthedocs.io/en/latest/about.html#what-is-the-purpose, where you essentially mark the names of tests that must be run first. We can thus use a topological sort of the dependencies.

e.g.

@visit(nodes.FunctionDef)
def some_func(node, .., ..):
  pass

@visit(nodes.FunctionDef, predicate=None, depends_on=['some_func', 'some_other_func'])
def count_funcs(node, .., ..):
  pass

We'd examine "depends_on" to generate a graph and then topological sort. We could sort alphabetically for visitors with no dependencies (or that are tied in sorting order).
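A minimal sketch of that ordering step, assuming the depends_on values have already been collected into a dict keyed by visitor name. The function name order_visitors is hypothetical; it uses Kahn's algorithm with a heap so that ties are broken alphabetically, as suggested above.

```python
import heapq


def order_visitors(dependencies):
    """Order visitor names so each runs after everything it depends on.

    dependencies maps a visitor name to the names it depends on.
    Ties are broken alphabetically so the result is deterministic.
    """
    unknown = {dep for deps in dependencies.values() for dep in deps} - set(dependencies)
    if unknown:
        raise ValueError("unknown visitors: {}".format(sorted(unknown)))
    pending = {name: set(deps) for name, deps in dependencies.items()}
    dependents = {name: [] for name in dependencies}
    for name, deps in pending.items():
        for dep in deps:
            dependents[dep].append(name)
    # Visitors with no unmet dependencies, kept in a heap for alphabetical order.
    ready = [name for name, deps in pending.items() if not deps]
    heapq.heapify(ready)
    ordered = []
    while ready:
        name = heapq.heappop(ready)
        ordered.append(name)
        for child in dependents[name]:
            pending[child].discard(name)
            if not pending[child]:
                heapq.heappush(ready, child)
    if len(ordered) != len(dependencies):
        raise ValueError("dependency cycle detected")
    return ordered


visitor_order = order_visitors({
    "count_funcs": ["some_func", "some_other_func"],
    "some_other_func": [],
    "some_func": [],
    "another_visitor": [],
})
```

Here count_funcs is scheduled last because both of its dependencies must run first; the three independent visitors are emitted in alphabetical order.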

Describe alternatives you've considered

  • Use inspect module to identify line numbers of items in dir(). Use this to order. Pro: consistent in all versions of python. Con: probably slow (?) and will grow in complexity when your config is spread across multiple files.
  • Rely on module.__dict__ which is sorted by declaration order in py3.6+ and should remain in whatever arbitrary order it ends up in pre-3.6 (i.e. re-running program should keep same order even if not declaration order). Con: behaviour isn't easy to understand pre-3.6.
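The first alternative could look roughly like the sketch below. As a simplification it reads the line number from each function's code object (func.__code__.co_firstlineno) rather than calling into the inspect module, and it only makes sense for functions from a single file, which is the limitation noted above. The visitor names are placeholders.

```python
def zeta_visitor(node, stats, rel_file_path):
    pass


def alpha_visitor(node, stats, rel_file_path):
    pass


def in_declaration_order(functions):
    """Sort functions from a single module by the line where each was defined."""
    return sorted(functions, key=lambda func: func.__code__.co_firstlineno)


# Start from alphabetical order, as dir(module) would report the names.
alphabetical = [alpha_visitor, zeta_visitor]
declared = [func.__name__ for func in in_declaration_order(alphabetical)]
```

Because zeta_visitor is defined first in the file, it comes first in declared even though it sorts last alphabetically.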

Additional context

https://tophat-opensource.slack.com/archives/CE14KJGET/p1543705988030800

Public example app/repo using the codewatch

Is your feature request related to a problem? Please describe.
It would be great to see something like thm config that is public, so it's easy for people to understand how to use this project.

Describe the solution you'd like
A publicly available example/repo/project on how to use codewatch.

Describe alternatives you've considered
n/a

Additional context
n/a

Document package release process

Is your feature request related to a problem? Please describe.

Missing documentation on core repo processes, mainly "release" process. What constitutes a release? Are there any requirements? Semantic versioning?

Describe the solution you'd like

Documentation added (and maybe a "process" / issue label specifically for questions around repo policies).

Keeping a `ast.NodeVisitor` compatible API may have performance limitations

See #1 (comment)

Another problem with the NodeVisitor API is that it couples registering a visitor subclass (batch of related node visitors) to walking (and visiting) the AST.

It's not clear to me whether re-walking each AST for each subclass would scale reasonably if we had dozens or hundreds of registered subclasses. In other words, would we linearly increase total execution time for a given set of ASTs, or would some level of caching dominate, making the subsequent AST walks negligible?

If we see performance problems, a clear optimization path to consider is to walk once and call each visitor (method) registered for that node type
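The "walk once" optimization can be sketched as a registry keyed by node class plus a single traversal. This uses the standard library's ast module rather than astroid, and register, registry and visit_once are hypothetical names, not codewatch's API.

```python
import ast
from collections import defaultdict

# Maps a node class to the list of visitor functions registered for it.
registry = defaultdict(list)


def register(node_type):
    """Hypothetical decorator: record a visitor under the node class it handles."""
    def decorator(func):
        registry[node_type].append(func)
        return func
    return decorator


@register(ast.Import)
def count_imports(node, stats):
    stats["imports"] = stats.get("imports", 0) + 1


@register(ast.FunctionDef)
def count_functions(node, stats):
    stats["functions"] = stats.get("functions", 0) + 1


@register(ast.FunctionDef)
def count_private_functions(node, stats):
    if node.name.startswith("_"):
        stats["private_functions"] = stats.get("private_functions", 0) + 1


def visit_once(tree, stats):
    """Walk the tree a single time, dispatching each node by its exact type."""
    for node in ast.walk(tree):
        for visitor in registry.get(type(node), ()):
            visitor(node, stats)
    return stats


source = "import os\n\ndef public():\n    pass\n\ndef _private():\n    pass\n"
single_walk_stats = visit_once(ast.parse(source), {})
```

With this layout, adding more visitors adds a dictionary lookup per node rather than another full traversal of every tree.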

Use pip-tools to fully pin all dependencies

Mostly just an idea for now.

Thinking about / reading this: https://hynek.me/articles/python-app-deps-2018/

We don’t really have deployment needs or anything, but having actually-repeatable-CI would still be pretty nice and I think would require all deps (even implicit ones) to be pinned.

pip-tools is a nice way to maintain the difference between the explicit and implicit deps while still having them fully pinned.

Action required: Greenkeeper could not be activated 🚨

🚨 You need to enable Continuous Integration on Greenkeeper branches of this repository. 🚨

To enable Greenkeeper, you need to make sure that a commit status is reported on all branches. This is required by Greenkeeper because it uses your CI build statuses to figure out when to notify you about breaking changes.

Since we didn’t receive a CI status on the greenkeeper/initial branch, it’s possible that you don’t have CI set up yet. We recommend using Travis CI, but Greenkeeper will work with every other CI service as well.

If you have already set up a CI for this repository, you might need to check how it’s configured. Make sure it is set to run on all new branches. If you don’t want it to run on absolutely every branch, you can whitelist branches starting with greenkeeper/.

Once you have installed and configured CI on this repository correctly, you’ll need to re-trigger Greenkeeper’s initial pull request. To do this, please click the 'fix repo' button on account.greenkeeper.io.

Improve `visit()` decorator's argument syntax

See #1 (comment)

It's not clear to me what the normalization here (adding an initial cap to the passed in string param) adds to our API.

Maybe node_name should have to be a valid node class name?

I might also consider making the param to visit the actual class, but the boilerplate of importing everything you need seems painful. Would give us the comfy feeling of an enumeration though.
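To make the two options concrete, here is a sketch of a resolver that accepts either a node class or a string, and rejects strings that do not name a real node class. It uses the standard library's ast module as a stand-in for astroid, and resolve_node_class is a hypothetical helper.

```python
import ast


def resolve_node_class(node_name):
    """Map a string or class to a node class, rejecting anything unknown."""
    if isinstance(node_name, type) and issubclass(node_name, ast.AST):
        return node_name
    # Mirror the normalization described above: upper-case only the first letter.
    candidate = node_name[:1].upper() + node_name[1:]
    node_class = getattr(ast, candidate, None)
    if not (isinstance(node_class, type) and issubclass(node_class, ast.AST)):
        raise ValueError("unknown node type: {!r}".format(node_name))
    return node_class


from_string = resolve_node_class("importFrom")
from_class = resolve_node_class(ast.FunctionDef)
try:
    resolve_node_class("importfrom")
    rejected = False
except ValueError:
    rejected = True
```

Failing fast on a misspelled name such as "importfrom" avoids a visitor that silently never runs.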

Don't traverse hidden directories/files by default

Is your feature request related to a problem? Please describe.

The default directory and file filters do not capture some basic cases.

Describe the solution you'd like

Add rules to default directory & file filters to ignore any file or directory that begins with a ".", excluding current directory (this captures ".git")
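Until such defaults exist, the same rule can be approximated in a config module with custom filters. This is a sketch: it assumes the filters may receive either a bare name or a path, and it does not reproduce the default exclusions for test and migration directories.

```python
import os


def directory_filter(dir_name):
    """Skip hidden directories such as .git, but keep the current directory."""
    name = os.path.basename(os.path.normpath(dir_name))
    if name in (".", ".."):
        return True
    return not name.startswith(".")


def file_filter(file_name):
    """Visit only Python files whose names do not start with a dot."""
    name = os.path.basename(file_name)
    return name.endswith(".py") and not name.startswith(".")
```

For example, directory_filter(".git") returns False while directory_filter(".") and directory_filter("src") return True.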

Describe alternatives you've considered

By default (or via opt-out option outside of default dir/file filter system), parse and then ignore any files/dirs that match a .gitignore rule.

Additional context

Alternative is now possible to implement due to #94

Consider and explore redux-style reducer approach for stats

I tend to agree that following the stats flow through the code is a little unclear. Maybe this would help.

See #1 (comment)

just a random comment: what about exploring a redux-style reducer approach for stats, where a visitor would receive the node and the top-level stats tree as arguments and return a new stats object?
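A reducer-style visitor might look like the following sketch, where each visitor returns a new stats object instead of mutating the one it receives. It uses the standard library's ast module and plain dicts; count_imports_reducer and reduce_tree are hypothetical names.

```python
import ast


def count_imports_reducer(node, stats):
    """Return a new stats dict instead of mutating the one passed in."""
    if isinstance(node, (ast.Import, ast.ImportFrom)):
        updated = dict(stats)
        updated["total_imports_num"] = stats.get("total_imports_num", 0) + 1
        return updated
    return stats


def reduce_tree(tree, reducers, initial_stats):
    """Thread the stats through every reducer for every node in the tree."""
    stats = initial_stats
    for node in ast.walk(tree):
        for reducer in reducers:
            stats = reducer(node, stats)
    return stats


initial = {}
final = reduce_tree(
    ast.parse("import os\nfrom sys import path\n"),
    [count_imports_reducer],
    initial,
)
```

The initial dict is left untouched, which makes the flow of stats through the visitors explicit at the cost of copying on every update.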

An in-range update of all-contributors-cli is breaking the build 🚨


The devDependency all-contributors-cli was updated from 6.14.2 to 6.15.0.

🚨 View failing branch.

This version is covered by your current version range and after updating it in your project the build failed.

all-contributors-cli is a devDependency of this project. It might not break your production code or affect downstream projects, but probably breaks your build or test tools, which may prevent deploying or publishing.

Status Details
  • ci/circleci: python-27: CircleCI is running your tests (Details).
  • ci/circleci: python-36: CircleCI is running your tests (Details).
  • ci/circleci: website: Your tests passed on CircleCI! (Details).
  • ci/circleci: python-37: Your tests failed on CircleCI (Details).

Release Notes for v6.15.0

6.15.0 (2020-05-24)

Features

  • contribution-types: add missing contribution types (#261) (bcc0d99)
Commits

The new version differs by 9 commits.

  • bcc0d99 feat(contribution-types): add missing contribution types (#261)
  • e987eb0 chore(package): update cz-conventional-changelog to version 3.0.0 (#198)
  • 4573e29 docs: add AnandChowdhary as a contributor (#219)
  • 33e1a43 chore(package): update semantic-release to version 16.0.0 (#242)
  • 77923a3 docs: add kharaone as a contributor (#212)
  • b5d85de chore(package): update git-cz to version 4.1.0 (#243)
  • e2ed91d docs: add ilai-deutel as a contributor (#257)
  • d26cd47 docs: add MarceloAlves as a contributor (#222)
  • 9a6cf19 chore(package): update kcd-scripts to version 5.0.0 (#246)

See the full diff

FAQ and help

There is a collection of frequently asked questions. If those don’t help, you can always ask the humans behind Greenkeeper.


Your Greenkeeper Bot 🌴

Use filename instead of module path for the config module

Is your feature request related to a problem? Please describe.
When calling codewatch my_codewatch_config_module.py, we are greeted with an ImportError: No module named py. If you know that codewatch interprets its argument as a Python module path, you know to remove the .py; if you don't, it's an annoying interface.

Describe the solution you'd like
In the event that #85 is not accepted, or if it is accepted with a manual override for the config file, make the CLI use the filename of the config module rather than a Python module path.
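Loading a config module from a filename could be done along these lines. This is a Python 3 sketch using importlib.util; load_config_module is a hypothetical helper and not part of codewatch.

```python
import importlib.util
import os
import tempfile


def load_config_module(path):
    """Load a config module from a file path such as my_config.py."""
    module_name = os.path.splitext(os.path.basename(path))[0]
    spec = importlib.util.spec_from_file_location(module_name, path)
    if spec is None or spec.loader is None:
        raise ImportError("cannot load config from {!r}".format(path))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# Demo: write a tiny config file to a temporary directory and load it by path.
with tempfile.TemporaryDirectory() as config_dir:
    config_path = os.path.join(config_dir, "my_codewatch_config_module.py")
    with open(config_path, "w") as handle:
        handle.write("THRESHOLD = 700\n")
    config = load_config_module(config_path)
```

The module name is derived from the filename without its extension, so the .py suffix no longer causes an import error.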

Use well-known file for the config module

Is your feature request related to a problem? Please describe.
The CLI command feels verbose compared to other tools.

Describe the solution you'd like
Like make, invoke, and eslint, use a well-known filename, perhaps codewatch.py, as the default config module filename. This would make the CLI simpler, with the potential for just running codewatch and nothing more. This is related to #12 in that the CLI could become `codewatch .`.

We could preserve the ability to specify the config module under a named CLI parameter, such as codewatch --config <path to config file>.
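The proposed interface can be sketched with argparse. The default filename codewatch.py and the --config flag come from the proposal above; the positional path argument and the parse_args helper are assumptions made for illustration.

```python
import argparse

DEFAULT_CONFIG = "codewatch.py"  # well-known filename suggested in the proposal


def parse_args(argv):
    """Parse a hypothetical codewatch command line."""
    parser = argparse.ArgumentParser(prog="codewatch")
    parser.add_argument("path", nargs="?", default=".",
                        help="project directory to analyse")
    parser.add_argument("--config", default=DEFAULT_CONFIG,
                        help="path to the config file")
    return parser.parse_args(argv)


bare = parse_args([])
explicit = parse_args([".", "--config", "tools/metrics.py"])
```

Running with no arguments falls back to the current directory and the well-known filename, while --config keeps the explicit override.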

Describe alternatives you've considered
No major alternatives, this feels like either it's changed or not.
