near-facsimile's People

Contributors

msuchane

near-facsimile's Issues

Show progress

A complete run can take a long time. It would be useful to display the current file number out of the total number of comparisons each time the tool prints a pair of similar files.
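One way to picture this request is a counter prefixed to every reported pair. The following is only a sketch under assumptions: `progress_label`, `report_similar`, and the `similarity` callback are hypothetical names for illustration and are not taken from near-facsimile itself.

```python
from itertools import combinations


def progress_label(current: int, total: int) -> str:
    """Format a 1-based counter such as '[ 3/10]', padded to the width of total."""
    width = len(str(total))
    return f"[{current:>{width}}/{total}]"


def report_similar(files, similarity, threshold):
    """Yield one line per pair at or above the threshold, prefixed with progress.

    `similarity` is a hypothetical callback returning a value between 0.0 and 1.0.
    """
    pairs = list(combinations(files, 2))
    total = len(pairs)
    for index, (first, second) in enumerate(pairs, start=1):
        score = similarity(first, second)
        if score >= threshold:
            yield f"{progress_label(index, total)} {score:.1%} {first} <-> {second}"
```

Because the counter is attached only to reported pairs, the output stays quiet for dissimilar files while still showing how far the run has advanced.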

Find similar titles

Given a module title, find modules in the repository that already have a similar title.

This might be useful when deciding whether to write a new module or reuse an existing one.
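A rough sketch of such a lookup follows. It assumes the titles have already been collected from the repository, and it uses Python's `difflib.SequenceMatcher` as a stand-in similarity measure. The metric that the tool would actually use is not stated here, and `similar_titles` is a hypothetical name.

```python
from difflib import SequenceMatcher


def similar_titles(query, titles, threshold=0.8):
    """Return (score, title) pairs at or above the threshold, best match first."""

    def normalize(title):
        # Ignore case and runs of whitespace so that trivial differences do not count.
        return " ".join(title.lower().split())

    target = normalize(query)
    scored = [
        (SequenceMatcher(None, target, normalize(title)).ratio(), title)
        for title in titles
    ]
    matches = [(score, title) for score, title in scored if score >= threshold]
    return sorted(matches, key=lambda match: match[0], reverse=True)
```

A writer could call `similar_titles("Installing the package", existing_titles)` before starting a new module and review whatever comes back.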

Graph output

Present the results as a graph for better readability and interpretation. The natural candidate for this graph would be a histogram.

Either draw the graph internally, or provide instructions for how to plot it with external tools.

One issue here is that accurate numbers are only available above the threshold that the user configures. Possible solutions:

  • Only plot the data above the threshold. This is probably the cleanest solution, because it scales with the threshold and the user can set a 0% threshold to get the full graph.
  • Fill the rest of the graph with some sort of mock data for the remaining files below the threshold.
  • Use trigram data below the threshold, even though it is inaccurate and usually overlaps with the accurate data.
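The first option, plotting only the data above the threshold, could look roughly like the sketch below. It assumes similarity scores are percentages between 0 and 100 and that `bucket_width` divides 100 evenly. The function names and the text rendering are hypothetical; a real implementation might instead emit data for an external plotting tool.

```python
from collections import Counter


def histogram(scores, threshold=0.0, bucket_width=10):
    """Count scores at or above the threshold into fixed-width percentage buckets."""
    counts = Counter()
    for score in scores:
        if score < threshold:
            continue  # below the threshold, accurate numbers are not available
        # Clamp 100% into the last bucket so it does not form a bucket of its own.
        start = min(int(score // bucket_width) * bucket_width, 100 - bucket_width)
        counts[start] += 1
    return dict(sorted(counts.items()))


def render(counts, bucket_width=10):
    """Draw one text bar per bucket, for example ' 80- 90% | ## 2'."""
    return "\n".join(
        f"{start:>3}-{start + bucket_width:>3}% | {'#' * count} {count}"
        for start, count in counts.items()
    )
```

With a threshold of 0, every comparison lands in a bucket and the full graph appears, which matches the opt-in behavior described in the first option.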

Ignore certain lines in files

Add an option that lets the user specify regular expressions. The tool then skips every line in the compared files that matches any of these regular expressions.

This is useful for ignoring code comments or other such boilerplate that might otherwise skew the statistics.
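As an illustration, the filtering step might be sketched as follows. The function name `strip_ignored_lines` is hypothetical, and the sketch assumes the filter runs on each file's text before any comparison takes place.

```python
import re


def strip_ignored_lines(text, patterns):
    """Drop every line that matches at least one of the given regular expressions."""
    compiled = [re.compile(pattern) for pattern in patterns]
    kept = [
        line
        for line in text.splitlines()
        if not any(regex.search(line) for regex in compiled)
    ]
    return "\n".join(kept)
```

For example, the patterns `^//` and `^:\w+:` would remove comment lines that start with `//` and attribute lines such as `:attr: value`, leaving only the remaining text for comparison.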
