matiskay / html-cluster

A command line tool to cluster html pages based on structural and style similarity.

License: BSD 3-Clause "New" or "Revised" License

Python 100.00%
cluster html python36

html-cluster's People

Contributors

matiskay

html-cluster's Issues

Make this repository work with more recent versions of python

I am just a Master's degree student, but I was searching for a library to cluster HTML pages, and this was the first
result I found.

I managed to run your commands after downloading the repository and installing it with 'pip install .'.
However, I hit a strange error: 'no module named "html_cluster"'. Apparently the html_cluster folder
is not recognized as a module, so I added sys.path.append(html_cluster_path) to the 'html_cluster' file
in the bin folder of my conda environment and to the command files (download_html.py).
After that, everything worked again.
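
The workaround described above could look roughly like the sketch below. It is only an illustration, not the repository's own code: the helper name `ensure_importable` and the checkout path in the usage example are hypothetical. One assumption worth stating is that the directory added to `sys.path` should be the one that *contains* the `html_cluster` folder, since Python looks up top-level packages inside each `sys.path` entry.

```python
import importlib
import importlib.util
import sys
from pathlib import Path


def ensure_importable(package_name, package_parent):
    """Return True if `package_name` can be imported after this call.

    `package_parent` is the directory that contains the package folder
    (for example, the root of the cloned repository). It is appended to
    sys.path only when the package cannot already be found.
    """
    # Nothing to do if the package is already visible to the import system.
    if importlib.util.find_spec(package_name) is not None:
        return True

    parent = str(Path(package_parent).expanduser().resolve())
    if parent not in sys.path:
        sys.path.append(parent)

    # Make sure the import system notices the newly added directory.
    importlib.invalidate_caches()
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # Hypothetical location of the cloned repository; adjust as needed.
    found = ensure_importable("html_cluster", "~/src/html-cluster")
    print("html_cluster importable:", found)
```

Patching `sys.path` this way is a stopgap; if the package were installed correctly, the helper would return immediately without touching the path.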

It really is a minor thing, so I think it should be possible for you to give it a try. Maybe you can fix it quickly and re-release this package for more recent versions of Python. I am using Python 3.7 on a remote server running Linux,
but from one of the labels on the repository I can see that you used Python 3.6 to make this
