
site-graph's Introduction

Welcome to my GitHub 👋

I write code for my research and sometimes for other things.

My research repositories

My project repositories

site-graph's People

Contributors

ada-armstrong, machawk1, matkoniecz, nathan-artist, tomlinsonk


site-graph's Issues

Consider adding a license

See https://choosealicense.com/ for a tl;dr and for confirmation that this is a real problem.

Currently no license is mentioned anywhere, which makes this code fully copyrighted. This is the default for any creative work, which includes software.

Please, please add a license. The fact that none is listed makes using this software a legal quagmire.

For a start, people are not allowed to make pull requests with improvements, as that would require distributing modified code that is fully copyrighted.

For example, it is currently not legal to use, improve or distribute this code. Hopefully this is not the intended effect, given that the code was open sourced.

If not releasing it under an open license is intentional for some reason, mentioning that in the readme would also be a good idea. Though note that, given that the code is open sourced, it will not stop anyone who does not care about copyright.

Popular licenses include:

  • The MIT license, which allows people to use the code for any purpose
  • GNU GPLv3, which intends to allow everything except distributing closed source versions that build on the open source project
  • GNU AGPLv3, which in addition requires providing the source code when offering the software as a service

Feel free to ask me questions :) I know and understand that people often prefer to ignore copyright law; unfortunately, that will not make it disappear.

I am posting this because this code fits almost exactly what I need, and it would be sad to have to write my own from scratch.

Errors with accents or emojis in url?

URLs like https://example.com/tags/🌳arbré are valid in a browser, which interprets them as https://example.com/tags/%F0%9F%8C%B3arbr%C3%A9/

It seems that the script does not do this interpretation work and throws an error instead.
Are you aware of this? Is it fixable?
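
The conversion described above is UTF-8 percent-encoding, which Python's standard library can do. A minimal sketch, assuming the crawler could normalize each URL before requesting it; `percent_encode_url` is a hypothetical helper and is not part of site_graph.py. It leaves the host name alone (non-ASCII domains would need separate IDNA handling) and does not add the trailing slash, which in the example above presumably comes from the server.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def percent_encode_url(url: str) -> str:
    """Percent-encode non-ASCII characters in the path, query and fragment.

    Hypothetical helper for illustration only. "%" is kept as a safe
    character so that already-encoded URLs are not encoded twice.
    """
    parts = urlsplit(url)
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%+")
    fragment = quote(parts.fragment, safe="%")
    # Scheme and host are passed through unchanged.
    return urlunsplit((parts.scheme, parts.netloc, path, query, fragment))


if __name__ == "__main__":
    print(percent_encode_url("https://example.com/tags/🌳arbré"))
```

Because "%" is treated as safe, calling the helper on a URL that is already encoded returns it unchanged.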

Supporting local files

Would it be OK to make a PR adding support for reading local files?

It would be useful for me, as I am trying to make or find a decent tool for detecting broken links.

A possible additional feature: detect files that are within the folder but are not linked (orphaned HTML files that are not linked from any part of the website, and/or unused media files).

(I would be happy to make a PR if that would be accepted and #3 is resolved)
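
The orphan detection proposed above could look roughly like the following. This is only a sketch of the idea, not code from site-graph: `find_orphans` and the `index.html` entry point are assumptions, it uses the standard-library HTML parser rather than BeautifulSoup, and it does not resolve links to directories (for example `blog/`) to an index file.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import unquote, urlsplit


class _LinkCollector(HTMLParser):
    """Collects the values of href and src attributes."""

    def __init__(self):
        super().__init__()
        self.targets = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.targets.append(value)


def find_orphans(root, entry="index.html"):
    """Return files under `root` that are not reachable from `entry`.

    Hypothetical helper: follows only local links, starting at the
    entry page, and reports every file that was never reached.
    """
    root = Path(root).resolve()
    all_files = {p for p in root.rglob("*") if p.is_file()}
    reachable = set()
    queue = [root / entry]
    while queue:
        page = queue.pop()
        if page in reachable or page not in all_files:
            continue
        reachable.add(page)
        if page.suffix.lower() not in (".html", ".htm"):
            continue  # media files are reachable but contain no links
        collector = _LinkCollector()
        collector.feed(page.read_text(encoding="utf-8", errors="replace"))
        for target in collector.targets:
            parts = urlsplit(target)
            if parts.scheme or parts.netloc or not parts.path:
                continue  # external link or fragment-only link
            path = unquote(parts.path)
            # Site-absolute links start at the root, others at the page.
            base = root if path.startswith("/") else page.parent
            queue.append((base / path.lstrip("/")).resolve())
    return sorted(p.relative_to(root).as_posix() for p in all_files - reachable)
```

Calling `find_orphans("path/to/site")` returns the relative paths of unlinked files, which covers both orphaned HTML pages and unused media.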

Consider editing repository settings to remove the "Packages" section

"Packages No packages published" is displayed right now. Fortunately, this pointless section can be removed.

Edit the repo page config to remove it (the cog next to the description).

I am not making a PR because this is defined in proprietary GitHub settings, not in the git repository, and I have no rights to modify the repo settings.

Maybe the releases section should also be removed?


Consider changing the color for errors

"red nodes are external pages, and yellow nodes are pages with errors"

Maybe it would make more sense to use black, red, eyesore red or magenta for errors?

Red for errors is quite typical; maybe a color like the one in the attached screenshot (screen04) would be even better?

(I would be happy to make a PR if that would be accepted and #3 is resolved)

ModuleNotFoundError: No module named 'bs4'

Hello,
after running the installation and configuration as described:

git clone https://github.com/tomlinsonk/site-graph.git
cd site-graph
pip3 install -r requirements.txt

everything seems to work fine.
Then, when trying to crawl the website from inside the site-graph folder

python3 site_graph.py https://marcoXbresciani.codeberg.page

I see this error:

Traceback (most recent call last):
  File "C:\Users\marco.bresciani\Chiavetta\Software\site-graph\site_graph.py", line 2, in <module>
    from bs4 import BeautifulSoup
ModuleNotFoundError: No module named 'bs4'

I've tried pip3 install bs4, but no luck.

  • Python 3.9.12
  • pip 23.2.1 from C:\Python39\lib\site-packages\pip (python 3.9)
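
One common cause of this symptom, though not confirmed for this report, is that pip3 and python3 belong to different Python installations, so the packages land where the interpreter running the script does not look. The following sketch prints which interpreter is running and whether it can see the module; `describe_environment` is a hypothetical name used only for this illustration.

```python
import importlib.util
import sys


def describe_environment(module_name="bs4"):
    """Report the running interpreter and whether it can import a module."""
    spec = importlib.util.find_spec(module_name)
    return {
        "interpreter": sys.executable,
        "version": sys.version.split()[0],
        "module_found": spec is not None,
        "module_location": getattr(spec, "origin", None),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If the interpreter path printed here is not the C:\Python39 installation that pip reports, running python3 -m pip install -r requirements.txt installs the requirements into the interpreter that actually runs the script.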

Joomla

Hello Dear Sir!

I am trying to use your software (it's pretty cool, by the way), but it seems to have a problem with Joomla, where the links contain article IDs and every subpage is basically baseurl/index.php...id=654. What can I change or do to make it visualize my entire site, including this type of link?
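
One possible explanation, which is an assumption and not something verified against site_graph.py, is that pages are identified without their query string, so every index.php link collapses into a single node. The sketch below shows a page key that keeps the query string and drops only the fragment; `page_key` and the example query parameters are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def page_key(url: str) -> str:
    """Build a key that distinguishes pages differing only in their query.

    Hypothetical helper: the fragment is dropped, the host is lowercased
    and query parameters are sorted, so equivalent URLs share one key.
    """
    parts = urlsplit(url)
    params = sorted(parse_qsl(parts.query, keep_blank_values=True))
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path, urlencode(params), "")
    )


if __name__ == "__main__":
    # Two articles served by the same script get two different keys.
    print(page_key("https://example.com/index.php?option=com_content&id=654#top"))
    print(page_key("https://example.com/index.php?option=com_content&id=655"))
```

With a key like this, each article ID would appear as its own node instead of being merged into index.php.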
