
code4days / lookyloo


This project forked from lookyloo/lookyloo


Lookyloo is a web interface that scrapes a website and then displays a tree of the domains calling each other.

License: BSD 3-Clause "New" or "Revised" License

Dockerfile 2.27% Python 31.61% CSS 3.16% JavaScript 51.06% HTML 11.90%

lookyloo's Introduction


Lookyloo is a web interface that scrapes a website and then displays a tree of the domains calling each other.

What is that name?!

1. People who just come to look.
2. People who go out of their way to look at people or something often causing crowds and more disruption.
3. People who enjoy staring at watching other peoples misfortune. Oftentimes car onlookers to car accidents.
Same as Looky Lou; often spelled as Looky-loo (hyphen) or lookylou
In L.A. usually the lookyloo's cause more accidents by not paying full attention to what is ahead of them.

Source: Urban Dictionary

Screenshot

Screenshot of Lookyloo

Implementation details

This code is heavily inspired by webplugin and adapted to use Flask as the backend.

Installation of har2tree

The core dependency is the ETE Toolkit, which you can install by following the guide on its official website.

Note: all the PyQt4 dependencies are optional.
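
To make the "tree of domains calling each other" idea concrete, here is a minimal sketch using only the Python standard library. It is not har2tree's API: the function names `build_domain_tree` and `render`, and the input format (pairs of referrer URL and requested URL), are assumptions made for illustration.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def build_domain_tree(requests, root_url):
    """Group (referrer_url, requested_url) pairs into a parent -> children map of hostnames.

    Hypothetical helper: the pair format is an assumption, not the HAR layout itself.
    """
    children = defaultdict(set)
    for referrer, url in requests:
        parent = urlsplit(referrer).hostname
        child = urlsplit(url).hostname
        # Skip malformed URLs and same-domain requests, which add no edge.
        if parent and child and parent != child:
            children[parent].add(child)
    return urlsplit(root_url).hostname, children


def render(node, children, depth=0, seen=None):
    """Return the tree as indented lines; stop descending when a domain repeats on a path."""
    seen = set() if seen is None else seen
    lines = ["  " * depth + node]
    if node in seen:
        return lines
    seen = seen | {node}
    for child in sorted(children.get(node, ())):
        lines.extend(render(child, children, depth + 1, seen))
    return lines


# Made-up sample data for illustration only.
sample = [
    ("https://example.com/", "https://cdn.example.net/app.js"),
    ("https://example.com/", "https://ads.example.org/tag.js"),
    ("https://ads.example.org/tag.js", "https://tracker.example.org/pixel.gif"),
]
root, children = build_domain_tree(sample, "https://example.com/")
print("\n".join(render(root, children)))
```

With the sample data, the root domain is printed first and each domain it loads appears indented beneath it, with `tracker.example.org` nested under `ads.example.org`.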

Installation of scrapysplashwrapper

You need a running Splash instance, preferably in Docker:

sudo apt install docker.io
sudo docker pull scrapinghub/splash
sudo docker run -p 8050:8050 -p 5023:5023 scrapinghub/splash --disable-ui --disable-lua
# On a server with a decent amount of RAM, you may want to run it this way:
# sudo docker run -p 8050:8050 -p 5023:5023 scrapinghub/splash --disable-ui -s 100 --disable-lua -m 50000
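
Once the container is running, Splash answers HTTP requests on port 8050. As a rough sketch, the helper below builds a request URL for Splash's `render.har` endpoint, which returns the HAR data for a page. The function name `splash_har_url` and the default `wait` value are assumptions for illustration, and scrapysplashwrapper may talk to Splash differently.

```python
from urllib.parse import urlencode


def splash_har_url(target_url, splash_base="http://localhost:8050", wait=2.0):
    """Build a Splash render.har request URL for target_url.

    Hypothetical helper: splash_base matches the port published by the
    docker run command above; wait is an arbitrary default in seconds.
    """
    query = urlencode({"url": target_url, "wait": wait})
    return f"{splash_base.rstrip('/')}/render.har?{query}"


print(splash_har_url("https://example.com/"))
```

Opening the printed URL in a browser, or fetching it with any HTTP client, is a quick way to confirm that the Splash instance is reachable. The `render.har` endpoint does not depend on Lua scripting, so it should remain available with `--disable-lua`.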

Installation of the whole thing

pip install -r requirements.txt
pip install -e .
wget https://d3js.org/d3.v5.min.js -O lookyloo/static/d3.v5.min.js
wget https://cdn.rawgit.com/eligrey/FileSaver.js/5733e40e5af936eb3f48554cf6a8a7075d71d18a/FileSaver.js -O lookyloo/static/FileSaver.js

Run the app locally

export FLASK_APP=lookyloo
flask run

With a reverse proxy (Nginx)

pip install uwsgi

Config files

You have to configure the following two files:

  • etc/nginx/sites-available/lookyloo
  • etc/systemd/system/lookyloo.service

Copy them to the appropriate directories, then run the following command:

sudo ln -s /etc/nginx/sites-available/lookyloo /etc/nginx/sites-enabled

If needed, remove the default site:

sudo rm /etc/nginx/sites-enabled/default

Make sure everything is working:

sudo systemctl start lookyloo
sudo systemctl enable lookyloo
sudo nginx -t
# If the configuration test passes:
sudo service nginx restart

You can then open http://<IP-or-domain>/
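
If you prefer a scripted check over opening the page in a browser, something like the following sketch could be used. The function name `is_up` and the timeout are assumptions; it only reports whether the URL answers with a non-error HTTP status.

```python
import urllib.error
import urllib.request


def is_up(url, timeout=5.0):
    """Return True if url answers with an HTTP status below 400, else False."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error status, or malformed URL.
        return False
```

For example, `is_up("http://<IP-or-domain>/")`, with the placeholder replaced by your server's address, should return `True` once both nginx and the lookyloo service are running.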

Next, you should configure TLS (for example with Let's Encrypt).

Run the app with Docker

Dockerfile

The repository includes a Dockerfile for building a containerized instance of the app.

Lookyloo stores the scraped data in /lookyloo/scraped. To persist the scraped data between runs, it is sufficient to define a volume for this directory.

Running a complete setup with Docker Compose

You can also start a complete setup, including the required Splash container, with Docker Compose. The service definition is included in docker-compose.yml; run:

docker-compose up

Once the build and startup are complete, Lookyloo should be available at http://localhost:5000/

To persist the data between runs, uncomment the "volumes" definition in the last two lines of docker-compose.yml and set a data storage directory on your Docker host there.

lookyloo's People

Contributors

rafiot, sw-mschaefer, steveclement
