rippy's Introduction

Rippy

Rip-it with Rippy

Introduction

Rippy is a downloader that scrapes websites using a real web browser to find, e.g., videos or downloadable files. The targets are websites that try to be scrape-resistant and where other downloaders had to give up.

The magic is that Rippy controls a real browser, so a lot of the usual anti-bot measures, e.g. scrambled JavaScript, are ineffective. To block Rippy you will have to block browsers. I also enjoy a blocking arms race; it keeps my day bright and fulfilled.

Installation

Currently the only officially provided distribution method is docker-compose, but all Rippy really requires is Chrome and Python.

wget https://github.com/JohnDoee/rippy-docker/raw/master/docker-compose.yml

You should edit docker-compose.yml. The following values should be changed:

  • /tmp/media should be changed to where you want Rippy to download data. It appears in the file twice.
  • BASIC_AUTH_PASSWORD should be changed to a unique password.
  • SECRET_KEY should be changed to something unique.
  • Optional: change RIPPY_CONCURRENCY to the number of scrape and download threads you want.

docker-compose up -d
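
The unique values for BASIC_AUTH_PASSWORD and SECRET_KEY can come from any random source. As a minimal sketch, the Python standard library can produce them; the function name and the chosen lengths are assumptions for illustration, not something Rippy requires.

```python
import secrets


def generate_compose_secrets(password_bytes: int = 24, key_bytes: int = 48) -> dict:
    """Return fresh random values for the two settings that must be unique.

    The byte counts are arbitrary defaults picked for this sketch.
    """
    return {
        "BASIC_AUTH_PASSWORD": secrets.token_urlsafe(password_bytes),
        "SECRET_KEY": secrets.token_urlsafe(key_bytes),
    }


if __name__ == "__main__":
    # Print NAME=value lines to copy into docker-compose.yml by hand.
    for name, value in generate_compose_secrets().items():
        print(f"{name}={value}")
```

The generated strings only contain letters, digits, "-" and "_". Quoting them in docker-compose.yml is a cautious choice so that YAML treats them as plain strings.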

Usage

Head over to http://ip:51359 (replace ip with the address of the machine running Rippy) and add a job. It should start downloading or prompt you to do something manually.
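
If the page does not load, it can help to check whether anything is listening on the port at all. The following is a rough sketch using only the Python standard library. The helper names are made up for this example, and it only tests that the TCP port accepts connections, not that Rippy itself is healthy.

```python
import socket

RIPPY_PORT = 51359  # port taken from the URL in the usage instructions


def web_interface_url(host: str, port: int = RIPPY_PORT) -> str:
    """Build the web interface URL for a hostname or IPv4 address."""
    return f"http://{host}:{port}"


def is_port_open(host: str, port: int = RIPPY_PORT, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    host = "127.0.0.1"  # assumption: replace with the machine running docker-compose
    if is_port_open(host):
        print("Web interface should be reachable at", web_interface_url(host))
    else:
        print("Nothing is listening on", web_interface_url(host))
```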

If the status text says “Waiting”, you need to open the browser and fill in a captcha or something similar. If you are using the docker-compose setup, there should be a button in the upper-right corner of the website to open the browser. It opens a new window with a VNC connection to the hosted Chromium browser.

New scrapers

Feel free to request a new scraper, but there are a few requirements if you want me to implement it:

  • The site is scrape-resistant, as in, nobody else is able to download from it. Check out tools like youtube-dl and JDownloader first.
  • The site does not use encryption and is not behind a paywall, i.e. I can’t do stuff like Netflix (something like that is also not the target at all).

Currently, a generic video-site scraper is on the slab, as this project is a merge between a Reddit post and a generic video-site scraper.

Accompanied repositories

Docker-compose file and docker chromium repository

Rippy webinterface

FAQ

Q: My tab crashed or elements on the website crashed, what should I do?
A: Close the tab. Rippy should notice it shortly and try again.

TODO

  • [ ] Add (semi-)generic view player extractor
  • [ ] Return (potentially proxied) URL to video instead of downloading

Supported sites

  • Avgle

Docker images

Main backend component (this repository)

Webapp and reverse proxy

Chrome accessible via VNC

Logo / icon

frog by habione 404 from the Noun Project

License

MIT

rippy's Issues

Downloading is slow

Nice project! I wonder if we could boost the download speed, maybe by using multiple threads or a tool like aria2c?

Failed - Failed FFMpeg with returncode 1

Add a retry button

Currently the UI doesn't even allow me to copy the full url. Would love to have a retry button :)
