vik407 / hipposcraper

This project forked from kokopi-dev/hipposcraper

A Linux terminal tool for parsing and scraping Holberton project pages to automate repetitive tasks.

Python 90.37% Shell 9.63%

Hipposcraper - Python Scripts for Automating Holberton Projects

[STATUS] As of 6/22/2019, this repo is no longer maintained by Derrick Gee and Brennan D Baraban. If you are looking for an updated version of this scraper, ask around or on the Holberton Slack to find someone who is maintaining a fork of this repo.

The Hipposcraper automates file template creation for Holberton projects. The program takes a link to a Holberton School project, scrapes the webpage, and creates the corresponding directory and files. The Hipposcraper currently supports the following:

| System Engineering | Low-Level Programming | Higher-Level Programming |
|---|---|---|
| Bash script templates | .c templates | .py and .c templates |
| | Header file | Header file |
| | _putchar file | |
| | main.c test files | main.c/main.py test files |
| README.md | README.md | README.md |
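
The file-creation step described above can be pictured with a minimal sketch. This is not the Hipposcraper's own code: the function name create_templates, the starter contents, and the header name holberton.h are all assumptions made for illustration.

```python
import os

# Hypothetical starter contents, keyed by file extension (assumptions,
# not the templates the Hipposcraper actually writes).
TEMPLATES = {
    ".c": '#include "holberton.h"\n',
    ".py": "#!/usr/bin/python3\n",
}
# Files without a known extension are treated as Bash scripts here.
DEFAULT_TEMPLATE = "#!/usr/bin/env bash\n"


def create_templates(directory, file_names):
    """Create `directory` and one starter file per name; return the paths."""
    if not os.path.isdir(directory):
        os.makedirs(directory)
    paths = []
    for name in file_names:
        extension = os.path.splitext(name)[1]
        path = os.path.join(directory, name)
        with open(path, "w") as handle:
            handle.write(TEMPLATES.get(extension, DEFAULT_TEMPLATE))
        paths.append(path)
    return paths
```

Calling create_templates("0x00-demo", ["0-main.c", "1-task.py"]) would create the directory and two starter files inside it.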

Getting Started 🔧

IMPORTANT: Make sure your version is up to date. The current version is shown at the top of the README, and running hippoproject or hipporead displays your version.

Follow these instructions to set up the Hipposcraper on your machine.

Prerequisites

The Hipposcraper relies on the Python packages Mechanize and BeautifulSoup4. Installation of these packages requires pip. If you are on a Debian-based Linux distribution:

sudo apt-get install python-pip

Once pip has been installed, install Mechanize and BeautifulSoup4 as follows:

pip install mechanize
pip install beautifulsoup4

Note that you may need to pass the --user option to pip when installing these packages.
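
To confirm that both packages installed correctly, a small check along the following lines may help. It is a sketch only and is not part of the Hipposcraper; the function name missing_packages is hypothetical. The import names are mechanize for Mechanize and bs4 for BeautifulSoup4.

```python
import importlib


def missing_packages(module_names):
    """Return the names from `module_names` that cannot be imported."""
    missing = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing
```

Running missing_packages(["mechanize", "bs4"]) with the interpreter you use for the scraper returns an empty list when both packages are available.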

Setup 🔑

Setting User Information

After cloning a local copy of the repository, enter your Holberton intranet username and password as well as your GitHub name, username, and profile link in the auth_data.json file.

  • Using setup.sh: Run ./setup.sh to set up the required information automatically.
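
If you fill in auth_data.json by hand, it is easy to leave a field blank. The sketch below reports any key whose value is empty. It makes no assumption about the key names the file uses; the function name empty_fields is hypothetical and this check is not part of the Hipposcraper.

```python
import json


def empty_fields(path):
    """Return the sorted keys in the JSON file whose values are blank."""
    with open(path) as handle:
        data = json.load(handle)
    return sorted(
        key for key, value in data.items()
        if not str(value).strip()
    )
```

An empty list from empty_fields("auth_data.json") means every field has some value; it does not verify that the credentials are correct.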

Setting Aliases

The Hipposcraper defines two separate Python scripts - one (hippoproject.py) that creates projects, and a second (hipporead.py) that creates README.md files. To run both simultaneously, you'll need to define an alias to the script hipposcrape.sh.

First, open the script and enter the full pathname to the Hipposcraper directory where directed. Then, if you work in a Bash shell, define the following in your .bashrc:

alias hipposcrape='ENTER_FULL_PATHNAME_TO_SCRAPER_DIRECTORY_HERE/hipposcrape.sh'

Alternatively, you can define separate aliases for each individual script. To define a project scraper alias:

alias hippoproject='ENTER_FULL_PATHNAME_TO_SCRAPER_DIRECTORY_HERE/hippoproject.py'

And to define a README.md scraper alias:

alias hipporead='ENTER_FULL_PATHNAME_TO_SCRAPER_DIRECTORY_HERE/hipporead.py'

NOTE: This program only works with Python 2; ensure that your aliases specify 'python2' (Mechanize is not supported by Python 3).
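
One way to catch a wrong interpreter early is a version guard at the top of a script. The following is a sketch of that idea, not code from the Hipposcraper; the function names is_python2 and require_python2 are hypothetical.

```python
import sys


def is_python2(version_info=None):
    """Return True when the (major, minor, ...) tuple describes Python 2."""
    if version_info is None:
        version_info = sys.version_info
    return version_info[0] == 2


def require_python2():
    """Exit with an explanatory message unless running under Python 2."""
    if not is_python2():
        sys.exit("This script expects Python 2; run it with 'python2'.")
```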


Usage 💻

After you have set up the proper aliases, you can run the Hipposcraper with the following command:

~$ hipposcrape project_link

Here, project_link is the URL of the Holberton School project to scrape.

Alternatively, to run only the project scraper:

~$ hippoproject project_link

Or only the README.md scraper:

~$ hipporead project_link
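
The relationship between the three commands can be sketched as follows. The contents of hipposcrape.sh are not shown in this README, so this assumes that the combined command simply runs the two Python scripts in turn with the same link; the function name build_commands is hypothetical.

```python
import os


def build_commands(scraper_dir, project_link, interpreter="python2"):
    """Return the two commands, in order, that one combined run would execute."""
    scripts = ["hippoproject.py", "hipporead.py"]
    return [
        [interpreter, os.path.join(scraper_dir, script), project_link]
        for script in scripts
    ]
```

Each returned list is in the form accepted by subprocess.call, so a wrapper could loop over the result and run the commands one after the other.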

check.sh is a generated script that checks the format of all required files:

~$ ./check.sh

Repository Contents

  • hipposcrape.sh

    • A Bash script for running the entire Hipposcraper at once.
  • hippoproject.py

    • Python script that scrapes a Holberton intranet webpage to create project directories.
  • hipporead.py

    • Python script that scrapes a Holberton intranet webpage to create the project README.md.
  • auth_data.json

    • Stores user Holberton intranet and GitHub profile information.
  • scrapers

    • Folder of file-creation scrapers.
      • base_parse.py: Python script for parsing project pages.
      • sys_scraper.py: Python methods for creating Bash task files for system engineering projects.
      • low_scraper.py: Python methods for creating _putchar.c, task files, and header file for low-level programming projects.
      • high_scraper.py: Python methods for creating Python task files for higher-level programming projects.
      • test_file_scraper.py: Python methods for creating test files for all project types.
  • setup.sh: Sets up all required variables and aliases.

  • autover.sh: Development tool for changing all version strings.
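
To give a feel for what parsing a project page involves, here is a minimal sketch that extracts file names from HTML using only the standard library. It is not the code in base_parse.py, which relies on BeautifulSoup4. The page structure it expects (a "File:" label followed by a code element holding comma-separated names) is an assumption, and the names FileNameParser and parse_file_names are hypothetical.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser  # Python 2


class FileNameParser(HTMLParser):
    """Collect names from every <code> element that follows a 'File:' label."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.in_code = False
        self.after_label = False
        self.file_names = []

    def handle_starttag(self, tag, attrs):
        if tag == "code":
            self.in_code = True

    def handle_endtag(self, tag):
        if tag == "code":
            self.in_code = False
            self.after_label = False

    def handle_data(self, data):
        text = data.strip()
        if self.in_code:
            if self.after_label:
                # Assumed format: names separated by commas.
                names = [name.strip() for name in text.split(",")]
                self.file_names.extend(n for n in names if n)
        elif text:
            # Only a label ending in "File:" marks the next <code> element.
            self.after_label = text.endswith("File:")


def parse_file_names(html):
    """Return the file names found in `html`, in document order."""
    parser = FileNameParser()
    parser.feed(html)
    parser.close()
    return parser.file_names
```

The resulting list could then be handed to whichever scraper creates the task files for the project type.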


[Image demo0: example of the C scraper]

[Image demo1: example of the README scraper]

[Image demo2: example of check.sh]


Contributors

234761, bdbaraban, kokopi-dev, narnat
