Projects.co.id is an Indonesian online freelancing platform that helps project owners find talent for their projects. For freelancers, browsing the website for suitable projects is time-consuming. This Python script scrapes all available projects, along with their detailed information, from the website.
Users of this script must abide by the website's rules, use the data only with good intentions, and avoid sending too many requests to the server.
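One way to keep the request rate low is through Scrapy's built-in throttling settings. The fragment below is a minimal sketch of what a project's `settings.py` could contain. The setting names are standard Scrapy settings, but the values are illustrative assumptions and are not taken from this repository.

```python
# settings.py (sketch): conservative crawl settings.
# All values below are assumptions chosen for illustration.

# Respect the site's robots.txt rules.
ROBOTSTXT_OBEY = True

# Wait at least this many seconds between requests to the same site.
DOWNLOAD_DELAY = 2.0

# Send only one request at a time to the domain.
CONCURRENT_REQUESTS_PER_DOMAIN = 1

# Let Scrapy adapt the delay to the server's response times.
AUTOTHROTTLE_ENABLED = True
AUTOTHROTTLE_START_DELAY = 2.0
AUTOTHROTTLE_MAX_DELAY = 30.0
AUTOTHROTTLE_TARGET_CONCURRENCY = 1.0
```

Higher delays make a full crawl slower but put less load on the server.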
- Python 3.6 or above
- Scrapy
- Download repository
git clone https://github.com/ceedadev/projects.co.id-scraper.git
cd projects.co.id-scraper
- (Optional) Create a Python virtual environment
python3 -m venv venv
- Mac / Linux
source venv/bin/activate
- Windows Powershell
.\venv\Scripts\Activate.ps1
- Install requirements
pip install -r requirements.txt
- Run Spider
- Output CSV
scrapy crawl projects -O projects.csv
- or Output JSON
scrapy crawl projects -O projects.json
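Once the spider has written `projects.json`, the results can be searched locally instead of browsing the website. The sketch below assumes only that the file is the JSON array that Scrapy's JSON exporter produces. Because the item field names are not documented here, it matches the keyword against every string field rather than a specific one. The helper names `load_projects` and `filter_projects` are hypothetical and not part of this repository.

```python
import json
from pathlib import Path


def load_projects(path):
    """Load the JSON array written by `scrapy crawl projects -O projects.json`."""
    with Path(path).open(encoding="utf-8") as f:
        return json.load(f)


def filter_projects(projects, keyword):
    """Return the items in which any string field contains the keyword.

    The match is case-insensitive and does not assume any field names.
    """
    needle = keyword.lower()
    return [
        item
        for item in projects
        if any(isinstance(v, str) and needle in v.lower() for v in item.values())
    ]


if __name__ == "__main__":
    # Example: print every scraped project that mentions "python".
    for item in filter_projects(load_projects("projects.json"), "python"):
        print(item)
```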
- Implement Scrapy
- ScrapyRT for API
- Item Pipelines to SQL DB
- Perform tracking of projects
- SMTP Service for new and tracked project tags
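As a rough illustration of the "Item Pipelines to SQL DB" item, a pipeline could look like the sketch below. It is not code from this repository. The class name `SQLitePipeline`, the `projects.db` file, and the table layout are all assumptions, and each item is stored as a JSON blob because the item fields are not documented here. The method names follow Scrapy's item pipeline interface.

```python
import json
import sqlite3


class SQLitePipeline:
    """Hypothetical pipeline that stores each scraped item as a JSON blob."""

    def __init__(self, db_path="projects.db"):
        # Assumed database location; adjust to the project's needs.
        self.db_path = db_path
        self.conn = None

    def open_spider(self, spider):
        # Called when the spider starts: open the database and create the table.
        self.conn = sqlite3.connect(self.db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS projects ("
            "id INTEGER PRIMARY KEY, data TEXT NOT NULL)"
        )

    def process_item(self, item, spider):
        # Serialize the whole item so no field names need to be known up front.
        self.conn.execute(
            "INSERT INTO projects (data) VALUES (?)",
            (json.dumps(dict(item)),),
        )
        self.conn.commit()
        return item

    def close_spider(self, spider):
        # Called when the spider finishes: release the database connection.
        self.conn.close()
```

A pipeline like this would be enabled through the `ITEM_PIPELINES` setting, with a module path that depends on the project layout.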