This crawler automates the partnership classification process for thousands of companies. It takes a CSV file containing organizations' domains and creates a new CSV file that lists the partner domains for each organization.
The web crawler takes one argument from the user: an input CSV file that contains the following columns:
- Organization Name - The name of the organization that the crawler will crawl.
- Website - The domain of the organization named in 'Organization Name'.
An example input file, 'Input.csv', is located in this repo.
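As a rough illustration of the input format, the sketch below reads such a file with Python's standard `csv` module. It assumes the header row uses exactly the column names listed above; the function name `read_organizations` and the sample rows are hypothetical and are not taken from the crawler itself.

```python
import csv
import io


def read_organizations(csv_file):
    """Return (organization name, website) pairs from an open CSV file.

    Assumes the header row is exactly: Organization Name,Website
    """
    reader = csv.DictReader(csv_file)
    return [
        (row["Organization Name"].strip(), row["Website"].strip())
        for row in reader
    ]


# Inline sample data standing in for a file such as Input.csv (made-up rows).
sample = io.StringIO(
    "Organization Name,Website\n"
    "Example Org,example.com\n"
    "Another Org,another.example\n"
)
organizations = read_organizations(sample)
```

To read a real file instead of the inline sample, pass an open file object, for example `open("Input.csv", newline="")`.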
The crawler outputs a CSV file that contains two columns:
- Organization Web Page - The domain of the website that has been crawled.
- Partners Web Page - The domains of the organization's partner companies.
The output file is saved in the same directory the app is running in.
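The output layout can be sketched in the same way. This is only an assumption about the row format: it writes one row per organization and partner pair, which may differ from how the crawler actually arranges the partner list. The function name `write_partners` and the sample domains are hypothetical.

```python
import csv
import io


def write_partners(out_file, partners_by_org):
    """Write the two-column output, one row per (organization, partner) pair.

    The one-row-per-pair layout is an assumption made for illustration.
    """
    writer = csv.writer(out_file)
    writer.writerow(["Organization Web Page", "Partners Web Page"])
    for org_domain, partner_domains in partners_by_org.items():
        for partner in partner_domains:
            writer.writerow([org_domain, partner])


# Write to an in-memory buffer so the example has no side effects.
buffer = io.StringIO()
write_partners(
    buffer,
    {"example.com": ["partner-a.example", "partner-b.example"]},
)
output_text = buffer.getvalue()
```

To save to disk, pass a file opened with `newline=""`, for example `open("output.csv", "w", newline="")`; the file name here is a placeholder.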
- This script was created by me in Summer 2019.
- It was created as part of a summer internship at EVERTHERE, and I was guided by the CTO & Co-Founder Gabriel Amram and Lead Architect Sofi Vasserman.
- Thank you, EVERTHERE, for giving me this opportunity. It was a great experience to learn from you guys!