benabbes-slimane-takiedine / linkedin-scraper

This project forked from hakimkhalafi/linkedin-scraper

Module for scraping the contents of LinkedIn profiles found via search

License: MIT License

Python 100.00%

linkedin-scraper

This tool allows you to scrape LinkedIn profiles based on search queries. You can use the end result for NLP/text analysis.

This repo is a heavily modified version of dchrastil's ScrapedIn with the addition of in-profile scraping.

Prerequisites

You will need Python 3 or later.

First clone this repo, then navigate to the folder and install the requirements:

git clone https://github.com/hakimkhalafi/linkedin-scraper.git
cd linkedin-scraper
pip install -r requirements.txt

Configuring

There are four config settings you must change before the tool will run successfully.

In config.json, change the following values to match your LinkedIn sign-in details:

"username": "your-email@example.com",
"password": "password",

Additionally, you will need to extract two cookie values in order to log in successfully:

"li_at": "Aexampleexampleexamplee........",
"Csrf-Token": "ajax:1234567890123456789",

You can do this in Chrome by logging in to LinkedIn, right-clicking the page, and choosing "Inspect" -> "Application" tab -> "Cookies".

[Screenshot: getting cookie config settings]

Then double-click the relevant values marked in red and copy them. The value of the "JSESSIONID" cookie goes into "Csrf-Token", and the value of the "li_at" cookie goes into "li_at".
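Before running the tool, it can help to confirm that all four settings have actually been changed. The following is a minimal sketch, not part of the repo: it assumes config.json is a JSON object containing the four keys shown above, and the helper names `check_config` and `load_config` are hypothetical. The check that the token starts with "ajax:" is based only on the example value above.

```python
import json

# The four keys shown in the config.json excerpts above.
REQUIRED_KEYS = ("username", "password", "li_at", "Csrf-Token")

# Example values from the README that must be replaced with real ones.
PLACEHOLDERS = {
    "password": "password",
    "Csrf-Token": "ajax:1234567890123456789",
}


def check_config(config):
    """Return a list of human-readable problems found in a config dict."""
    problems = []
    for key in REQUIRED_KEYS:
        value = config.get(key)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"{key}: missing or empty")
        elif PLACEHOLDERS.get(key) == value:
            problems.append(f"{key}: still set to the example value")

    # Assumption: the JSESSIONID value looks like the README example ("ajax:...").
    # Surrounding double quotes, if copied along with the value, are tolerated.
    token = config.get("Csrf-Token")
    if isinstance(token, str) and token.strip():
        if not token.strip().strip('"').startswith("ajax:"):
            problems.append("Csrf-Token: expected a value starting with 'ajax:'")
    return problems


def load_config(path="config.json"):
    """Read the config file from disk (the path is an assumption)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

Calling `check_config(load_config())` from the repo folder returns an empty list when nothing looks wrong. It only inspects the file; it cannot tell whether the cookies are still valid.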

Running

Once you're set up and configured, you can run the tool with:

python run.py -s "search query"

Here, "search query" can be any job title, such as "Data Scientist".
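To collect results for several job titles, the command above can be driven from a short script. This is only a sketch under stated assumptions: it relies solely on the `-s` flag shown above, the helper names `build_command` and `run_queries` are hypothetical, and it assumes the script is started from the repo folder so that run.py and config.json are found.

```python
import subprocess
import sys


def build_command(query, script="run.py"):
    """Build the argument list for one run, mirroring `python run.py -s "query"`."""
    # sys.executable reuses the interpreter that has the requirements installed.
    return [sys.executable, script, "-s", query]


def run_queries(queries, script="run.py"):
    """Run the tool once per query, one after another.

    Returns a dict mapping each query to the exit code of its run.
    """
    exit_codes = {}
    for query in queries:
        completed = subprocess.run(build_command(query, script))
        exit_codes[query] = completed.returncode
    return exit_codes
```

For example, `run_queries(["Data Scientist", "Data Engineer"])` would start two runs in sequence. Passing the query as a separate list item avoids shell quoting problems with multi-word titles.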

The end result will be a CSV file containing the following columns:

Column name      Content
person_id        Identifier for profile
fs_profile       Main profile information
fs_position      All information for all job positions listed
fs_education     All information for all attained education
fs_language      Any languages the person speaks
fs_skill         Any skills the person has provided
fs_project       Any projects the person has completed
fs_honor         Any activities and honors the person has
fs_publication   Any publications the person has published
fs_course        Courses the person has completed
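As a first step toward text analysis, the resulting CSV can be inspected with Python's standard `csv` module. The sketch below counts how many profiles have a non-empty value in each column. It assumes the file has a header row with the column names listed above. The function name `count_filled_columns` and the raised field size limit are assumptions, not something the tool provides.

```python
import csv

# Assumption: cells such as fs_position may hold long text, so raise the
# csv module's default per-field size limit (131072 characters).
csv.field_size_limit(10_000_000)


def count_filled_columns(path):
    """Count, per column, how many rows have a non-empty value."""
    counts = {}
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        # Start every header column at zero so empty columns still appear.
        for name in reader.fieldnames or []:
            counts[name] = 0
        for row in reader:
            for name, value in row.items():
                # name is None for surplus cells; value is None for short rows.
                if name is not None and value and value.strip():
                    counts[name] += 1
    return counts
```

Calling `count_filled_columns("output.csv")`, where the file name is a placeholder for whatever file the tool writes, gives a quick view of which sections (skills, publications, and so on) are populated often enough to be worth analyzing.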

Enjoy!

Disclaimer

This educational tool probably violates LinkedIn's Terms of Service. Use at your own risk.
