find-my-job's Introduction

Find My Future Job

A web-scraping project, written in Python, that finds job announcements on platforms such as LinkedIn according to user-specified keywords (used as search query terms), for example:

  • Discipline / studied field of expertise
  • Industry
  • Demanded skills

It can also be used to extract data related to the search, such as:

  • The most requested skills in the job announcements that the user's search query yields
  • The regions with the most job announcements

Examples

This section lists some examples of possible uses of the package.

Example 1 - Searching for Aerospace Engineering Jobs

aerospace engineering cfd fem aerodynamics optimization | optimisation

Example 2 - Searching for Data Scientist Jobs

programming modelling statistics database spreadsheet visualization business intelligence

Example 3 - Searching for Most Requested Skills for ML Engineer

mlops excel aws azure sas powerbi tableau excel spreadsheet sql scala jira apache spark hadoop r python javascript mssql postgresql mysql nosql datastax cassandra mongodb ETL linux devops git github
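A keyword list like the one in Example 3 could be turned into a "most requested skills" ranking by counting how many job descriptions mention each keyword. The following is only a minimal sketch under assumptions: the function name `count_skills` and the sample descriptions are hypothetical and not part of the package, and the real tool may count skills differently.

```python
import re
from collections import Counter


def count_skills(descriptions, skills):
    """Count in how many job descriptions each skill keyword appears."""
    counts = Counter()
    for text in descriptions:
        lowered = text.lower()
        for skill in skills:
            # Match the keyword as a whole token, so that "r" is not
            # found inside "spark" and "git" is not found inside "github".
            pattern = r"(?<![a-z0-9])" + re.escape(skill.lower()) + r"(?![a-z0-9])"
            if re.search(pattern, lowered):
                counts[skill] += 1
    return counts


# Hypothetical job description texts, made up for illustration only.
descriptions = [
    "We need Python and SQL experience; AWS is a plus.",
    "Strong SQL, Tableau and Excel skills required.",
    "MLOps engineer with Python, Git and Linux background.",
]
skills = ["python", "sql", "aws", "r", "git"]

skill_counts = count_skills(descriptions, skills)
for skill, n in skill_counts.most_common():
    print(skill, n)
```

Counting each skill once per announcement, rather than once per occurrence, keeps a single repetitive posting from dominating the ranking.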

Technical Aspects

Method

This section will describe how the functions work.

Assumptions

This section lists the assumptions behind the method.

The JobId obtained from the URL of an individual job posting is unique and can be used as a primary key to access that particular job posting.
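The JobId assumption could be applied roughly as in the sketch below. The URL shape (`/jobs/view/<digits>`), the `example.com` address, and the helper names `extract_job_id` and `add_posting` are all assumptions made for illustration; the actual URL format of a given platform must be checked before relying on this pattern.

```python
import re
from urllib.parse import urlparse

# Assumed URL shape: ".../jobs/view/<digits>". This is a guess for
# illustration, not a documented format of any particular website.
JOB_ID_PATTERN = re.compile(r"/jobs/view/(\d+)")


def extract_job_id(url):
    """Return the numeric job ID found in the URL path, or None."""
    match = JOB_ID_PATTERN.search(urlparse(url).path)
    return match.group(1) if match else None


def add_posting(postings, url, title):
    """Store a posting keyed by its job ID; skip unparsable URLs and duplicates."""
    job_id = extract_job_id(url)
    if job_id is None or job_id in postings:
        return False
    postings[job_id] = {"url": url, "title": title}
    return True


postings = {}
url = "https://www.example.com/jobs/view/1234567890?refId=abc"  # hypothetical URL
first = add_posting(postings, url, "Aerospace Engineer")
second = add_posting(postings, url, "Aerospace Engineer")  # same ID, rejected
print(extract_job_id(url), first, second)
```

Using the ID as a dictionary key mirrors its intended role as a primary key: a posting that shows up in several search results is stored only once.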

Developer Roadmap

For the minimum viable product:

  • Decide which job-search website(s) this tool should be compatible with

    • Inspect the source code of those job pages and check whether the pages are static or dynamic once the search query response is generated.

    • Accordingly, choose a web-scraping method, i.e. decide whether to use plain HTTP requests (for static pages) or Selenium with WebDriver (for dynamic pages).

    • Check whether the job search can be accessed without authentication. If not, create a burner account on the website for testing purposes.

    • Check if there are any designated API endpoints to obtain job search query results.

      • From a quick search, it seems that LinkedIn previously offered such functionality, but it appears to have been deliberately deprecated to avoid competition.
  • Decide on how the user will be using the tool

    • Input:
      • Maybe with a configuration.ini file
      • Maybe with a JSON object
      • Will it be a tool to be executed from the command line?
    • Output:
      • Which details are to be recorded?
        • If used for job announcement matching, it should at least record ANNOUNCEMENT_ID, ANNOUNCEMENT_URL, SCORE-or-PRIORITY
        • If used for inspecting most demanded skills, it should at least record SKILL_IDENTIFIER-or-_NAME, NUMBER_OF_OCCURRENCES-or-SCORE
      • Decide on the way(s) to record the results from scraping data from the web.
        • Possibly in a .csv file as backup
        • Into a Database with some prior "cleaning"
  • Implement asynchronous calls to web-scraping functions

    • The server may present CAPTCHA tests if it suspects that a bot is issuing repeated consecutive requests, so some testing may be needed to find a request timing that does not overload the interface.
  • Create modules for different classes and a collection for helper functions

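The roadmap item on asynchronous calls, together with the concern about CAPTCHA tests, could be approached roughly as in the sketch below. It limits how many requests run at once and adds a random pause before each one. The names `fetch_posting` and `scrape_all`, and all timing values, are hypothetical; `fetch_posting` only simulates a request and would have to be replaced by a real scraping call.

```python
import asyncio
import random


async def fetch_posting(job_id):
    """Placeholder for a real scraping call; it only simulates latency."""
    await asyncio.sleep(0.01)
    return {"ANNOUNCEMENT_ID": job_id}


async def scrape_all(job_ids, max_concurrent=2, min_delay=0.01, max_delay=0.03):
    """Fetch all postings with bounded concurrency and randomized delays."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def polite_fetch(job_id):
        async with semaphore:
            # Random pause so requests are not sent in a fixed rhythm.
            await asyncio.sleep(random.uniform(min_delay, max_delay))
            return await fetch_posting(job_id)

    # gather() returns results in the same order as the input IDs.
    return await asyncio.gather(*(polite_fetch(j) for j in job_ids))


results = asyncio.run(scrape_all(["101", "102", "103"]))
print(results)
```

The delay values here are kept tiny so the sketch runs quickly; suitable values for a real website would have to be found by testing, as noted in the roadmap.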

Developer Notes

:TODO:

:REVIEW: May require delving into fuzzy string searches.
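As a starting point for the fuzzy string search mentioned above, the standard library's `difflib` can map misspelled or differently cased skill names to a known vocabulary. This is only a sketch under assumptions: the `KNOWN_SKILLS` list, the `normalize_skill` helper, and the cutoff of 0.8 are hypothetical choices, and a dedicated fuzzy-matching library may turn out to be a better fit.

```python
from difflib import get_close_matches

# Hypothetical vocabulary of canonical skill names.
KNOWN_SKILLS = ["postgresql", "powerbi", "javascript", "mongodb", "tableau"]


def normalize_skill(raw, known_skills=KNOWN_SKILLS, cutoff=0.8):
    """Return the closest known skill name, or None if nothing is similar enough."""
    matches = get_close_matches(raw.lower(), known_skills, n=1, cutoff=cutoff)
    return matches[0] if matches else None


for raw in ["PostgreSQL", "javscript", "cobol"]:
    print(raw, "->", normalize_skill(raw))
```

A higher cutoff reduces false matches between short, similar keywords at the cost of missing more heavily misspelled ones.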

find-my-job's People

Contributors

b-krtls
