2024-mgr-sluzba-cywilna's People

Contributors

konrad-wasiak

2024-mgr-sluzba-cywilna's Issues

Recruitment

Tasks:

  • what the recruitment procedure looks like

R code

My R code for loading and processing the data

library(RcppSimdJson)
library(data.table)
library(stringi) ## stringr
library(lubridate)

file <- "../data-raw/kprm-with-salary-update-20231115.json"

lines <- readLines(file)

lines |> 
  lapply(fparse) |>
  lapply(function(l) lapply(l, function(x) trimws(paste(x, collapse = "|-|")))) |>
  lapply(unlist) |>
  do.call("rbind", args = _) |>
  data.frame() |>
  setNames(c("job_id", "job_title", "institution", "city",
             "address", "work_place1", "work_place_dup", "work_place2", 
             "date_annouced", "date_result", "date_valid",
             "result", "salary", "vacancies", "work_time", "remove", "responsibilities",
             "education", "requirements1", "requirements2", "additional_corresp", "views")) |>
  setDT() -> lines_df

## drop the columns that are not needed (unique names avoid relying on duplicate-name matching)
lines_df[, work_place_dup:=NULL]
lines_df[, remove:=NULL]

lines_df[, job_id:= stri_extract_first_regex(job_id, "\\d{1,}")] ## strip the "NR" prefix, keep only the digits of the id
lines_df[, views:= as.numeric(stri_extract_first_regex(views, "\\d{1,}"))] ## keep only the numeric view count
lines_df[, address:=stri_trim_both(stri_replace_first_fixed(address, "Adres urzędu:", ""))]
lines_df[, work_place1:=stri_trim_both(stri_replace_first_fixed(work_place1, "Miejsce wykonywania pracy:", ""))]
lines_df[, result:=stri_trim_both(stri_replace_first_fixed(result, "Wyniki naboru:", ""))]
lines_df[, education:=stri_trim_both(stri_replace_first_fixed(education, "Wykształcenie:", ""))]

lines_df[, date_annouced:=dmy(date_annouced)]
lines_df[, date_result:=as.Date(stri_datetime_parse(date_result, "date_long", locale='pl_PL'))]
lines_df[, date_valid:=as.Date(stri_datetime_parse(date_valid, "date_long", locale='pl_PL'))]
lines_df[, date_documents:=dmy(stri_extract_first_regex(additional_corresp, "\\d{2}\\.\\d{2}\\.\\d{4}"))]

lines_df[, vacancies:=as.numeric(vacancies)]

lines_df[stri_detect_regex(result, "^anulowano nabór"), result:= stri_replace_first_fixed(result, "anulowano nabór", replacement = "anulowano nabór|")]
lines_df[stri_detect_regex(result, "^anulowano nabór"), result := stri_replace_last_fixed(result, "|", "")]
lines_df[!stri_detect_regex(result, "^anulowano nabór") & !stri_detect_fixed(result, "|"), result := stri_replace_first_fixed(result, "kandydata", "kandydata|")]
lines_df[, c("result1", "result2") := tstrsplit(result, "|", fixed = TRUE)]

fwrite(
  x = lines_df[, .(job_id, job_title, institution, city, address, work_place1, work_place2, 
                   date_annouced, date_result, date_valid, date_documents, result1, result2, 
                   salary, vacancies, work_time, responsibilities, education, requirements1, requirements2, views)],
  quote = TRUE,
  sep = ";",
  file = "../data/kprm-20231115.csv.gz")
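
The field-cleaning steps above (extracting the numeric id, stripping a fixed label, pulling a dd.mm.yyyy date out of free text) can be tried on a few rows without the raw JSON file. The following is a minimal sketch, not part of the original script: the toy rows are made up, and strip_label is a hypothetical helper introduced only to show the repeated replace-then-trim pattern.

```r
library(data.table)
library(stringi)
library(lubridate)

## Made-up rows that mimic the shape of the raw fields (not real data)
toy <- data.table(
  job_id = c("NR 123456", "NR 98"),
  work_place1 = c("Miejsce wykonywania pracy: Warszawa",
                  "Miejsce wykonywania pracy:  Gdynia "),
  additional_corresp = c("Termin: 30.11.2023", "no date given")
)

## Hypothetical helper: remove a fixed label, then trim surrounding whitespace
strip_label <- function(x, label) {
  stri_trim_both(stri_replace_first_fixed(x, label, ""))
}

## Same calls as in the script above, applied to the toy table
toy[, job_id := stri_extract_first_regex(job_id, "\\d{1,}")]
toy[, work_place1 := strip_label(work_place1, "Miejsce wykonywania pracy:")]
toy[, date_documents := dmy(stri_extract_first_regex(additional_corresp,
                                                     "\\d{2}\\.\\d{2}\\.\\d{4}"))]

print(toy)
```

Rows without a dd.mm.yyyy pattern end up with a missing date_documents, so it may be worth counting the NA values in that column before relying on it.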
