labourr's People

Contributors

alekoure, peterthunder

labourr's Issues

Using custom dictionaries in labourR package

I have a feature request: it would be great if the package let users supply their own dictionary of names associated with each ISCO occupation.

This would be useful when applying the package to survey data. On the one hand, respondents' answers to questions about occupations often differ from the official occupation names, so a survey-specific dictionary could improve the quality of coding. On the other hand, there is a lot of data from previous research containing human-coded mappings of respondents' answers to ISCO codes, which could be used as dictionaries.
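To make the request concrete, here is a minimal sketch of what merging a user-supplied dictionary with the official labels could look like. It is written in Python purely for illustration and is not labourR's API: build_dictionary and the survey answers in custom are hypothetical, and only the two ISCO-08 titles come from this issue tracker.

```python
from collections import defaultdict


def build_dictionary(official, custom):
    """Merge official labels with survey-specific labels, keyed by ISCO code.

    Hypothetical helper: both arguments map an ISCO code to a list of labels.
    """
    merged = defaultdict(set)
    for source in (official, custom):
        for code, labels in source.items():
            # Normalise case and whitespace so duplicates collapse.
            merged[code].update(label.strip().lower() for label in labels)
    return {code: sorted(labels) for code, labels in merged.items()}


official = {
    "8331": ["Bus and tram drivers"],
    "8332": ["Heavy truck and lorry drivers"],
}
# Hypothetical human-coded survey answers from earlier research.
custom = {"8331": ["bus driver", "Coach driver "]}

merged = build_dictionary(official, custom)
```

The merged labels could then feed whatever tokenisation and weighting step the matcher already uses, so survey wording contributes to the match alongside the official names.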

Besides, I'd like to thank you for this very useful package!

tfidf matches

Hi there -

Really great to see an R package for converting occupation descriptions to ISCO-08 codes.

Could you describe how the tfidf_tokens dataset is built, though? Does it use only the description field?

Also, I think the matching algorithm could be improved. It currently just takes the sum of the tf-idf scores, which does not penalise a candidate occupation when a query term is missing from its tf-idf tokens.

For example,

'bus driver' (with num_leaves = 10) returns as its best match:

  • 8332 Heavy truck and lorry drivers

The weightTokens match for this is:

[Screenshot: weightTokens for 8332, 2022-11-18 at 13:48:53]

However, 8331 Bus and tram drivers is in the occupations dataset, and its weightTokens are:

[Screenshot: weightTokens for 8331, 2022-11-18 at 13:48:40]

Therefore, 8332 Heavy truck and lorry drivers is not penalised for lacking 'bus'.

I will see whether the matcher can apply a penalty when not all of the query words are in the weightTokens.
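One way such a penalty might work is to scale the summed score by the share of query tokens the candidate actually contains. The sketch below is in Python for illustration only and is not the package's implementation: the function names are hypothetical, and the weights are invented stand-ins because the real weightTokens values are only in the screenshots.

```python
def sum_score(query_tokens, weights):
    """Plain sum of tf-idf weights; missing tokens simply contribute 0."""
    return sum(weights.get(token, 0.0) for token in query_tokens)


def coverage_score(query_tokens, weights):
    """Sum of weights scaled by the fraction of query tokens that matched."""
    if not query_tokens:
        return 0.0
    matched = sum(1 for token in query_tokens if token in weights)
    return sum_score(query_tokens, weights) * matched / len(query_tokens)


def best_match(query, occupations, scorer):
    """Return the occupation code with the highest score for the query."""
    tokens = query.lower().split()
    return max(occupations, key=lambda code: scorer(tokens, occupations[code]))


# Invented weights (assumption): 'driver' weighs more in 8332 than
# 'bus' and 'driver' together weigh in 8331.
occupations = {
    "8332": {"heavy": 1.0, "truck": 1.5, "lorry": 1.5, "driver": 2.0},
    "8331": {"bus": 0.8, "tram": 0.8, "driver": 0.9},
}

plain = best_match("bus driver", occupations, sum_score)
penalised = best_match("bus driver", occupations, coverage_score)
```

With these made-up weights the plain sum picks 8332, while the coverage-scaled score picks 8331, because 8332 matches only one of the two query tokens. A subtractive penalty per missing token would be an alternative design; scaling has the advantage of needing no extra tuning parameter.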

Release labourR 1.0.0

Prepare for release:

  • Check that description is informative
  • Check licensing of included files
  • usethis::use_cran_comments()
  • devtools::check()
  • devtools::check_win_devel()
  • devtools::check(remote = TRUE, manual = TRUE)
  • rhub::check_with_sanitizers()
  • rhub::check_for_cran()
  • Polish pkgdown reference index
  • Draft blog post

Submit to CRAN:

  • usethis::use_version('major')
  • Update cran-comments.md
  • devtools::submit_cran()
  • Approve email

Wait for CRAN...

  • Accepted 🎉
  • usethis::use_github_release()
  • usethis::use_dev_version()
  • usethis::use_news_md()
  • Update install instructions in README
  • Finish blog post
  • Tweet
  • Add link to blog post in pkgdown news menu

language detection

Map the output of identify_language() to the existing ESCO languages and provide a list of them.
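A rough sketch of that mapping, in Python for illustration only: it assumes the detector yields language codes such as "en" or "en-GB", which the issue does not confirm. ESCO_LANGUAGES is a deliberately small illustrative subset, not the full ESCO list, and map_to_esco is a hypothetical name.

```python
# Illustrative subset only (assumption); the real ESCO language list is longer.
ESCO_LANGUAGES = {"en": "English", "fr": "French", "de": "German", "el": "Greek"}


def map_to_esco(detected_codes):
    """Split detected language codes into ESCO-supported and unsupported."""
    supported, unsupported = [], []
    for code in detected_codes:
        # Reduce regional variants such as "en-GB" or "en_GB" to "en".
        base = code.strip().lower().replace("_", "-").split("-")[0]
        target = supported if base in ESCO_LANGUAGES else unsupported
        if base not in target:  # keep first occurrence, preserve order
            target.append(base)
    return {
        "supported": [(c, ESCO_LANGUAGES[c]) for c in supported],
        "unsupported": unsupported,
    }


result = map_to_esco(["EN", "el", "en-GB", "xx"])
```

Returning both lists lets the caller report which detected languages cannot be matched against ESCO rather than silently dropping them.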

concordances

Hi - any chance of offering concordance functions with NOC?

great stuff!
