ropensci-archive / refimpact

:no_entry: ARCHIVED :no_entry: API Wrapper for the UK REF 2014 Impact Case Studies Database

License: Other

R 100.00%
uk research-funding research-improvement directed-graph directed-graphs text-mining research-policy r rstats r-package

refimpact's Introduction

refimpact

Project Status: Abandoned

This repository has been archived. The former README is now in README-NOT.md.

refimpact's People

Contributors

karthik, maelle, perrystephenson

refimpact's Issues

Fails to build

The CI fails to build the package:

* checking for file 'refimpact/DESCRIPTION' ... OK
* preparing 'refimpact':
* checking DESCRIPTION meta-information ... OK
* installing the package to build vignettes
* creating vignettes ... ERROR
--- re-building 'refimpact.Rmd' using rmarkdown
refimpact: API wrapper for the UK REF 2014 Impact Case Studies Database.
Run ?refimpact for help, or see the vignette.
Quitting from lines 71-74 (refimpact.Rmd) 
Error: processing vignette 'refimpact.Rmd' failed with diagnostics:
incorrect number of dimensions
--- failed re-building 'refimpact.Rmd'

SUMMARY: processing the following file failed:
  'refimpact.Rmd'
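
For context, R raises "incorrect number of dimensions" when an object is indexed with more subscripts than it has dimensions. The sketch below reproduces that message in isolation. It is not a diagnosis of lines 71-74 of the vignette; it only assumes, as one plausible cause, that a result expected to be a table came back as a plain vector. The names df and col are illustrative.

```r
# A one-column data frame, used only for illustration
df <- data.frame(id = 1:3)

# With the default drop = TRUE, selecting a single column yields a plain vector
col <- df[, 1]

# Indexing that vector as if it were still two-dimensional fails
msg <- tryCatch(col[1, ], error = function(e) conditionMessage(e))
print(msg)

# Keeping the data frame structure avoids the error
first_row <- df[, 1, drop = FALSE][1, , drop = FALSE]
print(dim(first_row))
```

If the vignette chunk does something similar, checking the class and dim() of the object returned by the API call just before the failing lines would confirm or rule this out.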

Consider object_verb function naming pattern

This is minor, but the naming could follow the object_verb guidelines suggested in the rOpenSci Packaging Guide, possibly casestudies_get, institutions_get, etc., or even ref_casestudies_get, ref_institutions_get, ... to make it clear that these all come from this package. This makes it more like the stringi example in the packaging guidelines. For such a short package this is more a matter of preference than any sort of requirement. (But since rOpenSci is designed to keep the R jungle tidy, why not start with even the smallest vines.)
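
One way such a rename could be introduced without breaking existing code is sketched below. This is an assumption about how a migration might be handled, not something the package does: the function body is a hypothetical stand-in, and only the names get_institutions and ref_institutions_get come from the discussion above.

```r
# New object_verb name with a package prefix.
# The body is a placeholder; the real function would call the API.
ref_institutions_get <- function() {
  data.frame(UKPRN = integer(0), Name = character(0))
}

# Old name kept as a thin wrapper that warns and forwards to the new one
get_institutions <- function() {
  .Deprecated("ref_institutions_get")
  ref_institutions_get()
}

# Calling the old name still works, but signals a deprecation warning
res <- suppressWarnings(get_institutions())
print(identical(res, ref_institutions_get()))
```

The same pattern would apply to the other exported functions if the prefix were adopted.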

Address release/NEWS.md mismatch

NEWS.md: This file has no details besides declaring the "initial release", although there are two tagged releases in the GitHub repository. The release announcement in NEWS.md could link to the v1.0 release tag.

Review function and package documentation

This is where the package is weakest, since the documentation is limited to the barest descriptions of the functions. Even those say little about the context of the options or inputs: for instance, the help for get_case_studies() did not really explain what the ID is or why I might use it instead of a UKPRN. I suggest that this be vastly expanded in a Details section and added as an overview to refimpact-package.Rd. I had to do considerable testing and reading on the UK REF website before I understood what the inputs to the functions were.

Extend unit tests

For instance, the single test in test_tag_types.R is:

test_that("get_tag_types() returns a tibble", {
  # skip_on_cran()
  expect_equal(dim(get_tag_types()),c(13,2))
})

which not only fails to test whether the return value is a tibble, but also only matches dimensions known in advance. A more robust test might also compare the returned values to the known tags from the REF website, or check that the return value is in fact an object of class tibble.
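
A slightly stronger version of this test might look like the sketch below. So that it runs offline, it calls stub_get_tag_types(), a hypothetical stand-in whose column names and values are placeholders rather than real API output. In the package itself the stub would be replaced by get_tag_types(), with skip_on_cran() re-enabled.

```r
library(testthat)

# Hypothetical stand-in for get_tag_types(); the columns are placeholders.
# The class vector mimics a tibble without requiring the tibble package.
stub_get_tag_types <- function() {
  structure(
    data.frame(ID = 1:13, Name = paste("tag type", 1:13)),
    class = c("tbl_df", "tbl", "data.frame")
  )
}

test_that("get_tag_types() returns a 13 x 2 tibble", {
  res <- stub_get_tag_types()
  expect_s3_class(res, "tbl_df")      # the class check the original test omits
  expect_equal(dim(res), c(13L, 2L))  # the dimension check from the original test
})
```

Comparing the returned values against the tag types published on the REF website would need that reference list, so it is not shown here.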

Think about whether hybrid data/API makes sense

As someone who was actually submitted as an individual in this exercise (although not for a case study) from one of the listed institutions, and who was personally involved in managing the staff submitted from my institution, my first impression was that this would be a really cool package to have for accessing this data. However, the more I experimented with the package, the more I wondered why the API approach was needed. The data is static, so the only reasons not to package the data are copyright and size. Most of the information (except some of the case studies - see http://impact.ref.ac.uk/CaseStudies/Terms.aspx) is governed by a CC license, and so could easily be packaged as data. The only objection on size applies to the case studies themselves, but again, if the README.md or the documentation said more about the motivation, I would have a better idea of just how large this is (and whether that size rules out simply providing it as a large flattened data.frame or "tibble").

The following static tables from the API are CC licensed and could easily be packaged as built-in objects:

  • institutions: a 155 x 5 data.frame, 20.8k in size

  • units_of_assessment: 36 x 3

  • tag_types: 13 x 2

  • values: much larger, but the entire table could be flattened in a way that links to tag_types, if we are willing to strongly suspend the principles of relational data normalization (something most users may not know or care about).

This seems to gut the functions from the package, since it leaves only get_case_studies(), which might be appropriately handled through an API call. But here I suggest the package could really add value by providing data-handling functions that link the static data objects to the structure of what get_case_studies() returns, such as ways to flatten the lists that are elements of the objects returned by that function. For instance, the object returned by get_case_studies(ID = c(27,29)) is a 2 x 19 tibble, but several of its columns (e.g. Continent) are variable-length lists. Many users who are not experts in dissecting R objects are going to have trouble with the nesting of lists within data.frames.
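
A helper of the kind suggested here could be sketched as follows, in base R. It assumes each element of the list column is a plain atomic vector; the real return value of get_case_studies() may nest data frames instead, in which case the unlist() step would need adapting. The function name flatten_list_column, the CaseStudyId column, and the sample values are all hypothetical.

```r
# Stand-in for a get_case_studies() result with one list-column
studies <- data.frame(CaseStudyId = c(27, 29))
studies$Continent <- list(c("Europe", "Asia"), "Europe")

# Expand one list-column so that each list element gets its own row
flatten_list_column <- function(df, col) {
  lens <- lengths(df[[col]])
  # Repeat each row once per element in its list entry
  out <- df[rep(seq_len(nrow(df)), lens), setdiff(names(df), col), drop = FALSE]
  out[[col]] <- unlist(df[[col]])
  rownames(out) <- NULL
  out
}

flat <- flatten_list_column(studies, "Continent")
print(flat)
```

Rows whose list entry is empty are dropped by this approach, so a real helper would have to decide whether to keep them with an NA instead.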

In addition, by having the smaller objects as built-in data, the inputs to get_case_studies() can be checked for valid values, rather than relying on the API to reject a non-existent ID, for instance.
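
As a rough illustration of that check, the sketch below validates UKPRN values against a built-in table before any request is made. The institutions data frame here is a two-row placeholder with made-up values, and check_ukprn() is a hypothetical helper, not part of the package.

```r
# Placeholder for the packaged institutions table; the values are invented
institutions <- data.frame(
  UKPRN = c(10000001, 10000002),
  Name  = c("Example University A", "Example University B")
)

# Stop early, with a readable message, if any value is not in the table
check_ukprn <- function(ukprn, known = institutions$UKPRN) {
  bad <- setdiff(ukprn, known)
  if (length(bad) > 0) {
    stop("Unknown UKPRN value(s): ", paste(bad, collapse = ", "), call. = FALSE)
  }
  invisible(ukprn)
}

check_ukprn(10000001)  # passes silently
err <- tryCatch(check_ukprn(c(10000001, 99)), error = function(e) conditionMessage(e))
print(err)
```

get_case_studies() could call such a helper first, so that an invalid value fails locally rather than after a round trip to the API.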

Appveyor builds and webhooks

👋 @perrystephenson!

I'm currently looking for "broken" Appveyor hooks across all ropensci repos. The one for refimpact is broken (see https://github.com/ropensci/refimpact/settings/hooks/10055678): the latest deliveries failed, so your latest commits weren't built on Appveyor.

I see builds are under your account, which is fine. You'll need to contact Appveyor at [email protected] to ask them (how) to fix the webhook. I did that for other repos; they'll want to know this about the latest delivery:

  • X-GitHub-Delivery: 7b312b20-7266-11e7-8a00-03912236c505

  • Time 2017-07-27 02:56:53

Thanks!
