
redcapapi's Introduction


redcapAPI

redcapAPI is an R package to pull data from a REDCap project. Its design goes far beyond a 'thin' client which just exposes the raw REDCap API into R. One principal goal is to get data into memory using base R in a format that is analysis ready with a minimum of function calls. There are over 7,000 institutions and 3 million users of REDCap worldwide collecting data. Analysis in R for monitoring and reporting that data is a common concern for these projects.

Core concerns handled by the library:

  • API_KEY (which is the equivalent of a username/password to one's data!) secure handling practices are designed to be as seamless as possible via unlockREDCap. There are override methods available for production environments.
  • Retry strategy with exponential backoff. When a REDCap server or a network is overloaded, requests can fail. Each call to the API will retry multiple times, doubling the wait time between attempts. This dramatically increases the odds of success for a script with multiple API calls to REDCap.
  • Automatically handles and caches the metadata needed to understand and translate a project's data.
  • A robust type casting strategy in which every step of the process can be overridden by the user via inversion of control. The strategy proceeds as follows:
    • NA detection per REDCap (or user!) definition of NA.
    • Validation of data versus the target type/class. reviewInvalidRecords provides a summary report of all data that fails validation, with hot links to the record in question. This is an important step. Data that does not match the target format cannot be cast, e.g. "xyz" cannot be treated as a numeric and will become NA in the final data set.
    • Final type casting to target type.
  • Sparse block matrix splitting into forms/instruments with filtering of empty rows.
  • Additional helper functions, e.g. longitudinal wider/long conversions, guessing if a character field is actually a date, and SAS exports.
  • Importing data reuses many of the casting functions in reverse to ensure data integrity in both directions.
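
The three-stage casting strategy listed above can be illustrated with a minimal base R sketch. It does not use redcapAPI's internals: `is_na`, `is_valid`, `cast_numeric`, and `cast_field` are hypothetical stand-ins for the NA-detection, validation, and cast callbacks that a user could override.

```r
# Hypothetical stand-ins for the three overridable steps (not redcapAPI functions).
is_na        <- function(x) is.na(x) | x == ""
is_valid     <- function(x) grepl("^-?[0-9]+(\\.[0-9]+)?$", x)
cast_numeric <- function(x) as.numeric(x)

cast_field <- function(raw, na = is_na, validate = is_valid, cast = cast_numeric) {
  missing <- na(raw)                                # 1. NA detection
  valid   <- !missing & validate(raw)               # 2. validation against the target type
  out     <- rep(NA_real_, length(raw))
  out[valid] <- cast(raw[valid])                    # 3. cast only what validated
  attr(out, "invalid") <- which(!missing & !valid)  # positions worth reporting
  out
}

result <- cast_field(c("1.5", "", "xyz", "42"))
```

Here "xyz" is neither missing nor valid, so it ends up as NA in `result` and its position is recorded in the "invalid" attribute, which mirrors the role the validation report plays in the description above.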

Comparison to Other REDCap Packages

Feature redcapAPI REDCapR REDCapExporter tidyREDCap REDCapTidieR REDCapDM
CRAN Downloads
Export Data To R
Import Data From R
Sparse Block Splitting
Field Labeling
Attribute Processing
Logical Expression Query partial
Tidy/Tibble Support
Data Summary
Type Conversion Callbacks
API Failure Auto-Retry
Secure API Key Storage
Validation Reporting
Extensive Test Suite
Logfile Processing
Offline Calculated Fields

Quick Start Guide

There are 2 basic functions that are key to understanding the core approach:

  • unlockREDCap
  • exportBulkRecords

Here's a typical call for these two:

library(redcapAPI)

# IMPORTANT: Put the following line in .Rprofile `usethis::edit_r_profile()`
options(keyring_backend=keyring::backend_file)

unlockREDCap(c(rcon    = '<MY PROJECT NAME>'),
             keyring     = 'API_KEYs',
             envir       = globalenv(),
             url         = 'https://<REDCAP_URL>/api/')
exportBulkRecords(list(db = rcon),
  forms = list(db = unique(rcon$metadata()$form_name)),
  envir = globalenv())

The <MY PROJECT NAME> is a reference for whatever you wish to call this REDCap project. The rcon is the variable you wish to assign it to. The keyring is a name for this key ring. If you use 'API_KEYs' for all your projects, you'll have one big keyring holding all your API_KEYs, locally encrypted. The url is the standard URL for the API at your institution. The envir argument is where to write the connection object; if not specified, the call returns a list.

The next call, to exportBulkRecords, says to export by form and leave out records not filled out and columns not part of a form. The first argument specifies a db reference to the connection opened and gives it that name. The second argument says, for this connection, to export all the forms/instruments present in that db; if this is left blank it defaults to all forms/instruments. The envir argument has it write the results back to the global environment as variables. Any parameter not recognized is passed to the exportRecordsTyped call for every REDCap database connection. For most analysis projects, exportBulkRecords provides the functionality required to get the data in memory, converted, type cast, and sparse block matrix split into forms/instruments with blank rows filtered out.
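
To make "sparse block matrix split" concrete, here is a minimal base R sketch of the idea on toy data. It is not the package's implementation: `split_by_form`, the `form_fields` lookup, and the column names are assumptions made for illustration.

```r
# Toy export: each record only fills in the fields of one form.
records <- data.frame(
  record_id = c(1, 2, 3),
  age       = c(34, NA, 51),      # form "demographics"
  sex       = c("F", NA, "M"),
  lab_value = c(NA, 7.2, NA),     # form "labs"
  stringsAsFactors = FALSE
)
# Hypothetical lookup of which fields belong to which form.
form_fields <- list(demographics = c("age", "sex"), labs = "lab_value")

split_by_form <- function(data, form_fields, id = "record_id") {
  lapply(form_fields, function(fields) {
    block <- data[, c(id, fields), drop = FALSE]
    # Keep a row only if at least one of the form's own fields is filled in.
    keep  <- rowSums(!is.na(block[, fields, drop = FALSE])) > 0
    block[keep, , drop = FALSE]
  })
}

forms <- split_by_form(records, form_fields)
```

The result is a named list with one data.frame per form, each holding only that form's columns and only the rows that contain data for it.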

These two calls will handle most analysis requests. To truly understand all these changes see: vignette("redcapAPI-best-practices").

Version 2.7.0+

2.7.0 introduced exportRecordsTyped which is a major move forward for the package. It replaces exportRecords with a far more stable and dependable call. It includes retries with exponential back off through the connection object. It has inversion of control over casting, and has a useful validation report attached when things fail. It is worth the time to convert calls to exportRecords to exportRecordsTyped and begin using this new routine. It is planned that in the next year exportRecords will be removed from the package.

Community Guidelines

This package exists to serve the research community and would not exist without community support. We are interested in volunteers who would like to translate the documentation into other languages.

Contribute

If you wish to contribute new features to this software, we are open to pull requests. Before doing a lot of work, it would be best to open an issue to discuss your idea.

Coding Style Guideline Note

  • Exported function names: dromedaryCase
  • Internal function names: .dromedaryCase
  • Constant data exported: UPPERCASE
  • Function parameters: snake_case
  • Function variables: snake_case
    • (exception) data.frame variable: CamelCase

Report Issues or Problems

REDCap and its API have a large number of options and choices, and with such complexity the possibility of bugs increases as well. This is a checklist for troubleshooting exports.

  1. Does Rec <- exportRecordsTyped(rcon) give you a warning about data that failed validations? If so, what kind of content are you seeing from reviewInvalidRecords(Rec)?
  2. Did you see 'choice string does not appear to be formatted for choices' as an error? If so see Issue #344
  3. What is returned by exportRecordsTyped(rcon, validation = skip_validation, cast = raw_cast)? This is a completely raw export with no processing by the library.
  4. Do you have any project level missing data codes? rcon$projectInformation()$missing_data_codes
  5. Do you have a secondary id field defined? rcon$projectInformation()$secondary_unique_field. In earlier versions, REDCap will report one even if it has since been disabled. If this column doesn't exist, the library is unable to properly handle exports because the definition of the unique key doesn't exist. If one is defined and the field doesn't exist, you will have to contact your REDCap administrator to get the project fixed.
  6. Is it an empty row filtering issue? Try the option filter_empty_rows=FALSE and see if that fixes it.
  7. Search known open and closed issues to see if it's already been reported. If an issue matches your problem, then feel free to post a "me too" message with the information from the next step. Feel free to reopen a closed issue if one matches.
  8. If these steps fail to diagnose the issue, open an issue on github.com and we are happy to assist you. Please include your version of R, RStudio and packageVersion('redcapAPI').

What does "Project contains invalid characters. Mapped to '□'." mean?

This means that the data/meta-data stored in the REDCap database contains improperly encoded characters. It is a problem with the REDCap project itself. The authors of this library do not know the root cause of the encoding issue, but suspect it was an earlier version of REDCap that did not handle encoding properly. This library respects the reported encoding type when loading into memory. All cases seen to date have the data encoded in ISO-8859-1 (the default when the HTTP header is missing charset) while the REDCap server treats all data as UTF-8. This improper encoding can result in data loss via the GUI if records are updated. It is best to discuss with your institution's REDCap administrator how to repair this problem; such repairs are outside the scope of this library. This error message is meant to make you aware of the issue in your project. The library does the best it can when it encounters encoding issues.
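
The mismatch described above can be reproduced in base R. This is only an illustration of the ISO-8859-1 versus UTF-8 problem, not the library's own handling; the "?" substitution character is an arbitrary choice for the example.

```r
# "café!" stored as ISO-8859-1 bytes: the é is the single byte 0xE9.
latin1_bytes <- rawToChar(as.raw(c(0x63, 0x61, 0x66, 0xe9, 0x21)))

# Read as UTF-8, that byte sequence is not valid.
is_utf8 <- validUTF8(latin1_bytes)

# Declaring the true source encoding lets the text be converted cleanly.
repaired <- iconv(latin1_bytes, from = "latin1", to = "UTF-8")

# Without that knowledge, the invalid byte can only be substituted.
mapped <- iconv(latin1_bytes, from = "UTF-8", to = "UTF-8", sub = "?")
```

`is_utf8` is FALSE for the raw bytes, while `repaired` holds the same text as valid UTF-8. `mapped` shows the lossy fallback, where the offending byte is replaced rather than recovered.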

Seek Support

If you need help or assistance in understanding how to approach a project or problem using the library, please open an issue. We use these questions to refine the documentation. Thus asking questions contributes to refinement of documentation.

Documentation

Your institution's installation of REDCap contains a lot of documentation for the general usage of REDCap. For general questions outside the scope of interfacing the API to R, please refer to your institution's REDCap instance documentation.

The help pages for functions are fairly extensive. Try ?exportRecordsTyped or ?fieldValidationAndCasting for good starting points into the help pages.

All Vignettes

There are several vignettes with helpful information and examples to explore. These provide higher level views than can be provided in help pages.

  • redcapAPI-casting-data
  • redcapAPI-data-validation
  • redcapAPI-getting-started-connecting
  • redcapAPI-missing-data-detection
  • redcapAPI-best-practices
  • redcapAPI-offline-connection

Back Matter

NOTE: Ownership transfer of this package to VUMC Biostatistics is complete.

The research community owes a big thanks to Benjamin Nutter for his years of service keeping this package current.

This package was originally created by Jeffrey Horner.

The current package was developed under REDCap Version 14+. Institutions can be a little behind on updating REDCap and so some features of the API may not always work.

Issue Review Process for Pull Requests

Goals:

  • Reproducibility
  • Test Driven
  • Robust checking of user inputs

Rules:

  • Hotfixes or documentation changes can skip this process.
  • The majority author on a pull request must not be the approving reviewer.
  • Each commit should include the issue number as a link.
  • The approving reviewer should check the following:
    • Visual review of code in pull request.
    • Was the NEWS updated?
    • Were tests written?
    • Were user inputs checked?
    • Does the VERSION need to be bumped?
    • Was documentation properly updated?
    • Was roxygen2 run on the updated documentation?
    • Does R CMD CHECK pass? (reviewer should run)
    • Does the test suite pass with no warnings? (reviewer should run)

License

redcapAPI A rich API client for interfacing REDCap to R

Copyright (C) 2012 Jeffrey Horner, Vanderbilt University Medical Center

Copyright (C) 2013-2022 Benjamin Nutter

Copyright (C) 2023-2024 Benjamin Nutter, Shawn Garbett, Vanderbilt University Medical Center

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.

redcapapi's People

Contributors

brianhigh, couthcommander, graywh, jeffreyhorner, johnson-bradley, jubilee2, marcuslehr, niknakk, nutterb, obregos, paddytobias, pbchase, sophiajia, spgarbet, stevelane, tobadia, wibeasley

redcapapi's Issues

Develop an importMetaData function

Provide the following features

  • column names match the API export metadata columns
  • enforce valid field names are lowercase letters and underscore characters
  • enforce valid form names are lowercase letters and underscore characters
  • enforce valid field types. Needs to be configurable in case new field types are added.
  • enforce select_choices format. Note, format is different for multiple choice and sliders
  • enforce select_choices only for (dropdown, radio, checkbox, sql, slider)
  • enforce text_validation type. Needs to be configurable. Different institutions may have different options
  • all elements between brackets in branching logic need to be either a unique_event_name or field_name

A separate utility to map the UI download column names to the API download column names could be useful
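
A few of the checks listed above could look like the following base R sketch. `check_meta_data` is a hypothetical helper, the default `field_types` vector is an illustrative subset rather than REDCap's full list, and the name pattern follows the wording of this issue (lowercase letters and underscores only).

```r
check_meta_data <- function(data,
                            field_types = c("text", "radio", "dropdown",
                                            "checkbox", "sql", "slider")) {
  name_ok  <- function(x) grepl("^[a-z_]+$", x)
  problems <- character(0)

  bad <- data$field_name[!name_ok(data$field_name)]
  if (length(bad)) problems <- c(problems, paste("invalid field_name:", bad))

  bad <- unique(data$form_name[!name_ok(data$form_name)])
  if (length(bad)) problems <- c(problems, paste("invalid form_name:", bad))

  # field_types is an argument so new types can be allowed without code changes.
  bad <- data$field_name[!data$field_type %in% field_types]
  if (length(bad)) problems <- c(problems, paste("unknown field_type for:", bad))

  problems
}

meta <- data.frame(
  field_name = c("record_id", "Age", "visit_date"),
  form_name  = c("demographics", "demographics", "visit"),
  field_type = c("text", "text", "date"),
  stringsAsFactors = FALSE
)
problems <- check_meta_data(meta)
```

Collecting all problems into one vector, instead of stopping at the first, lets a caller report every issue in a data dictionary at once.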

function call

importMetaData <- function(rcon, 
                           data, 
                           ..., 
                           field_types = REDCAP_FIELD_TYPES, # exported constant
                           validation_types = REDCAP_METADATA_VALIDATION_TYPE, # exported constant
                           config = list(), 
                           api_param = list()){...}

importRecords checkbox level misimport of '0'

(From Lynne) Here’s a bug in redcapAPI importRecords where checkbox levels with a coded value of 0 all get set to 1 – in other words, if I have myvar___0 = 0 , it gets imported as myvar___0 = 1

I think it’s here: https://github.com/vubiostat/redcapAPI/blob/main/R/validateImport_methods.R#L418

I think checkChoice is going to give you the coded values for all the levels in the variable – which, if 0 is among the levels, is going to include 0.

Then, when it sets:

x[x %in% checkChoice] <- 1

it’s going to set these universally to 1.

I guess the idea behind it is more like, if someone has myvar___2 = 2, validate_import_checkbox is making the assumption the intent was to “check” option 2.

Which is fine, except in the case of myvar___0 = 0
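
The reported behaviour can be reproduced outside the package with a few lines of base R. The guard shown afterwards is only one possible rule, stated here as an assumption and not as the package's fix; `recode_checkbox` and `own_code` are hypothetical names.

```r
checkChoice <- c("0", "1", "2")   # coded values of a hypothetical checkbox field
x <- c("0", "1", "0")             # values supplied for column myvar___0

# Behaviour described in the report: any value matching a code becomes 1.
reported <- x
reported[reported %in% checkChoice] <- 1

# One possible guard (an assumption): "0" stays unchecked, and a value counts
# as checked only if it is "1" or equals this column's own non-zero code.
recode_checkbox <- function(x, own_code) {
  checked <- x == "1" | (own_code != "0" & x == own_code)
  ifelse(checked, "1", "0")
}

guarded_0 <- recode_checkbox(x, own_code = "0")
guarded_2 <- recode_checkbox(c("2", "0", "1"), own_code = "2")
```

With the unguarded assignment every element of `reported` becomes "1". Under the guard, myvar___0 = 0 stays unchecked while myvar___2 = 2 is still read as checked.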

Type Conversion Report

As a side artifact of fixing #10 and #14 it should be feasible to produce a very clean report in Markdown for the user of any failed type conversions. How to return this report is an open question. Maybe use the "redcap_error_handler"?

Proposed format of returned report:

## REDCap Project XYZ Type Conversion Failures

YYYY-MM-DD
REDCap version: x.y.z
redcapAPI version: x.y.z

* Field name 1 (type)
  * row X1, record_id Y1, value 'Z1'
  * row X2, record_id Y2, value 'Z2'
* Field name 2 (type)
  * row X3, record_id Y3, value 'Z3'
  * ...
  * row Xn, record_id Yn, value 'Zn'
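
The proposed layout could be produced from a table of failures along these lines. This is a sketch under assumptions: `type_conversion_report`, the shape of the `failures` data.frame, and the version strings are all made up for illustration.

```r
# Hypothetical input: one row per failed conversion.
failures <- data.frame(
  field     = c("age", "age", "visit_date"),
  type      = c("integer", "integer", "date"),
  row       = c(3, 8, 5),
  record_id = c("103", "108", "105"),
  value     = c("xyz", "4o", "2023-13-01"),
  stringsAsFactors = FALSE
)

type_conversion_report <- function(failures, project, redcap_version,
                                   pkg_version, date = Sys.Date()) {
  header <- c(
    sprintf("## REDCap Project %s Type Conversion Failures", project),
    "",
    format(date, "%Y-%m-%d"),
    sprintf("REDCap version: %s", redcap_version),
    sprintf("redcapAPI version: %s", pkg_version),
    ""
  )
  # One top-level bullet per field, one nested bullet per failing value.
  body <- unlist(lapply(split(failures, failures$field), function(f) {
    c(sprintf("* %s (%s)", f$field[1], f$type[1]),
      sprintf("  * row %s, record_id %s, value '%s'", f$row, f$record_id, f$value))
  }), use.names = FALSE)
  c(header, body)
}

report <- type_conversion_report(failures, "XYZ", "14.0.0", "2.7.0",
                                 date = as.Date("2024-01-31"))
cat(report, sep = "\n")
```

Returning a character vector of lines leaves the open question of delivery (console, file, or attribute on the result) to the caller.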

Better warning messages

A user has reported the following warning messages and they would like to know the field names that are causing them.

Warning messages:
1: In unpaste(times, sep = fmt$sep, fnames = fmt$periods, nfields = 3) :
  wrong number of fields in entry(ies) 4298, 124804
2: In unpaste(times, sep = fmt$sep, fnames = fmt$periods, nfields = 3) :
  wrong number of fields in entry(ies) 4475, 9609, 18554
3: In unpaste(times, sep = fmt$sep, fnames = fmt$periods, nfields = 3) :
  wrong number of fields in entry(ies) 23167, 23907, 24892, 29784, 70058, 70915, 88322, 90332, 95099, 102932
4: In unpaste(times, sep = fmt$sep, fnames = fmt$periods, nfields = 3) :
  13 entries set to NA due to wrong number of fields
5: In unpaste(times, sep = fmt$sep, fnames = fmt$periods, nfields = 3) :
  19 entries set to NA due to wrong number of fields
6: In unpaste(times, sep = fmt$sep, fnames = fmt$periods, nfields = 3) :
  wrong number of fields in entry(ies) 23927, 23928, 70889, 135252, 135255
7: In unpaste(times, sep = fmt$sep, fnames = fmt$periods, nfields = 3) :
  wrong number of fields in entry(ies) 9865, 17464, 23926, 24910, 24911, 27861, 70889, 135255
8: In (function (nm, lab)  : Missing field for suffix arm_results
9: In (function (nm, lab)  : Missing field for suffix arm_results_esp
10: In (function (nm, lab)  : Missing field for suffix send_results___1

We need test cases added that create these warnings on export. Then capture the warnings somehow and modify with the field name causing them.
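
One way to capture a warning and re-issue it with the field name is `withCallingHandlers` combined with the `muffleWarning` restart. The sketch below uses a hypothetical `cast_time` function that mimics the "wrong number of fields" warning; it is not the code path that produced the messages above.

```r
# Hypothetical per-field cast that warns on malformed input.
cast_time <- function(x) {
  bad <- lengths(strsplit(x, ":", fixed = TRUE)) != 3
  if (any(bad)) warning("wrong number of fields in entry(ies) ",
                        paste(which(bad), collapse = ", "))
  x[bad] <- NA
  x
}

cast_with_field_names <- function(data, cast) {
  for (field in names(data)) {
    data[[field]] <- withCallingHandlers(
      cast(data[[field]]),
      warning = function(w) {
        # Re-issue the warning with the field name, then silence the original.
        warning(sprintf("field '%s': %s", field, conditionMessage(w)), call. = FALSE)
        invokeRestart("muffleWarning")
      }
    )
  }
  data
}

# Collect the rewritten warnings instead of printing them.
times    <- data.frame(start_time = c("10:15:00", "10:15"), stringsAsFactors = FALSE)
messages <- character(0)
cleaned  <- withCallingHandlers(
  cast_with_field_names(times, cast_time),
  warning = function(w) {
    messages <<- c(messages, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
```

The same outer handler pattern is what a test case could use to assert that each warning names the field responsible.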

guessCast [post process records]

Need a post processing function that has a "threshold" parameter and takes an na function, a validation function, and a cast function.

Design

  1. Only examine "text" fields
  2. Determine the subset that is not NA and validates.
  3. If the subset ratio is equal to or higher than the threshold it will run the cast function and update the "invalid" attr with the validation failures.

Bonus function: guessDate which just has the normal defaults for date fields already specified.
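
A minimal sketch of this design for a single character vector follows. `guess_cast` is a hypothetical name, the defaults are date-oriented purely for illustration, and the ratio is assumed to mean "validated values divided by non-missing values", which the design above does not spell out.

```r
guess_cast <- function(x, threshold = 0.8,
                       na = function(x) is.na(x) | x == "",
                       validation = function(x) grepl("^[0-9]{4}-[0-9]{2}-[0-9]{2}$", x),
                       cast = function(x) as.Date(x, format = "%Y-%m-%d")) {
  missing <- na(x)
  valid   <- !missing & validation(x)
  # Assumed definition: share of non-missing values that pass validation.
  ratio <- if (any(!missing)) sum(valid) / sum(!missing) else 0
  if (ratio < threshold) return(x)              # leave the text field untouched
  out <- cast(ifelse(valid, x, NA_character_))
  attr(out, "invalid") <- which(!missing & !valid)
  out
}

mostly_dates <- c("2023-01-05", "2023-02-10", "", "n/a",
                  "2023-03-01", "2023-04-02", "2023-05-03")
dates     <- guess_cast(mostly_dates)
untouched <- guess_cast(c("a", "b", "2023-01-01"))
```

In the first call five of six non-missing values validate, so the vector is cast and the position of "n/a" is recorded. In the second call the ratio is below the threshold and the text comes back unchanged.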

Transfer all API methods to use makeApiCall

Whereas we know that REDCap is more strictly enforcing how arrays must be sent to the API and now that makeApiCall has the retry functionality built in, all API methods should make use of makeApiCall to make a more consistent and reliable experience for the user.

Known functions that need to be updated

  • deleteArms
  • deleteRecords
  • exportEvents
  • exportInstruments
  • exportMappings
  • exportNextRecordName
  • exportPdf
  • exportProjectInformation
  • exportReports
  • exportSurveyParticipants
  • exportVersion
  • importArms
  • importRecords

Error when importing records: variable doesn't exist in data dictionary

Hi,

I receive an error when I try to import using the redcap API library. The error is:

Error in importRecords(rcon, import_yac105, overwriteBehavior = c("normal"), :
1 assertions failed:

  • The variables ppt_status_rem___3 do not exist in the REDCap Data Dictionary

This variable does exist in my data dictionary though. Any suggestions or is there more information I can provide?

Best,
Manny Hurtado
Data Manager
Northwestern University - Feinberg School of Medicine

importFiles event argument error

With records to process, I’m back to import, and get this error:

Error in if (!event %in% events_list$unique_event_name) coll$push(paste0("'", : argument is of length zero

This is on importFiles.

The project doesn’t have longitudinal events, so didn’t specify that argument, which is why argument is of length 0.

Do I need event = NULL?
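
The message can be reproduced in base R: `NULL %in% events` evaluates to `logical(0)`, which `if` cannot test. The guard below is one possible fix, offered as an assumption; `check_event` is a hypothetical helper.

```r
events <- c("baseline_arm_1", "followup_arm_1")
event  <- NULL                      # argument left unspecified

# The unguarded test fails because the condition has length zero.
unguarded <- tryCatch(
  if (!event %in% events) "unknown event" else "ok",
  error = function(e) conditionMessage(e)
)

# One possible guard (an assumption about the fix): only test when supplied.
check_event <- function(event, events) {
  if (!is.null(event) && !event %in% events) "unknown event" else "ok"
}
guarded <- check_event(NULL, events)
```

`unguarded` ends up holding the error message instead of a result, while `check_event` treats a missing event as acceptable for non-longitudinal projects.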

drop redcapFactor usage on export methods

Users are reporting that they strip redcapFactor from output columns.

Consider all export methods.

Question: does redcapFactor offer any benefit to the end user? If not can we just strip it altogether?

Internal ref: BSTATGEN-1096

Currently used code:

if(rmrcl)
    for(v in n) {
      x <- d[[v]]
      if(inherits(x, 'redcapFactor')) {
        class(x) <- setdiff(class(x), 'redcapFactor')
        attr(x, 'redcapLabels') <- attr(x, 'redcapLevels') <- NULL
        d[, (v) := x]
        }
    } 

Post-Process Function Design [post process records]

This ticket is to capture discussion about the design of post-processing functions. There are some tricky design considerations about downstream flow of information. This is tied to #44, #82, #87, #88.

DECISION POINT: Is metadata/raw attached as an attribute to frame for post processing or is the rcon passed?
SETTLED: The connection object will be passed as a parameter to post process functions (if needed).

I think issue #82 discussion is the most instrumental in reaching the proper decision.

I've had this idea of the next step and it seems a lot of us are working on different pieces of this, so I wanted to bring the interface vision together. As usual this is just a working proposal and by no means decided.

Post-process round up

Pseudocode of interfaces

Recs <- exportRecordsTyped(rcon)  |>  # Retrieve the records
        widerRecords(rcon, "med") |>  # Wide form repeated instrument "med"
        guessDates()              |>  # Guess date columns and convert
        recastRecords(rcon, 1:10) |>  # Flip the factors on columns 1:10
        mchoice(rcon)                # Add the mChoice columns for Hmisc

Questions:

Should the meta data be attached to the Recs so the rcon doesn't need to be specified? Downside: this could be lost in later processing and potentially confuse a user. Each of these needs the modified meta data, i.e. field_types and codebook. I don't fully understand the connection between Benjamin's comments in the ticket on flipFactor and how this would work in this context. These may modify the codebook or change the layout of fields. Thus, the associated metadata becomes increasingly misaligned with the data.frame. It's almost like we need meta-data that is only valid for the current data.frame and that, in general, diverges from the original with each pass. This indicates we will have to attach it as an attribute and it will have to be a custom object (non-exported).

I really like the fact that exportRecordsTyped has a defined purpose and boundary, to retrieve the records in a typed format. The post process pieces are all modifiers to the resulting data.frame. They will each take care to include the attributes of the data.frame forward.

Thus, it changes to something like this:

Post-process round up (without rcon)

Pseudocode of interfaces

Recs <- exportRecordsTyped(rcon) |>  # Retrieve the records
        widerRecords("med")      |>  # Wide form repeated instrument "med"
        guessDates()             |>  # Guess date columns and convert
        recastRecords(1:10)      |>  # Flip the factors on columns 1:10
        mchoice()                    # Add the mChoice columns for Hmisc

str(attr(Recs, "metadata"))
List of 2
$ field_types: char "text" "text" "factor" ...
$ codebook: char NA NA "1, Yes | 2, No" ...

unlockREDCap function

Need: a crypto-locker function that builds connections from a crypto-locker storing API_KEYS.

95% of the code exists in the rccola package. Need it ported and working.

Example:

library(redcapAPI)

# Cuts down on password requests on MAC
options(keyring_backend=keyring::backend_file)

unlockREDCap(c(test_conn    = 'TestRedcapAPI',
               sandbox_conn = 'SandboxAPI'),
             keyring      = 'MyKeyring',
             url          = 'https://redcap.vanderbilt.edu/api/')

exportRecordsTyped crash on calc field

exportRecordsTyped crashes on a calc field.

exportRecordsTyped(rcon)
Error in exportRecordsTyped(rcon) : 1 assertions failed:
 * 'total_measures' choice string does not appear to be formatted for choices.

Vignette on 2.7.0 enhancements / best practices

Need a vignette on best practices for biostatisticians.

Possible outline:

  1. Introduction and context of changes and upgrades. Encourage participation and engagement.
  2. API_KEY security practices, get connection from crypto-locker. Possibly make rccola compatible.
  3. exportRecordsTyped with examples
    • Highlight of new validation failure report
  4. post-processing enhancements
    • recastRecords
    • guessCast
    • guessDate
    • mChoiceCast
    • widenRepeating
    • dropRepeatingNA

importRecords compatibility with exportRecords

This is just an open question. With exportRecordsTyped near complete, what can be done to make it compatible with importRecords? Is anything required? One issue I can see is the need to drop the mChoice variables.

missingSummary Function (Potential Problem with exportRecords?)

Hi there -- so appreciate this library! Hoping this is the right place to ask this:

At first glance, the missingSummary ancillary function for redcapAPI works perfectly. However, when cross-checking true missingness with REDCap open, there are a number of branching logic values where conditions are met, missingness still occurs, but the code doesn't report the missing value. Is it possible this is related to one of the options in the exportRecords function being the wrong value or perhaps the function isn't pulling in all available data?

Here's a link to the missingSummary gist: https://gist.github.com/nutterb/501c370418abb58bee78

Support for units

If the field_annotation matches the regex "units\s+=\s+"([^\)]+)" use the captured group to attach as an attribute "units" for that field.
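
A base R sketch of the idea follows. The pattern used here is a simplified variant assumed for illustration (optional whitespace around `=`, value in double quotes) and is not the exact expression quoted above; `extract_units` is a hypothetical helper.

```r
extract_units <- function(annotation) {
  # Assumed pattern: units = "<value>", with optional whitespace around "=".
  pattern <- "units[[:space:]]*=[[:space:]]*\"([^\"]+)\""
  m <- regmatches(annotation, regexec(pattern, annotation))
  # Each match holds the full match plus the captured group, or nothing.
  vapply(m, function(g) if (length(g) == 2) g[2] else NA_character_, character(1))
}

annotations <- c("units = \"mg/dL\"", "@HIDDEN")
units <- extract_units(annotations)

# Attach the captured value to a field as a "units" attribute.
weight <- c(70.5, 82.1)
attr(weight, "units") <- extract_units("units=\"kg\"")
```

Fields whose annotation has no units entry simply get NA, so the attribute can be skipped for them.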

export Reports not working

In a project where a report has an ID of '1' and name of '1' the following code does not export the report:

report <- exportReports(rcon, 1)

Instead it returns error message:

Error in redcap_error(x, error_handling) : 
  403: ERROR: The API request cannot complete because report_id=1 does not belong to this project.

In the API Playground the report is able to be exported correctly.

Delete Records

User has reported that delete records does not work. Need to investigate.

Remove tidyr dependency

exportUsers makes use of tidyr::spread. This could be accomplished with reshape and the tidyr dependency removed.
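
As an illustration of the replacement, base `reshape` can produce the same long-to-wide result that `tidyr::spread` gives. The data and column names below are hypothetical and are not the actual structure used inside exportUsers.

```r
# Hypothetical long-format form access rights, one row per user and form.
long <- data.frame(
  username = c("alice", "alice", "bob", "bob"),
  form     = c("demographics", "labs", "demographics", "labs"),
  access   = c(1, 0, 1, 1),
  stringsAsFactors = FALSE
)

# Base R equivalent of spreading `form` into columns filled with `access`.
wide <- reshape(long, idvar = "username", timevar = "form",
                v.names = "access", direction = "wide")

# reshape() prefixes the new columns with "access."; drop that prefix.
names(wide) <- sub("^access\\.", "", names(wide))
rownames(wide) <- NULL
```

The only cosmetic differences from `spread` are the column prefix and the row names, both of which are cleaned up in the last two lines.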

exportBundle bug

A user reported that exportBundle doesn't work for any project. They said it always fails with 'replacement has zero rows.'

redcapAPI currently does not pull factor labels for REDCap fields of "sql" type

For REDCap fields that are of "sql" type, it would be helpful if the metadata could include the labels associated with the numerical values in the corresponding SQL database. For example, see the variable med01_name in the tibble below:

Screenshot 2023-03-21 at 12 57 25 PM

The corresponding metadata is as follows (focus on the select_choices_or_calculations column):

image

redcapAPI references the corresponding database's project ID and field name, but does not pull in the levels and labels as it would if it were, for example, a dropdown field (e.g, "1, Yes | 0, No | -8888, Prefer not to answer").

Strip HTML from labels on export methods that report labels.

Idea > add filterLabel argument to exportRecords function. default is function stripHTML (code below)

Consider all export methods in package.

Internal ref: BSTATGEN-1095

Current html stripping code in use by user:

if(rmhtml) {
    trans <- function(x) {
      rem <- c('<p>', '</p>', '</div>', '</span>', '<p .*?>', '<div .*?>', '<span .*?>',
               '<br>', '<br />', '\\n')
      for(a in rem) x <- gsub(a, '', x)
      x
      }
    for(v in names(d)) {
      lab <- attr(d[[v]], 'label')
      if(length(lab)) setattr(d[[v]], 'label', trans(lab))
    }
  } 
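
A more general default for the proposed filterLabel argument could strip anything that looks like a tag, not just a fixed list. This is a sketch, not the package's stripHTML: `strip_html` is a hypothetical name, and a regular expression like this is only a rough filter, not an HTML parser.

```r
strip_html <- function(x) {
  x <- gsub("</?[A-Za-z][^>]*>", "", x)   # drop anything shaped like a tag
  x <- gsub("[\r\n]+", " ", x)            # collapse line breaks
  trimws(gsub(" {2,}", " ", x))           # tidy leftover whitespace
}

labels <- c("<p>Date of <b>birth</b></p>",
            "<div class=\"rich\">Weight<br />(kg)</div>",
            "Plain label",
            "Age < 18")
clean <- strip_html(labels)
```

Requiring a letter right after `<` keeps labels that use a bare less-than sign, such as "Age < 18", intact.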

exportRecordsTyped fails when retrieving only one record

library(redcapAPI)

rcon <- redcapConnection(url = url,
                         token = API_KEY)

exportRecordsTyped(rcon,
                   records = 1:2)

exportRecordsTyped(rcon,
                   records = 1)

Error in .exportRecordsTyped_getNas(na = na, field_types = field_types, :
  User supplied na method for [ ] not returning vector of logical of correct length

Failing Tests

With the flurry of merges going into 2.7.0 and the reorganization some tests are breaking.

Error (test-06-fileRepo-functionalityR.R:14:1): (code run outside of `test_that()`)
Error in `devtools::test()`: 1 assertions failed:
 * Variable 'doc_id': Must have length 1, but has length 0.
Backtrace:
 1. redcapAPI::exportFromFileRepository(...)
      at test-06-fileRepo-functionalityR.R:14:0
 2. redcapAPI:::exportFromFileRepository.redcapApiConnection(...)
      at redcapAPI/R/exportFromFileRepository.R:37:2
 3. checkmate::reportAssertions(coll)
      at redcapAPI/R/exportFromFileRepository.R:85:2
...
✖ | 2 2     8 | reconstituteFileFromExport.R [2.2s]   
──────────────────────────────────────────────────────
Warning (test-reconstituteFileFromExport.R:24:5): File is saved to the directory
cannot open file '/tmp/Rtmp8CjWmi/text/csv; charset=utf-8': No such file or directory
Backtrace:
 1. redcapAPI::reconstituteFileFromExport(RESPONSE, dir = EXISTENT_DIR)
      at test-reconstituteFileFromExport.R:24:4
 2. base::writeBin(...)
      at redcapAPI/R/reconstituteFileFromExport.R:87:2
 3. base::file(con, "wb")

Error (test-reconstituteFileFromExport.R:24:5): File is saved to the directory
Error in `file(con, "wb")`: cannot open the connection
Backtrace:
 1. redcapAPI::reconstituteFileFromExport(RESPONSE, dir = EXISTENT_DIR)
      at test-reconstituteFileFromExport.R:24:4
 2. base::writeBin(...)
      at redcapAPI/R/reconstituteFileFromExport.R:87:2
 3. base::file(con, "wb")

Warning (test-reconstituteFileFromExport.R:41:5): New directory is created and file is saved to that directory
cannot open file '/tmp/Rtmp8CjWmi/NotAFolderYet/text/csv; charset=utf-8': No such file or directory
Backtrace:
 1. redcapAPI::reconstituteFileFromExport(...)
      at test-reconstituteFileFromExport.R:41:4
 2. base::writeBin(...)
      at redcapAPI/R/reconstituteFileFromExport.R:87:2
 3. base::file(con, "wb")

Error (test-reconstituteFileFromExport.R:41:5): New directory is created and file is saved to that directory
Error in `file(con, "wb")`: cannot open the connection
Backtrace:
 1. redcapAPI::reconstituteFileFromExport(...)
      at test-reconstituteFileFromExport.R:41:4
 2. base::writeBin(...)
      at redcapAPI/R/reconstituteFileFromExport.R:87:2
 3. base::file(con, "wb")

Add 'drop' argument to exportRecords

Add a 'drop' argument to exportRecords and to any other export method where it makes sense.

Purpose: blinded users would like to be able to access REDCap data without risk of seeing blinded fields.

Internal ticket ref BSTATGEN-1094

Remove stringr dependency

stringr::str_split_fixed is only used in validate_import_select_dropdown_radio and validate_import_checkbox. With a little care, it could be replaced with strsplit and this dependency could be removed.

Note that str_split_fixed has a feature that allows a string to be split into at most $n$ parts. We'll want to be careful with strsplit, as it would break a code-label mapping of "1, Tire, bicycle" into three parts, even though we want to map "1" to "Tire, bicycle."
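
One base R way to get the "at most two parts" behaviour is to locate only the first comma with `regexpr` and cut there. `split_choice` is a hypothetical helper, and the sketch assumes every choice contains at least one comma.

```r
split_choice <- function(x) {
  pos   <- regexpr(",", x, fixed = TRUE)     # position of the first comma only
  code  <- trimws(substr(x, 1, pos - 1))
  label <- trimws(substr(x, pos + 1, nchar(x)))
  cbind(code = code, label = label)
}

choices <- strsplit("1, Tire, bicycle | 2, Wheel", "|", fixed = TRUE)[[1]]
mapping <- split_choice(choices)
```

Commas inside the label survive because only the first one is treated as the separator.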

checkbox_suffix needs no version argument

The function has the version argument, but it never gets used in the function. We'll want to check where this gets called and make sure the functions calling it aren't passing it the version. exportRecords and importRecords are the most likely to call this. Maybe exportReports.

Also, https://github.com/vubiostat/redcapAPI/blob/main/R/checkbox_suffixes.R#L39 has a very impressively esoteric regular expression. While it's a cool one-liner, we might be better served with something us mere mortals can understand. I propose

    opts <- meta_data$select_choices_or_calculations[meta_data$field_name %in% x]
    opts <- strsplit(opts, "[|]")
    opts <- unlist(opts)
    opts <- sub(",.+$", "", opts)
    opts <- trimws(opts)
    opts <- tolower(opts)

Remove apiCall and genericApiCall

As far as I can tell, these two functions are never called. They both look like misfired attempts to simplify assembling API calls. Their purpose is fulfilled with makeApiCall (introduced in #21 ). These can likely be deleted so long as the existing tests pass.

exportRecords fields parameter with a single value not working

These scripts were working as recently as 12:11 p.m. today, then suddenly started generating this error.

It’s happening with exportRecords, specifically the following:

Here’s how it is in my script, because I haven’t gone back to fix all these just to specify the arguments to exportRecords directly in drinkREDCap:


find_map_record_id <- function(
    key,
    ...) {
  redcapAPI::exportRecords(
    rcon = redcapConnection(url = rcon.api, token = key),
    fields = "record_id")
}

rccola::drinkREDCap(
  variables = "fhp_map",
  FUN = find_map_record_id)

I’ve also run this, which gives the same error:

rccola::drinkREDCap(
  variables = "fhp_map",
  fields = "record_id")

The error is:

400: ERROR: The "fields" parameter has been provided with a value, but the value is not an array. The "fields" parameter must be provided as an array of one or more values.
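
For context on the message: the REDCap API expects multi-valued parameters such as `fields` as an array, which in a form-encoded request body means keys like `fields[0]`, `fields[1]`. The sketch below shows one way to expand an R character vector into that shape, including the single-value case. `as_api_array` is a hypothetical helper written for this illustration; it is not the package's actual code and makes no claim about where the bug lies.

```r
# Hypothetical helper (not redcapAPI's implementation): expand a character
# vector into array-style body entries, e.g. fields[0], fields[1], ...
as_api_array <- function(name, values) {
  keys <- sprintf("%s[%d]", name, seq_along(values) - 1L)  # zero-based index
  stats::setNames(as.list(values), keys)
}

# A single value still becomes a one-element array
single  <- as_api_array("fields", "record_id")
several <- as_api_array("fields", c("record_id", "age"))
str(single)
str(several)
```

Sending `fields = "record_id"` as a bare scalar is the shape the server rejects in the message above; the array form keeps the key indexed even when there is only one value.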

Server and locally, same error. Not sure what package versions Will is running on the server; I’ve got:

packageVersion("rccola")
[1] ‘1.0.4’

packageVersion("redcapAPI")
[1] ‘2.4.0’

I’m not sure whether that is up to date with what you’ve got on GitHub.

Gonna write the API call directly, and see if that works, but prolly will take a minute, so any insight you may have in the meantime, much appreciated.

Strategies for REDCap timeouts

We need a couple of enhancements to deal with REDCap timeouts.

  1. A retry flag on methods. This would be the number of retries allowed for a call. The waits between calls should follow a geometric progression, e.g. 3 seconds, 9 seconds, 27 seconds, 1.4 minutes, 4 minutes, 12 minutes ... Maybe the progression value is taken from options.
  2. It might be possible to also expose socket timeout values. I don't know how feasible this is.
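
A minimal sketch of the retry idea, assuming the waits are powers of a base of 3 seconds (which lines up with the longer waits listed above: 27 seconds, about 1.4 minutes, about 4 minutes, about 12 minutes). The function names and the option name `redcap_retry_base` are hypothetical, chosen only for this illustration.

```r
# Wait times in seconds: base^1, base^2, ..., base^retries
retry_delays <- function(retries, base = getOption("redcap_retry_base", 3)) {
  base ^ seq_len(retries)
}

# Hypothetical wrapper: call `fun` until it succeeds or the retries run out.
# `sleep` is injectable so the logic can be exercised without really waiting.
with_retry <- function(fun, retries = 5,
                       base = getOption("redcap_retry_base", 3),
                       sleep = Sys.sleep) {
  delays <- retry_delays(retries, base)
  for (attempt in seq_len(retries + 1)) {
    result <- tryCatch(fun(), error = function(e) e)
    if (!inherits(result, "error")) return(result)  # success
    if (attempt > retries) stop(result)             # out of retries: re-signal
    sleep(delays[attempt])                          # back off, then try again
  }
}

print(retry_delays(6))
```

With `retries = 6` and the default base, the delays are 3, 9, 27, 81, 243 and 729 seconds. Socket-level timeouts (item 2) are a separate concern and are not covered by this sketch.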

Heisen test

This fails for me about 1/4 of the runs:

Failure (test-08-logging.R:159:5): Logs are returned for an existing record
`all_record` is not TRUE

`actual`:   FALSE
`expected`: TRUE 

Develop importMappings method.

This method permits instruments to be mapped to events and would permit additional structural management in project design through the API.

Hmisc type conversion for multiple choice

If Hmisc is available, i.e. requireNamespace("Hmisc", quietly = TRUE) returns TRUE, then convert multiple choice questions to the mChoice class.

This would make Hmisc a Suggests: rather than a Depends:

## Find all variable names that are part of multiple choice sequences
## These names end in ___x with x being an integer
i <- grep('^.*___[0-9][0-9]*[0-9]*$', n)
i <- grep('^.*___.*$', n)
if(length(i)) {
      n <- n[i]

      basename <- sub('___[0-9][0-9]*[0-9]*$', '', n)
      basename <- sub('___.*', '', n)
      if(any(basename %in% names(d)))
        stop('base name for multiple choice variable has the same name as a non-multiple choice variable')

      for(v in unique(basename)) {
        V <- n[basename == v]
        numbers    <- sub(paste0('^', v, '___'), '', V)
        if(! all.is.numeric(numbers)) next
        numbers    <- as.integer(numbers)
        numchoices <- length(numbers)
        first <- paste0(v, '___', min(numbers))
        d[, (v) := do.call('mChoice', c(.SD, ...)), .SDcols=V]
        setattr(d[[v]], 'label', label(d[[first]]))
        d[, (V) := NULL]
         cred <- rbind(cred,
                      data.frame(name=v,
                                 description='variables combined into mChoice variable',
                                 details=paste(numchoices, 'original variables')))
      }
}

exportRecords without factors leads to bugged redcapFactors

tobadia wrote:

Hi @nutterb,
In a recent experiment I realized that disabling the factors flag in exportRecords results in some unexpected warnings when attributing labels. This leads to NA's appearing during data processing, with a bunch of:

In print.factor(x) : factor levels must be "character"

or

In class(x) <- class(x)[!class(x) %in% c("labelled", "redcapFactor")] :
  NAs introduced by coercion

This seems to happen rather late during the export, around here in makeRedcapFactor(). Before assigning that class, just printing the content of x returns its untouched value. After the class is assigned, it starts failing, with R somehow mistaking it for an actual factor?

This puzzles me a bit because there's no reason for R to interpret x as a "factor" here.

edit: This may actually be caused by the call to print.factor() in the print.redcapFactor() function. When x is not a factor, obviously print.factor() fails with a warning.

The field choice string does not appear to be formatted for choices 2.6.1 failure

User reported error:

1 assertions failed:

  • The field choice string does not appear to be formatted for choices.

There’s a good bit of HTML code in the label values in this project (and most / all of the projects built by these guys), so that seemed like a possible culprit here – but I excluded all of the vars with any HTML, and get same error.

> subset(dd, field_name %in% c("minor1", "minor2", "cover_ltr_status"), select = c("field_name", "field_type", "select_choices_or_calculations"))
           field_name field_type
3              minor1   checkbox
4              minor2   checkbox
7148 cover_ltr_status   checkbox
                                                                                                                                                                      select_choices_or_calculations
3    1, Check here if you are a <b>parent</b> interested in obtaining information about inherited cancer risk and genetic testing for your child (i.e., a child <b>younger than 18 years of age</b>)
4                                                                   1, Check here if you are referring the <b>parent of your minor patient</b> (i.e., a patient <b>younger than 18 years of age</b>)
7148                                                                                                                                            1, Cover letters generated (read-only; auto-updated)
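
All three fields in the output above define a single choice, so their strings contain no `|` separator. Whether that is what trips the assertion is not established in the report. As a sketch only, a format check that tolerates single-choice strings could look like the following; `looks_like_choices` is a hypothetical name and not the package's validation code.

```r
# Hypothetical check (not redcapAPI's assertion): a choice string is one or
# more "code, label" pairs separated by "|"; a single pair is allowed.
looks_like_choices <- function(x) {
  pairs <- trimws(strsplit(x, "|", fixed = TRUE)[[1]])
  # each pair needs a non-empty code, a comma, and a non-empty label
  length(pairs) > 0 && all(grepl("^[^,]+,.+$", pairs))
}

single <- "1, Cover letters generated (read-only; auto-updated)"
multi  <- "1, Peanut | 2, Walnut"
plain  <- "not a choice string"

print(c(looks_like_choices(single),
        looks_like_choices(multi),
        looks_like_choices(plain)))
```

The check only looks at the first comma of each pair, so commas, semicolons and HTML tags inside the label do not affect the result.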

Remove use of the bundle object in function calls

The ability to cache objects into the redcapConnection makes the bundle obsolete. The bundle arguments in function calls may be removed without impacting backward compatibility.

Note: exportBundle will not be removed under this issue, as we know it is still in use by some users (see Issue #43)

Known functions where the bundle may be passed as an argument are

  • allocationTable
  • deleteFiles
  • exportFieldNames
  • exportFiles
  • exportReports
  • exportUsers
  • importFiles
  • missingSummary
  • exportRecords
  • importRecords

Request for a function similar to redcapFactorFlip [post process records]

But not with redcapFactors.

Discussion with a user on nutterb#16 suggests this kind of functionality is in use. In that particular example, there were 47 variables to be recoded of different types. Managing those directly through exportRecordsTyped would be laborious.

At minimum, the function will need to receive the data to be recoded, the rcon object, and the name of the codebook to use from the meta data.

I like recodeData(data_frame, rcon, fields).

I'm open to suggestions for a better name.

Yes/No fields not properly cast in exportRecordsTyped

Just noticed this one. I'm guessing because the default cast for yesno is castLabel, but there's no coding to pass in the metadata?

library(redcapAPI)

rcon <- redcapConnection(url = url, 
                         token = API_KEY)


Records <- exportRecordsTyped(rcon)
Raw <- exportRecordsTyped(rcon, cast = raw_cast)

Records$prereq_yesno
 [1] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
[19] <NA> <NA> <NA> <NA> <NA> <NA>
attr(,"label")
[1] Pre-requisite as a yes/no
Levels: 

Raw$prereq_yesno
 [1] NA  NA  NA  "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "1" "1" "0" "0" "0" "0" "0" "0" "0" NA 
[24] NA 
attr(,"label")
[1] "Pre-requisite as a yes/no"
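
For comparison, the labelled result one would expect from the raw values can be sketched in base R. REDCap codes yes/no fields as 1 = Yes and 0 = No; the function name `cast_yesno` is hypothetical and this is not the package's cast function.

```r
# Hypothetical cast (not redcapAPI's castLabel): map the fixed yes/no
# coding onto a factor without needing a coding from the meta data.
cast_yesno <- function(x) {
  factor(x, levels = c("0", "1"), labels = c("No", "Yes"))
}

raw <- c(NA, "0", "0", "1", "1", NA)
labelled <- cast_yesno(raw)
print(labelled)
```

Missing values stay `NA`, and the factor carries both levels even if only one of them occurs in the data.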

Form fails to export with POSIXct error

A long running routine has recently stopped working when extracting data from REDCap. The code is unchanged, I am informed that the REDCap study is unchanged, although the data are always changing.

The following call results in the error.
Error in as.POSIXct.numeric(records[[i]], format = "%Y-%m-%d") :
'origin' must be supplied

dataset <- as.data.frame(exportRecords(
  rcon,
  factors = FALSE,
  forms = f[i],
  records = subjectIDs,
  events = NULL,
  labels = FALSE,
  dates = TRUE,
  survey = TRUE,
  dag = FALSE,
  checkboxLabels = FALSE,
  colClasses = NA,
  batch.size = -1,
  bundle = getOption("redcap_bundle"),
  error_handling = getOption("redcap_error_handling"),
  form_complete_auto = TRUE)
)

When the dates option is changed to FALSE, the error still shows, but the extract succeeds (and the resulting column is text).

I do see some date field values of -5 and -8 (study-assigned values). The date fields in REDCap have validation_type as "date_mdy". Any ideas whether this is a bug or a data error, or any workarounds?
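
One possible explanation, not confirmed in the report: if a date column holds only codes such as -5 or -8 in the exported subset, it may be read as numeric, and the conversion then dispatches to `as.POSIXct.numeric`, which in older R versions requires an `origin`. A defensive sketch is to convert to character before parsing, so values that are not dates become `NA` instead of raising an error. `parse_redcap_date` is a hypothetical helper, not part of redcapAPI.

```r
# Hypothetical helper: parse dates from character, so numeric codes such as
# -5 become NA rather than hitting the numeric method of as.POSIXct
parse_redcap_date <- function(x) {
  as.POSIXct(as.character(x), format = "%Y-%m-%d", tz = "UTC")
}

mixed <- c("2021-03-04", "-5", NA)   # a real date, a study code, a blank
codes <- c(-5, -8)                   # a column that was read as numeric

parsed_mixed <- parse_redcap_date(mixed)
parsed_codes <- parse_redcap_date(codes)
print(parsed_mixed)
print(parsed_codes)
```

This turns the study-assigned codes into missing values, which may or may not be what the study wants; keeping them would require storing the column as text.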

thanks!

Unusual datetime classing

Got this email from a user:

Using the R REDCap API on a form I'm looking at the variables hoendat and hoentim. I'm using the R drinkREDCap function from rccola.

For hoendat I'm getting a variable of R classes labelled, POSIXct, and POSIXt. I don't understand the POSIXt part and was expecting a pure date variable with no time to be of class Date.

For hoentim I'm getting class labelled and times with an additional attribute 'format' with value h:m:s. But the variable values are fractions (I assume fraction of a day). 

I'd appreciate getting any further information about why these are set up this way. The most important thing to know is whether the same setup is used for all dates and times in the whole database, and that this is automatic without the user having an opportunity to introduce undesired variability in how dates and times are stored in the project database. I can deal with any format but need to have consistency.
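
Some background that may help with the question, sketched in base R. `POSIXt` is the parent class that R attaches to every `POSIXct` and `POSIXlt` object, so it appears on any date-time value. The fractional values look like the `times` class from the chron package, which stores a time as a fraction of a day; that attribution is an assumption based on the description. The helper `fraction_to_hms` is hypothetical.

```r
# Every POSIXct value carries both classes; POSIXt is the shared parent
d <- as.POSIXct("2023-05-17", tz = "UTC")
print(class(d))

# A date-time at midnight can be reduced to a pure Date
as_date <- as.Date(d, tz = "UTC")
print(class(as_date))

# Hypothetical helper: turn a fraction of a day into an "HH:MM:SS" string
fraction_to_hms <- function(frac) {
  secs <- round(frac * 86400)   # 86400 seconds per day
  sprintf("%02d:%02d:%02d",
          secs %/% 3600, (secs %% 3600) %/% 60, secs %% 60)
}
print(fraction_to_hms(c(0.5, 0.75)))
```

A value of 0.5 corresponds to noon and 0.75 to 18:00:00.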

Parsing checkbox options with underscores in the coding

In doing some testing for #82 I ran into an interesting conundrum. I set up a field with the options

1, Peanut
b, Walnut
xyz, Cashew
-4, Almond

REDCap codes that last checkbox option to the field name "checkbox____4" (four underscores). It converts any non-alphanumeric characters to an underscore. This causes our usual regex for getting the field name base, sub("___.+$", "", field_names), to spit out c(1, b, xyz, 4), which is incorrect.

I've added a constant REGEX_CHECKBOX_FIELD_NAME <- "^(.*?)___(?!.*___)(.*)$" which allows usage

sub(REGEX_CHECKBOX_FIELD_NAME, "\\1", x, perl = TRUE) # to get the field name base
sub(REGEX_CHECKBOX_FIELD_NAME, "\\2", x, perl = TRUE) # to get the option

and this seems to handle everything I can throw at it with one exception. If the user codes an option with three underscores (x___y, X then Y), the exported field name becomes "checkbox___x___y" and the regex returns "checkbox___x___" and "y", respectively.
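
The behavior described here can be tried out with the field names from this issue. The lookahead in the constant needs `perl = TRUE`; the input vector is assembled from the examples above, and nothing else is assumed.

```r
REGEX_CHECKBOX_FIELD_NAME <- "^(.*?)___(?!.*___)(.*)$"

# Exported field names for the options 1, b, xyz, -4 and the awkward x___y
x <- c("checkbox___1", "checkbox___b", "checkbox___xyz",
       "checkbox____4", "checkbox___x___y")

# The negative lookahead is PCRE syntax, hence perl = TRUE
base   <- sub(REGEX_CHECKBOX_FIELD_NAME, "\\1", x, perl = TRUE)
option <- sub(REGEX_CHECKBOX_FIELD_NAME, "\\2", x, perl = TRUE)

print(data.frame(field = x, base = base, option = option))
```

The first four names resolve to the base "checkbox", with the -4 option coming back as "_4". The last name is split at its final triple underscore, so the base is no longer the plain field name.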

I don't think we can use regex to save the users from themselves in this case. The only ways I can think of to address this require use of the meta data, which we don't always have available in what is currently written (that could change, I'm just thinking path of least resistance).
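
If the meta data were available, one way to sidestep the ambiguity would be to match exported names against the known field names instead of guessing where the separator is. This is only a sketch of that idea; `split_with_meta` and its arguments are hypothetical and not part of redcapAPI.

```r
# Hypothetical helper: use known field names (e.g. from the data dictionary)
# to split exported checkbox names into base and option.
split_with_meta <- function(export_names, field_names) {
  # Try longer field names first so the most specific prefix wins
  field_names <- field_names[order(nchar(field_names), decreasing = TRUE)]
  base <- rep(NA_character_, length(export_names))
  for (f in field_names) {
    hit <- is.na(base) & startsWith(export_names, paste0(f, "___"))
    base[hit] <- f
  }
  # The option starts after the base plus the three-underscore separator
  option <- substring(export_names, nchar(base) + 4L)
  data.frame(name = export_names, base = base, option = option,
             stringsAsFactors = FALSE)
}

res <- split_with_meta(
  c("checkbox___x___y", "checkbox____4", "age"),
  field_names = c("checkbox", "age")
)
print(res)
```

Names that do not start with a known field plus the separator, such as "age" here, come back with `NA` for both base and option.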

Anyway, I don't know exactly what can, should, or we are willing to do about it, but I thought I'd throw it out there for some future consideration.
