philipdelff / NMdata
Prepare and document data for Nonmem - and automatically retrieve results
Home Page: https://philipdelff.github.io/NMdata/
License: Other
Hi @philipdelff,
When I run R CMD check on NMdata with the new version of data.table from GitHub master, I get the following new error, which is not present when using the data.table release version from CRAN:
* checking tests ...
Running 'spelling.R'
Running 'testthat.R'
ERROR
Running the tests in 'tests/testthat.R' failed.
Last 13 lines of output:
2. \-testthat::expect_known_value(..., update = update)
-- Failure ('test_NMscanInput.R:152'): CYCLE=DROP ------------------------------
`nm1` has changed from known value recorded in 'testReference/NMscanInput_7.rds'.
Component "input.colnames": Attributes: < Names: 1 string mismatch >
Component "input.colnames": Attributes: < Length mismatch: comparison on first 2 components >
Component "input.colnames": Attributes: < Component 2: Attributes: < target is NULL, current is list > >
Component "input.colnames": Attributes: < Component 2: Numeric: lengths (24, 0) differ >
Backtrace:
x
1. \-testthat::expect_equal_to_reference(nm1, fileRef, version = 2) at test_NMscanInput.R:152:4
2. \-testthat::expect_known_value(..., update = update)
[ FAIL 6 | WARN 0 | SKIP 0 | PASS 210 ]
Error: Test failures
Execution halted
Looking at your tests, it seems that you expect some computed value from your code to equal the value stored in an RDS file. The input.colnames table has a new attribute named "index", which is causing the error, since the stored RDS value has no such attribute.
Can you please update your tests and/or code so that this ERROR goes away, and then submit a new version to CRAN?
This will help facilitate the release of a new version of data.table to CRAN. (The data.table developers must make sure no reverse dependencies break before submitting a new version to CRAN.)
This package no longer passes its checks with the current tibble release candidate, perhaps triggered by tidyverse/tibble#1574 (now keeping attributes named "x" and "n" after new_tibble() and as_tibble()). Please see https://github.com/tidyverse/tibble/blob/main/revdep/problems.md#nmdata for details.
Should we be doing things differently? Can you please take a look and, if appropriate, submit an update to CRAN? Thanks!
This is somewhat related to #30 (in that it's about column naming in NMcheckData()).
I like how NMcheckData() does not require the default column names, as stated in #30; I often use names that are closer to the SDTM and ADaM source data names for easier tracking back to the origin.
With that, it appears that NMcheckData() supports column renaming for many, but not all, of the columns that are checked. The columns that do not appear to support other names, as far as I've found, are col.dv, col.mdv, and col.amt. col.evid also doesn't appear to exist, but that doesn't seem like a name I'd use something different for. (So, col.evid may be of interest for completeness, but it's not a real need for me.)
A simple workaround that I'm doing right now is renaming the columns as they go into NMcheckData(), which is not a significant hardship.
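The workaround can be sketched in base R. The column names CONC and MDVFL are illustrative (not from NMdata); dat.check is the renamed copy that would then be passed to NMcheckData():

```r
## Illustrative sketch of the renaming workaround. CONC and MDVFL are
## made-up source column names; dat.check is the copy handed to the check.
dat <- data.frame(USUBJIDN = 1, CONC = 0.5, MDVFL = 0)

dat.check <- dat
names(dat.check)[names(dat.check) == "CONC"]  <- "DV"
names(dat.check)[names(dat.check) == "MDVFL"] <- "MDV"

names(dat.check)  # USUBJIDN, DV, MDV
```

dat.check would then go into NMcheckData() while the original dat keeps the source-like names.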
Hello,
My name is Eric Anderson and I work at Metrum Research Group as a Data Scientist. For starters, I just want to mention that I find this package very interesting and useful.
The function I have been exploring the most is NMdata::NMcheckData().
I have a potential feature request regarding the duplicate checking. It seems like right now the function checks across these columns: ID, CMT, EVID, TIME.
Sometimes I work with data sets that have additional columns that define unique rows (e.g. DVID, DRUGID, etc.). Have you considered adding an argument to the function that allows for this?
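To illustrate what such an argument could do, here is a rough base-R sketch. This is not NMdata's actual implementation, and by.extra is a hypothetical argument name:

```r
## Rough sketch: find rows that are duplicated on the default key
## columns plus any user-supplied extra key columns (by.extra is a
## hypothetical argument name, e.g. "DVID").
find_dups <- function(data, by.extra = NULL){
  by.cols <- c("ID", "CMT", "EVID", "TIME", by.extra)
  key <- data[, by.cols, drop = FALSE]
  data[duplicated(key) | duplicated(key, fromLast = TRUE), , drop = FALSE]
}

dat <- data.frame(ID = 1, CMT = 2, EVID = 0, TIME = c(0, 0), DVID = c(1, 2))
find_dups(dat)                     # both rows flagged without DVID in the key
find_dups(dat, by.extra = "DVID")  # no duplicates once DVID is in the key
```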
I'm trying to make a reprex for this, but the quick issue is: when I use the following call to load several data files, I get this error:
NMdata::NMscanData("NONMEM/PK_akrv2/nash18.lst", use.input=FALSE)
# Warning in NMreadTab(meta[I, file], quiet = TRUE, tab.count = tab.count, :
# Duplicated column names found: DV. Cleaning.
# Warning in NMreadTab(meta[I, file], quiet = TRUE, tab.count = tab.count, :
# Duplicated column names found: ET_QCENTP. Cleaning.
# Error in file.exists(file.mod) : invalid 'file' argument
When I traced the error a bit, it appears that file.mod is NULL. There are several places within NMscanData() where file.exists(file.mod) is called, and I don't immediately see which is causing the problem.
Most of the NONMEM data that I work with codes missing data as a period ("."). When I make an error with how I'm loading the data in R (mainly, when I'm not thinking about the loading carefully and just use read.csv() without modification), I will have those periods in the data.
I think it would be a useful feature to have NMcheckData() check whether cells in the data contain periods and suggest that those should perhaps be converted to NA with code like type_convert(data, na=c(".", "NA")).
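A base-R sketch of what such a conversion could look like (readr::type_convert() as mentioned above does this more generally; the data here is made up):

```r
## Columns read with read.csv() where NONMEM-style missing values are
## coded as "." come in as character. Replace "." with NA, then convert
## columns that are otherwise numeric.
dat <- data.frame(ID = c("1", "1", "2"),
                  DV = c("0.5", ".", "1.2"),
                  stringsAsFactors = FALSE)

dat[] <- lapply(dat, function(x){
  if(is.character(x)) x[x == "."] <- NA
  ## convert to numeric only if nothing non-numeric remains
  if(is.character(x) && !anyNA(suppressWarnings(as.numeric(x[!is.na(x)]))))
    x <- as.numeric(x)
  x
})

dat$DV  # 0.5, NA, 1.2
```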
As you can tell, I'm using NMdata on some real projects now. So, I'm having lots of good thoughts about it! :)
I often use nonstandard names for NONMEM column names because I prefer keeping them closer to the SDTM- and ADaM-like names in source data. With that, I found that the col.id argument does not appear to be used by NMcheckData(), based on the fact that it says the ID column is not found (when I think it should not be expected):
library(NMdata)
#> Warning: package 'NMdata' was built under R version 4.1.3
#> Welcome to NMdata. Best place to browse NMdata documentation is
#> https://philipdelff.github.io/NMdata/
library(tidyverse)
#> Warning: package 'ggplot2' was built under R version 4.1.3
#> Warning: package 'tibble' was built under R version 4.1.3
#> Warning: package 'dplyr' was built under R version 4.1.3
dat <- readRDS(system.file("examples/data/xgxr2.rds", package="NMdata"))
dat2 <- dat %>%
  rename(USUBJIDN = ID)
NMcheckData(dat2, col.id="USUBJIDN")
#> column check N Nid
#> EVID Subject has no obs 19 0
#> ID Column not found 1 0
#> MDV Column not found 1 0
Created on 2022-05-17 by the reprex package (v2.0.1)
I just got the error "After applying filters to input data, the resulting number of rows differ from the number of rows in output data".
It would help track down the source of the error if the number of rows in the input and output were reported. (I realize that the NMdata-preferred solution is to use a ROWID column. I'm trying to work within someone else's data management for the moment.)
My preferred error would look something like the following:
After applying filters to input data, the resulting number of rows differ (input = 123 rows) from the number of rows in output data (output = 456 rows)
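A sketch of how such an error could be raised, assuming the two row counts are available at the point the check fails (the function and variable names here are made up, not NMdata's):

```r
## Hypothetical sketch; nrow.input and nrow.output stand in for the row
## counts NMscanData() has at hand when the consistency check fails.
check_rows <- function(nrow.input, nrow.output){
  if(nrow.input != nrow.output){
    stop(sprintf(paste(
      "After applying filters to input data, the resulting number of",
      "rows (input = %d rows) differs from the number of rows in",
      "output data (output = %d rows)"),
      nrow.input, nrow.output))
  }
  invisible(TRUE)
}

## check_rows(123, 456) would fail with both row counts in the message
```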
After running NMcheckData, it would be convenient to have a way to extract subsets of data for plotting or "data scrolling". I am starting a discussion on what such a function could look like. For a "row-level" finding, one may want to extract all data related to the subjects affected, and plot involved columns.
Inputs appreciated!
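As a starting point for the discussion, one shape such a helper could take (all names here are hypothetical, and the findings table is assumed to carry the affected IDs):

```r
## Hypothetical helper: given the full data set and a findings table
## that carries an ID column, return all rows for affected subjects.
extract_affected <- function(data, findings){
  subset(data, ID %in% unique(findings$ID))
}

dat <- data.frame(ID = c(1, 1, 2, 3), TIME = c(0, 1, 0, 0), DV = c(NA, 2, 3, 4))
findings <- data.frame(ID = 1, column = "DV", check = "NA in DV")
extract_affected(dat, findings)  # the two rows for subject 1
```

The returned subset could then feed directly into plotting of the involved columns.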