philipdelff / nmdata

Prepare and document data for Nonmem - and automatically retrieve results

Home Page: https://philipdelff.github.io/NMdata/

License: Other

R 33.11% AMPL 0.02% TeX 12.81% Shell 0.02% CSS 0.02% Visual Basic 6.0 0.16% HTML 50.65% NMODL 3.22%
nonmem pharmacometrics r

nmdata's People

Contributors

andersone1, billdenney, mattdowle, philipdelff

nmdata's Issues

new test failure using data.table github master

Hi @philipdelff,
When I run R CMD check on NMdata with the new version of data.table from GitHub master, I get the following new error, which is not present when using the data.table release version from CRAN:

* checking tests ...
  Running 'spelling.R'
  Running 'testthat.R'
 ERROR
Running the tests in 'tests/testthat.R' failed.
Last 13 lines of output:
   2.   \-testthat::expect_known_value(..., update = update)
  -- Failure ('test_NMscanInput.R:152'): CYCLE=DROP ------------------------------
  `nm1` has changed from known value recorded in 'testReference/NMscanInput_7.rds'.
  Component "input.colnames": Attributes: < Names: 1 string mismatch >
  Component "input.colnames": Attributes: < Length mismatch: comparison on first 2 components >
  Component "input.colnames": Attributes: < Component 2: Attributes: < target is NULL, current is list > >
  Component "input.colnames": Attributes: < Component 2: Numeric: lengths (24, 0) differ >
  Backtrace:
      x
   1. \-testthat::expect_equal_to_reference(nm1, fileRef, version = 2) at test_NMscanInput.R:152:4
   2.   \-testthat::expect_known_value(..., update = update)
  
  [ FAIL 6 | WARN 0 | SKIP 0 | PASS 210 ]
  Error: Test failures
  Execution halted

Looking at your tests, it seems that you expect a value computed by your code to be equal to the value stored in an RDS file. The input.colnames table has a new attribute named "index", which is causing the error, since the stored RDS value has no such attribute.
Can you please update your tests and/or code so that this ERROR goes away, and then submit a new version to CRAN?
This will help facilitate releasing a new version of data.table to CRAN. (data.table devs must make sure that no reverse dependencies break before submitting a new version to CRAN.)
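
As a rough illustration of the attribute difference described above, the sketch below attaches a secondary index to a small data.table and then removes it again. The table contents are made up for illustration, and whether NMdata should strip the index or regenerate its reference files is left open. This only shows that the "index" attribute exists and can be cleared with setindex(x, NULL).

```r
library(data.table)

# Made-up stand-in for a column-name table; not NMdata's actual object.
dt <- data.table(
  variable = c("ID", "TIME", "DV"),
  included = c(TRUE, TRUE, FALSE)
)

# Create a secondary index explicitly. This sets the "index" attribute.
setindex(dt, variable)
index_before <- indices(dt)
has_attr_before <- !is.null(attr(dt, "index"))

# Remove all secondary indices so the attribute no longer exists.
setindex(dt, NULL)
index_after <- indices(dt)
has_attr_after <- !is.null(attr(dt, "index"))
```

Clearing the index before comparing against a stored reference is one possible way to keep the comparison independent of how data.table happens to index the table.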

Feature Request: Add `col.dv`, `col.mdv`, and `col.amt` arguments to `NMcheckData()`

This is somewhat related to #30 (in that it's about column naming in NMcheckData()).

I like that NMcheckData() does not require the default column names. As stated in #30, I often use names that are closer to the SDTM and ADaM source data names so that columns are easier to trace back to their origin.

That said, NMcheckData() appears to support alternative names for many, but not all, of the columns it checks. The arguments I have found missing so far are col.dv, col.mdv, and col.amt. col.evid also doesn't appear to exist, but that isn't a column I would name differently. (So, col.evid may be of interest for completeness, but it's not a real need for me.)

My current workaround is to rename the columns as they go into NMcheckData(), which is not a significant hardship.
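
The renaming workaround could look roughly like the sketch below. The source-style names (AVAL, AVALMDV, DOSE) and the data are hypothetical, and the renaming is done on a copy so the original names stay available for tracing back to the source data.

```r
library(data.table)

# Hypothetical data with source-style names (assumed for illustration).
dat <- data.table(
  ID      = c(1, 1),
  TIME    = c(0, 1),
  AVAL    = c(NA, 2.5),
  AVALMDV = c(1, 0),
  DOSE    = c(100, 0)
)

# Map from the names used in the data to the names the check expects.
rename_map <- c(AVAL = "DV", AVALMDV = "MDV", DOSE = "AMT")

# Rename on a copy; setnames() modifies its argument by reference.
dat_check <- copy(dat)
setnames(dat_check, old = names(rename_map), new = unname(rename_map))

# The renamed copy can then be passed to the check, for example:
# NMdata::NMcheckData(dat_check)
```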

more precise duplicate checking

Hello,

My name is Eric Anderson and I work at Metrum Research Group as a Data Scientist. For starters, I just want to mention that I find this package very interesting and useful.

The function I have been exploring the most is NMdata::NMcheckData().

I have a potential feature request regarding the duplicate checking. Right now, the function seems to check for duplicates across these columns: ID, CMT, EVID, TIME.

Sometimes I work with data sets that have additional columns that define unique rows (e.g., DVID, DRUGID). Have you considered adding an argument to the function that allows for this?
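
To make the request concrete, here is a minimal sketch of the difference between the two definitions of a duplicate, written with data.table rather than NMcheckData(). The data are made up, and DVID is assumed to be the extra column that distinguishes otherwise identical records.

```r
library(data.table)

# Made-up data: rows 1 and 2 share ID, CMT, EVID and TIME but differ in DVID.
dat <- data.table(
  ID   = c(1, 1, 1),
  TIME = c(0, 0, 1),
  EVID = c(0, 0, 0),
  CMT  = c(2, 2, 2),
  DVID = c(1, 2, 1)
)

default_cols <- c("ID", "CMT", "EVID", "TIME")

# With the default columns only, row 2 is flagged as a duplicate of row 1.
n_dup_default <- sum(duplicated(dat, by = default_cols))

# Adding DVID to the key makes the two rows distinct.
n_dup_extended <- sum(duplicated(dat, by = c(default_cols, "DVID")))
```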

Cannot use NMscanData with `use.input=FALSE`

I'm trying to make a reprex for this, but in short:

When I use the following call to load several data files, I get the warnings and error shown below it:

NMdata::NMscanData("NONMEM/PK_akrv2/nash18.lst", use.input=FALSE)
# Warning in NMreadTab(meta[I, file], quiet = TRUE, tab.count = tab.count,  :
#   Duplicated column names found: DV. Cleaning.
# Warning in NMreadTab(meta[I, file], quiet = TRUE, tab.count = tab.count,  :
#   Duplicated column names found: ET_QCENTP. Cleaning.
# Error in file.exists(file.mod) : invalid 'file' argument

When I traced the error a bit, it appears that file.mod is NULL. There are several places within NMscanData() where file.exists(file.mod) is called, and I don't immediately see which is causing the problem.
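
The error message itself can be reproduced in base R, which is consistent with file.mod being NULL. The sketch below shows that and one possible guard. safe_exists() is a hypothetical helper, not part of NMdata, and this is not a claim about how NMscanData() should be fixed.

```r
# file.exists() rejects NULL instead of returning FALSE.
file.mod <- NULL
err <- tryCatch(
  file.exists(file.mod),
  error = function(e) conditionMessage(e)
)

# Hypothetical guard: treat NULL (or anything that is not a single path)
# as "does not exist" before calling file.exists().
safe_exists <- function(path) {
  !is.null(path) && length(path) == 1L && file.exists(path)
}

guarded <- safe_exists(file.mod)
```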

Feature request: NMcheckData look for periods

Most of the NONMEM data that I work with codes missing data as a period ("."). When I make an error with how I'm loading the data in R (mainly, when I'm not thinking about the loading carefully and just use read.csv() without modification), I will have those periods in the data.

I think it would be useful for NMcheckData() to check whether any cells in the data contain a period and, if so, suggest converting them to NA with code like type_convert(data, na=c(".", "NA")).
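
A minimal sketch of what such a check could look for, using base R only. The CSV text is made up, and the detection rule (a character column containing a lone ".") is an assumption about what the check would do, not existing NMcheckData() behavior.

```r
# Made-up NONMEM-style CSV where missing DV is coded as ".".
csv_text <- "ID,TIME,DV\n1,0,.\n1,1,2.5\n"

# Reading without modification leaves DV as a character column.
raw <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Flag character columns that contain a lone period.
has_period <- vapply(
  raw,
  function(x) is.character(x) && any(trimws(x) == ".", na.rm = TRUE),
  logical(1)
)
flagged <- names(has_period)[has_period]

# Declaring "." as a missing-value string at read time avoids the problem.
clean <- read.csv(text = csv_text, na.strings = c(".", "NA"))
```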

`col.id` not used with `NMcheckData()`

As you can tell, I'm using NMdata on some real projects now. So, I'm having lots of good thoughts about it! :)

I often use nonstandard names for NONMEM columns because I prefer keeping them closer to the SDTM and ADaM-like names in the source data. With that, I found that the col.id argument does not appear to be used by NMcheckData(): it reports that the ID column is not found, even though I think ID should not be expected when col.id is set:

library(NMdata)
#> Warning: package 'NMdata' was built under R version 4.1.3
#> Welcome to NMdata. Best place to browse NMdata documentation is
#> https://philipdelff.github.io/NMdata/
library(tidyverse)
#> Warning: package 'ggplot2' was built under R version 4.1.3
#> Warning: package 'tibble' was built under R version 4.1.3
#> Warning: package 'dplyr' was built under R version 4.1.3
dat <- readRDS(system.file("examples/data/xgxr2.rds", package="NMdata"))

dat2 <-
  dat %>%
  rename(
    USUBJIDN=ID
  )
NMcheckData(dat2, col.id="USUBJIDN")
#>  column              check  N Nid
#>    EVID Subject has no obs 19   0
#>      ID   Column not found  1   0
#>     MDV   Column not found  1   0

Created on 2022-05-17 by the reprex package (v2.0.1)

Feature Request: NMscanData() to give more info on error

I just got the error "After applying filters to input data, the resulting number of rows differ from the number of rows in output data"

It would help track down the source of the error if the number of rows in the input and output were reported. (I realize that the NMdata-preferred solution is to use a ROWID column. I'm trying to work within someone else's data management for the moment.)

My preferred error would look something like the following:

After applying filters to input data, the resulting number of rows differ (input = 123 rows) from the number of rows in output data (output = 456 rows)
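
A message of that shape is straightforward to build with sprintf(). The sketch below is only an illustration of the requested wording. check_row_counts() is a hypothetical helper and does not reflect how NMscanData() is implemented.

```r
# Hypothetical helper that raises the requested, more informative error.
check_row_counts <- function(n_input, n_output) {
  if (n_input != n_output) {
    stop(
      sprintf(
        paste(
          "After applying filters to input data, the resulting number of",
          "rows differ (input = %d rows) from the number of rows in output",
          "data (output = %d rows)"
        ),
        as.integer(n_input), as.integer(n_output)
      ),
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# Capture the message for the row counts used in the example above.
msg <- tryCatch(
  check_row_counts(123, 456),
  error = function(e) conditionMessage(e)
)
```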

Extract subsets of data related to NMcheckData findings

After running NMcheckData, it would be convenient to have a way to extract subsets of data for plotting or "data scrolling". I am starting a discussion on what such a function could look like. For a "row-level" finding, one may want to extract all data related to the subjects affected, and plot involved columns.

Input appreciated!
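
As a starting point for that discussion, here is one possible shape for the row-level case. The structure of the findings table (columns ROW, ID, column, check) is an assumption made for this sketch and may not match what NMcheckData() returns. The data are made up.

```r
library(data.table)

# Made-up data set with a row identifier.
dat <- data.table(
  ROW  = 1:6,
  ID   = c(1, 1, 2, 2, 3, 3),
  TIME = c(0, 1, 0, 1, 0, 1),
  DV   = c(0.1, 0.4, 0.2, 0.5, 0.3, 0.6)
)

# Hypothetical row-level finding affecting row 4 (subject 2), column TIME.
findings <- data.table(
  ROW    = 4L,
  ID     = 2,
  column = "TIME",
  check  = "hypothetical finding"
)

# All rows for the affected subjects, for plotting or scrolling through.
affected <- dat[ID %in% unique(findings$ID)]

# Columns involved in the findings, plus the identifiers.
cols_involved <- unique(c("ROW", "ID", findings$column))
affected_view <- affected[, cols_involved, with = FALSE]
```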
