
datim-validation's Introduction

Introduction

This package has been developed to assist the PEPFAR Data Exchange Community with validation of their data import payloads. Various checks have been developed to allow users to identify and validate their import payloads prior to submission. In order to use this package, you will need an active account on DATIM. Other pre-requisites are listed in the DESCRIPTION file of this package.

datim-validation

You will need at least version 4.1 of R installed.

To get started, be sure you have installed the devtools package with

install.packages('devtools')

Once devtools is installed, you can install the datim-validation package with

library(devtools)
install_git('https://github.com/pepfar-datim/datim-validation')

If you prefer, you can download the source of the package with the vignettes and build these as well (recommended for new users).

devtools::install_github('https://github.com/pepfar-datim/datim-validation',build_vignettes = TRUE)

This package uses renv, which allows for fine-grained control of dependencies. While not required, it is recommended that you use renv to ensure that the versions of dependencies in your local environment are the same as those which a particular version of datimvalidation has been tested against. You can install renv with the following command.

install.packages("renv")

Once renv has been installed, you can load all of the dependencies as defined in the lock file with the following command.

renv::restore()

This command will restore and build a local copy of all dependencies in an isolated environment on your machine, and will not affect the global R environment.

A very basic script is provided below as an example of how you can validate your data payload. You should consult the datimutils package documentation for details regarding usage of the loginToDATIM function. Also consult this package's function documentation for more information on each validation function.

require(datimvalidation)
require(datimutils)
require(magrittr)
datim_config <- "/home/littebobbytables/.secrets/datim.json"
# Log in using the credentials stored in the config file defined above.
loginToDATIM(datim_config)


#Adjust this to point to the location of your exchange file.
datafile <- "/home/littebobbytables/mydata.csv"
d <- d2Parser(filename = datafile, type = "csv", datastream = "RESULTS") 
#Run all recommended validations. Note that this process may take several minutes.
d <- runValidation(d)

You can check any messages which may have been generated during the parsing and validation of your data.

print(d$info$messages)

Any messages with a level of "ERROR" generally indicate problems which must be fixed prior to import. Messages with a level of "WARNING" may prevent your data from being imported, and should be carefully reviewed. Messages with a level of "INFO" are provided for your information only.
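As a rough sketch of how the three levels might be triaged, the example below sorts a stand-in message table by severity and keeps the rows that need attention. The data frame and its column names (`message`, `level`) are assumptions made for illustration; inspect `str(d$info$messages)` to see the actual structure of your object.

```r
# Stand-in for d$info$messages. The column names `message` and `level`
# are an assumption, not the package's documented structure.
messages <- data.frame(
  message = c("2 duplicate rows removed", "Invalid period found", "File parsed"),
  level = c("WARNING", "ERROR", "INFO"),
  stringsAsFactors = FALSE
)

# Order by severity so that blocking problems are listed first.
severity <- c(ERROR = 1, WARNING = 2, INFO = 3)
messages <- messages[order(severity[messages$level]), ]

# Keep only the messages that need action or review before import.
blocking <- messages[messages$level %in% c("ERROR", "WARNING"), ]
print(blocking)
```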

You can export your data to a CSV file with the following command.

write.table(d$data$import,
            file = "my_export_file.csv",
            quote = TRUE, sep = ",", na = "")

Note that this file will only contain data which has been deemed to be valid by the parser. For instance, duplicate and blank rows from the original file will be removed. Also, all identifiers will be converted to IDs.

datim-validation's People

Contributors

bangadennis, cnemarich, hiwot-chichaybelu, jason-p-pickering, vshioshvili

datim-validation's Issues

Error in sims parser R script

An error is thrown when running the sims parser against the hrsa data file:

d=sims2Parser(file=filename)
Error in if ((start_date - end_date) < nrow(bar)) { :
missing value where TRUE/FALSE needed
In addition: Warning message:
In sims2Parser(file = filename) :
214 rows are incomplete. Please check your file to ensure its correct.

Data file is in zendesk ticket: https://datim.zendesk.com/agent/tickets/6189

Need to validate data element /orgunit assignments

Currently, it is not possible to validate compliance with data set assignments. We need to develop a map of orgunits/data elements, derived from data set assignments. Likely, this needs to be supported by an SQL view, which would provide the appropriate level of the Facility/Community grouping by OU. Otherwise, the map from the server could potentially be very large to get all orgunit group assignments by OU.

How to download data

Hi everyone,
Any clue on how to download data for a specific reporting period (dataElement, categoryOptionCombo, attributeOptionCombo, orgUnit, value) using R or by querying from the browser?

Thanks
G

sims parser exporting assessment id

Jason, the sims parser exports storedby/timestamp/comment only when shifting happens. When the sims parser is applied but there is no shifting, the resulting data frame does not have those three columns. May I suggest adding support for that?

I believe it would only involve adding the following three lines in the else statement on line 193, so instead of

else { data_shifted<-data }

it would be:

  else { 
    data_shifted<-data 
    data_shifted$storedby<-NA
    data_shifted$timestamp<-NA
    data_shifted$comment<-data_shifted$assessmentid
  }

Getting Invalid OrgUnit's script

Dear Jason
I'm using this repository to validate my XML file and I would like to validate the OrgUnit codes by running a script.

Can you please help ?

Green-EyE

bug in validateData

validateData returns the following error:

Error in getValidDataElements(datasets = ds) : object 'ds' not found

Is it this line?
validDataElements<-getValidDataElements(datasets=ds)

Issue with Authentication for APIs

Hi,

I am facing issues trying to get data from the datim-dev site API when a call is made to https://dev-de.datim.org.

Below is the script that is used. The underlying issue is that it fails in the getDataElementMap function: authentication fails and the request is redirected to the login page. Please help me fix this issue. I appreciate your help.

Script that I am trying to execute:

require(datimvalidation)
username = "XXXX"
password ="XXXX"
base.url = "https://dev-de.datim.org"
setwd("C:/WorkSpace/SIMS")
filename<-"C:/WorkSpace/SIMS/Testing/abovesite.csv"

organisationUnit = "IH1kchw86uA"

d <- d2Parser(file = filename, type = "csv",
              organisationUnit = organisationUnit,
              dataElementIdScheme = "code", orgUnitIdScheme = "id",
              idScheme = "code", invalidData = TRUE)

Function where issue is coming getDataElementMap
Line of code: r<-httr::GET(url,httr::timeout(60))

observations:

Determine technical approach for deprecated rules

Hi @jason-p-pickering

While validating Malawi's import file ZD10488, I noticed that datim-validation is subjecting the data to this validation rule

N.B Both rules refer to the same left side element

The data entry form in DATIM applies the correct rule as seen below

correct validation rule

Please advise.
Thanks.

FYI @talexie @ManishUNC

Saving function on windows machine throws an exception

Hello Jason,
I am using a Windows machine and running the R validation package. This runs okay and I get the output, but when I try to save, I cannot run this command directly

write.csv(violation,file='uganda_violations_Sol.csv')
when I do I get this error

Error in .External2(C_writetable, x, file, nrow(x), p, rnames, sep, eol, :
unimplemented type 'list' in 'EncodeElement'

but when I first run the following commands in sequence then I am able to save the output.

violation<-plyr::colwise(as.character)(violation)
write.csv(violation,file='uganda_violations_Sol.csv')

Thank you
Solomon Kununka (Uganda)
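The error message suggests that at least one column of the violations table is a list rather than an atomic vector. A dependency-free alternative to the plyr workaround above might look like the sketch below. The `violation` data frame here is made up for illustration, and `flatten_list_columns` is a hypothetical helper, not part of the package.

```r
# Made-up violations table with a list column, similar in shape to an
# object that triggers "unimplemented type 'list' in 'EncodeElement'".
violation <- data.frame(rule = c("R1", "R2"), stringsAsFactors = FALSE)
violation$orgUnits <- list(c("a", "b"), "c")

# Collapse every list column into a single character column.
flatten_list_columns <- function(df, sep = ";") {
  is_list_col <- vapply(df, is.list, logical(1))
  if (!any(is_list_col)) {
    return(df)
  }
  df[is_list_col] <- lapply(df[is_list_col], function(col) {
    vapply(col, function(x) paste(as.character(x), collapse = sep), character(1))
  })
  df
}

flat <- flatten_list_columns(violation)
out_file <- tempfile(fileext = ".csv")
write.csv(flat, file = out_file, row.names = FALSE)
```

Unlike converting every column with as.character, this only touches the list columns and leaves numeric columns as they are.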

Validation Rule Error

Hi JP
The validateData is returning the following error

Error in $<-.data.frame(*tmp*, "orgUnit", value = character(0)) :
replacement has 0 rows, data has 3655

Any clue what is happening?

Thanks
G

Need to check that the mode of the files actually is correct

Prior to remapping, we actually need to check that the stated mode of the file is what it says that it is

Handle cases like this with data element mode = "code", but with a line of data like

{"value":303,"attributeOptionCombo":"k5e1sDGK1GP","categoryOptionCombo":"NULL","dataElement":"NULL",
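One way such a check could work is sketched below. DHIS2 UIDs are 11 characters long, starting with a letter and followed by letters or digits, so values declared as "id" can be tested against that pattern, while literal "NULL", empty, or missing values can be flagged under any scheme. The function names are hypothetical and not part of the package.

```r
# DHIS2 UIDs: one letter followed by ten letters or digits.
is_dhis2_uid <- function(x) {
  !is.na(x) & grepl("^[A-Za-z][A-Za-z0-9]{10}$", x)
}

# Hypothetical helper: return the distinct values which do not conform
# to the stated identifier scheme.
nonconforming_ids <- function(values, scheme = c("id", "code")) {
  scheme <- match.arg(scheme)
  # Literal "NULL", empty strings and NA are never valid identifiers.
  missing_like <- is.na(values) | values %in% c("", "NULL")
  if (scheme == "id") {
    bad <- missing_like | !is_dhis2_uid(values)
  } else {
    bad <- missing_like
  }
  unique(values[bad])
}

nonconforming_ids(c("NULL", "HTS_TST_N"), scheme = "code")
nonconforming_ids(c("k5e1sDGK1GP", "HTS_TST_N"), scheme = "id")
```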

Validate secrets file

Currently, if the secrets file is not properly constructed, e.g. a missing trailing slash in the baseurl, the scripts will not work.

Implement methods to validate the secrets file and provide feedback to the user.
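A minimal sketch of such a check is shown below. It assumes the secrets file has already been parsed into a list (for example with jsonlite::fromJSON) and that it contains a top-level `dhis` entry with `baseurl`, `username` and `password` fields. Those field names are assumptions, and `validate_secrets` is a hypothetical function.

```r
# `secrets` is assumed to be the parsed secrets file as a nested list.
# Returns a character vector of problems, empty if none were found.
validate_secrets <- function(secrets) {
  dhis <- secrets[["dhis"]]
  if (is.null(dhis)) {
    return("Top-level 'dhis' entry is missing.")
  }
  problems <- character(0)
  for (field in c("baseurl", "username", "password")) {
    value <- dhis[[field]]
    if (is.null(value) || !nzchar(value)) {
      problems <- c(problems, sprintf("Field '%s' is missing or empty.", field))
    }
  }
  baseurl <- dhis[["baseurl"]]
  if (!is.null(baseurl) && nzchar(baseurl)) {
    if (!grepl("^https?://", baseurl)) {
      problems <- c(problems, "baseurl should start with http:// or https://.")
    }
    if (!endsWith(baseurl, "/")) {
      problems <- c(problems, "baseurl is missing a trailing slash.")
    }
  }
  problems
}

# Example: a base URL without the trailing slash.
secrets <- list(dhis = list(baseurl = "https://www.datim.org",
                            username = "user", password = "secret"))
validate_secrets(secrets)
```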

error with mechanism validation method

When trying to validate mechanisms in parsed data frame, I get the following error:

getInvalidMechanisms(d2,ISO="2016Q4")
Error in getMechanismsMap(organisationUnit, ISO) : unused argument (ISO)

Returning strange error for getOrganisationUnitMap

I'm getting a strange error after I load datimvalidation and set username, password, base.url and organisationUnit

getOrganisationUnitMap(base.url,username, password,organisationUnit)
Error in getOrganisationUnitMap(base.url, username, password, organisationUnit) :
unused arguments (username, password, organisationUnit)

Can you assist?

Thanks
G

Issue trying to run getValidOperatingUnits.R function

Hi
I'm trying to list all valid Operating Units using "getValidOperatingUnits", but I'm getting the following error :

check_oU <- getValidOperatingUnits (base.url, username, password)
Error: could not find function "getValidOperatingUnits"

Can you clarify what is wrong?

Thanks in advance
Green-EyE

sims parser invalidData bug

In sims2Parser, there is a bug that makes the invalidData parameter pointless. On line 133, the if statement checks whether there are invalid records and, if so, removes them instead of using the invalidData parameter to decide. I believe it should be this:

if (!invalidData) {
  data <- data[!invalid.rows, ]
}

instead of current:

if (sum(invalid.rows)) {
  data <- data[!invalid.rows, ]
}

Invalid Mechanism

Hi Jason
I'm running " getInvalidMechanisms(d,base.url,username,password,organisationUnit, validityDate ='2016-09-29')" and it's returning one invalid Mechanism "18275". But I can see it on Datim Pivot table.

Can I have your assistance, please?

Thanks
G

Cache resources

Currently, cacheable resources are not being cached. Look into the ability to cache things, so that multiple calls to the same endpoint are not required.
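One simple approach would be an in-memory cache keyed by URL, sketched below. `cached_get` and the `fetch` argument are hypothetical names; `fetch` stands in for whatever function performs the HTTP request and parses the response.

```r
# In-memory cache keyed by URL, kept in its own environment.
.cache <- new.env(parent = emptyenv())

# `fetch` is any function which takes a URL and returns the parsed response.
cached_get <- function(url, fetch) {
  if (!exists(url, envir = .cache, inherits = FALSE)) {
    assign(url, fetch(url), envir = .cache)
  }
  get(url, envir = .cache, inherits = FALSE)
}

# Usage with a fake fetcher which counts how often it is called.
calls <- 0
fake_fetch <- function(url) {
  calls <<- calls + 1
  paste("response for", url)
}

first <- cached_get("api/dataElements", fake_fetch)
second <- cached_get("api/dataElements", fake_fetch)
calls
```

The cache lives only for the duration of the R session, so metadata changes on the server are picked up after a restart.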

Validation is not working for LAB Indicators

Dear team,
The validation rule check fails when I try to run a file containing only LAB indicators:

vr_violations<-validateData(data = d,
return_violations_only = TRUE,
parallel = FALSE )
Warning message:
In validateData(data = d, return_violations_only = TRUE, parallel = FALSE) :
NAs introduced by coercion

Regards
Cicero

Need to test for payload validity

Currently, the script expects things like "attributeOptionCombo", but if "AttributeOptionCombo" is in the XML file, it will cause all sorts of errors. We need to validate that the attributes in JSON/XML payloads and the column order in CSV payloads are actually correct.
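A case-sensitive name check could be sketched as follows. The list of expected names is an assumption covering the six core fields of a data value payload, and `check_payload_names` is a hypothetical helper.

```r
# Assumed set of expected attribute or column names.
expected_columns <- c("dataElement", "period", "orgUnit",
                      "categoryOptionCombo", "attributeOptionCombo", "value")

check_payload_names <- function(found, expected = expected_columns) {
  exact <- found %in% expected
  # Names which match only when case is ignored, e.g. "AttributeOptionCombo".
  case_only <- !exact & tolower(found) %in% tolower(expected)
  list(
    missing = setdiff(expected, found),
    wrong_case = found[case_only],
    unexpected = found[!exact & !case_only],
    # For CSV payloads, the order matters as well as the names.
    order_ok = identical(found, expected)
  )
}

res <- check_payload_names(c("dataElement", "period", "orgUnit",
                             "categoryOptionCombo", "AttributeOptionCombo",
                             "value"))
res$wrong_case
```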

File Validation error

Hi Jason,

I am getting the following error:
Error in [.data.frame(data, ,1:3) : undefined columns selected


validating mechanisms

Hey Jason,

checkMechanismPeriodValidity says: "This function will return an object of invalid mechanisms and periods"; but when running with a file that has an invalid mechanism (when importing file that has code for mechanisms, it converts all valid mechanisms to UIDs, but invalid stays same - e.g. I put in 17x269) it only flags periods, but does not flag invalid mechanisms.

sims parser flags invalid mechanisms on import, and shows a warning that record is being excluded, while d2parser imports it and checkMechanismPeriodValidity does not flag it.

Thanks,
Vlad

Need to validate against approvals

Data may be subject to approval level locking.

Use the approvals API to get (for a given mechanism or list of mechanisms) their current approval status.
If the approval status does not allow for data import at this point in time, a message should be displayed.

use of unwise characters in DHIS2 urls

In a number of places, the datim-validation library uses characters deemed unwise, which Tomcat has started to disallow. In 7.0.90, a URL containing [ and/or ] gets trimmed and results in a 400 response from DHIS2. In 7.0.85, Tomcat did not filter those out.

As DHIS2 uses [ and ] for providing array of field names (at least that is the main use that appears in the datim-validation), they need to be escaped.

I do not have a full list of places, but here is an example: getDataElementMap.R includes [ and ] in the URL, and with Tomcat 7.0.90, calling the getDataElementMap function results in a null, as opposed to the data element map in 7.0.85.
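As an illustration of the escaping, the brackets can be percent-encoded before the request is made. The helper name and the example URL below are made up; only the substitution itself ([ becomes %5B, ] becomes %5D) is the point.

```r
# Percent-encode square brackets in a URL.
escape_brackets <- function(url) {
  url <- gsub("[", "%5B", url, fixed = TRUE)
  gsub("]", "%5D", url, fixed = TRUE)
}

# Example URL is illustrative only.
url <- "https://example.org/api/dataElements?fields=id,code,categoryCombo[id]"
escaped <- escape_brackets(url)
print(escaped)
```

Note that utils::URLencode with its default reserved = FALSE leaves [ and ] untouched, which is why an explicit substitution is used here.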

supporting header-less file in d2parser

Jason,

In the sims workflow I use both d2parser and sims2parser. I use d2parser output for two reasons: to check if there are duplicates, and to have the "original" period-assessment id mapping. sims2parser shifts dates, and one of the outputs agencies asked for is a summary of assessments that got shifted; I use the outputs of the two parsers to generate that.

In the original SIMS guidance, agencies were told that they can send files with the header missing if they so desire. Unfortunately, most agencies are following that suggestion. sims2parser has a boolean option that treats a missing header appropriately. d2parser does not have that, and ends up ignoring the first row if the header is not present.

Would it be possible to add missing header option to d2parser as well, similar to sims2parser? I could try to make the change and do a pull request if that would make things easier.
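For what it is worth, a header check could also be done automatically rather than through a boolean option. The sketch below peeks at the first field of the file and only treats the first row as a header when it matches the expected column name. Both function names and the column list are assumptions for illustration, not the parser's actual code.

```r
# Decide whether the first line of a CSV file is a header by comparing
# its first field with the expected first column name.
has_header <- function(path, expected_first = "dataElement") {
  first_line <- readLines(path, n = 1, warn = FALSE)
  if (length(first_line) == 0) {
    return(FALSE)
  }
  first_field <- strsplit(first_line, ",", fixed = TRUE)[[1]][1]
  first_field <- gsub('"', "", first_field, fixed = TRUE)
  identical(tolower(trimws(first_field)), tolower(expected_first))
}

# Read the file with or without a header and apply standard column names.
read_payload <- function(path, col_names) {
  d <- read.csv(path, header = has_header(path), colClasses = "character",
                stringsAsFactors = FALSE)
  n <- min(ncol(d), length(col_names))
  names(d)[seq_len(n)] <- col_names[seq_len(n)]
  d
}

# Example with a header-less file of two rows.
cols <- c("dataElement", "period", "orgUnit",
          "categoryOptionCombo", "attributeOptionCombo", "value")
path <- tempfile(fileext = ".csv")
writeLines(c("de1,2017Q1,ou1,coc1,aoc1,5", "de2,2017Q1,ou1,coc1,aoc1,7"), path)
d <- read_payload(path, cols)
nrow(d)
```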

Need to check orgunit-dataset assignment viability

In any given payload, we may have a combination of different dataset assignment combinations. We need to test for validity of the orgunit-data element combinations, based on data set assignments and then merging this with available data elements per data set.
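The merge described above might be sketched like this. All three data frames are made-up stand-ins: in practice the orgunit-to-dataset and dataset-to-data-element tables would come from the server.

```r
# Made-up orgunit -> dataset assignments.
ou_datasets <- data.frame(orgUnit = c("ou1", "ou2"),
                          dataSet = c("ds1", "ds2"),
                          stringsAsFactors = FALSE)

# Made-up dataset -> data element membership.
ds_elements <- data.frame(dataSet = c("ds1", "ds2"),
                          dataElement = c("de1", "de2"),
                          stringsAsFactors = FALSE)

# Valid orgunit / data element combinations, derived by joining on dataSet.
valid <- unique(merge(ou_datasets, ds_elements,
                      by = "dataSet")[, c("orgUnit", "dataElement")])

# Made-up payload: the second row uses a data element not assigned to ou1.
payload <- data.frame(orgUnit = c("ou1", "ou1"),
                      dataElement = c("de1", "de2"),
                      value = c("1", "2"),
                      stringsAsFactors = FALSE)

# Flag payload rows whose combination is not in the valid set.
key <- function(df) paste(df$orgUnit, df$dataElement, sep = "|")
invalid <- payload[!key(payload) %in% key(valid), ]
print(invalid)
```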

sims parser date shift details

Hey Jason, I think it might be useful if sims2parser could spit out the shift actions that took place as the result of the method call; as the method returns altered data frame, I don't think it can return it, but i was thinking, when processing is done, before returning, it could print out count of affected assessments and the list of those affected (assessment id, old period, new period).
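A rough sketch of what such a summary could look like is below. It assumes two data frames with `assessmentid` and `period` columns, one taken before and one after shifting, with one row per assessment. Those column names and the function name are assumptions.

```r
# Compare periods before and after shifting, and report the changes.
summarise_shifts <- function(original, shifted) {
  merged <- merge(original, shifted, by = "assessmentid",
                  suffixes = c("_old", "_new"))
  changed <- merged[merged$period_old != merged$period_new,
                    c("assessmentid", "period_old", "period_new")]
  message(nrow(changed), " assessment(s) were shifted.")
  changed
}

# Made-up example: assessment A2 is moved forward by one day.
original <- data.frame(assessmentid = c("A1", "A2"),
                       period = c("20170101", "20170101"),
                       stringsAsFactors = FALSE)
shifted <- data.frame(assessmentid = c("A1", "A2"),
                      period = c("20170101", "20170102"),
                      stringsAsFactors = FALSE)

changed <- summarise_shifts(original, shifted)
print(changed)
```

Printing through message() keeps the return value of the parser unchanged, which matches the suggestion above.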
