
Comments (8)

cmrn-rhi commented on July 17, 2024

Testing "Sample Collection Date Unit"

Branch: data-bucket
Testing Date: 2021-02-10

I have just done some testing on sample collection date unit and haven't found any issues with importing, copy-pasting, or using the picklist to enter and validate values. All work fine. No matter how I add the date unit, the sample collection date appears to be automatically reformatted with 01 pseudo-values (before the validation step). E.g. if I import 2020 it becomes 2020-01-01 when I select year, and if I paste 2020 it becomes 2020-01-01 even if no unit has been selected.

The only usability concern I have is that if someone adds a sample collection date unit within the DataHarmonizer after already having sample collection dates, they could accidentally overwrite values in their sample collection date. E.g. if I have 2020-02-18 and then accidentally select month instead of year, the date changes to 2020-02-01.
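
To make the overwrite concern concrete, here is a minimal Python sketch of the kind of unit-based truncation described above. It is not the DataHarmonizer's actual code (which is JavaScript); the function name `truncate_to_unit` and the accepted unit strings are assumptions for illustration only.

```python
from datetime import date


def truncate_to_unit(value: str, unit: str) -> str:
    """Reset the parts of a full ISO date that are finer than `unit` to 01.

    Hypothetical helper: the name and the unit strings are assumptions,
    not taken from the DataHarmonizer source.
    """
    d = date.fromisoformat(value)  # expects a full YYYY-MM-DD string
    if unit == "year":
        d = d.replace(month=1, day=1)
    elif unit == "month":
        d = d.replace(day=1)
    elif unit != "day":
        raise ValueError(f"unknown unit: {unit!r}")
    return d.isoformat()


original = "2020-02-18"
print(truncate_to_unit(original, "month"))  # 2020-02-01
print(truncate_to_unit(original, "year"))   # 2020-01-01
```

If the result is written back over the cell, the original day (18) cannot be recovered, which is the risk when the wrong unit is picked by mistake.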

from dataharmonizer.

cmrn-rhi commented on July 17, 2024

I tested importing with all eligible file types using modified (and updated) versions of the validTestData as well as the test file provided by damion. However, when I did some tests on the modified (and updated) version of the invalidTestData I noticed 2020 wasn't automatically converting to 2020-01-01. I tried seeing what would happen if I paired 2020 with year, month, and day and the result was the following:

2020|day;2020-01-01|year;2020-__01|month

I'm not certain why this is happening, but fortunately the validation process will always catch and draw attention to these occurrences.

Edit:
Input: DH1311p_collection-date-unit_test-05 (invalid data - 2020 testing).csv
Output: DH1311p_collection-date-unit_test-05-output (invalid data - 2020 testing).csv


cmrn-rhi commented on July 17, 2024

Test Files:

DH Test_2021-02-10 (sample collection date unit).zip

  • DH1311p_collection-date-unit_test-01 (valid data).csv
  • DH1311p_collection-date-unit_test-02 (damion's file).xlsx
  • DH1311p_collection-date-unit_test-03 (improper date unit pairs).csv
  • DH1311p_collection-date-unit_test-04 (invalid data).csv
  • DH1311p_collection-date-unit_test-05 (invalid data - 2020 testing).csv
  • DH1311p_collection-date-unit_test-05-output (invalid data - 2020 testing).csv
  • DataHarmonizer-exampleInput_0.13.11 (pre-release)
    • invalidTestData.csv
    • validTestData.csv
    • validTestData.tsv
    • validTestData.xls
    • validTestData.xlsx

I only saved the output when there were unexpected results; in the future I will include the output regardless of the results.

Edit:
"DH" stands for "DataHarmonizer"
"DH1311p" stands for "DataHarmonizer version 0.13.11 pre-release"


ddooley commented on July 17, 2024

I've made a change so that when a spreadsheet is loaded, the program no longer tries to automatically correct dates into a yyyy-mm-dd format. E.g. "2020" in a date field was getting converted into 2020-01-01 on load, but now it remains 2020. That way a user can manually adjust any date rather than the program making assumptions about what it should be converted to. Such values will trigger a validation error to highlight the ones that need correction.

The reason a "day" setting kept 2020 as-is is that I didn't want to make assumptions about the day and month components of what was only a year.
Similarly for month, it's prompting the user for a month when only a year is given. In that case it assumes the day is 01.


ddooley commented on July 17, 2024

Also, no changes are automatically made any more to month/year/day granularity. (I did this by renaming the "sample collection date unit" field to "sample collection date precision", since the program still invokes the auto-update on any date + unit field.) Instead, any given date is converted to the given date granularity only on export to a particular target database.
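
As a rough illustration of converting only at export time, the Python sketch below builds an exported copy of a row and leaves the stored row untouched. It is not the DataHarmonizer's export code; the function `export_row`, the row-as-dict layout, and the precision strings are assumptions, and only the two field names come from this thread.

```python
from datetime import date

DATE_FIELD = "sample collection date"
PRECISION_FIELD = "sample collection date precision"


def export_row(row: dict) -> dict:
    """Return a copy of `row` with the date reduced to its stated precision.

    Hypothetical export step: real export templates may differ per target
    database.
    """
    out = dict(row)  # shallow copy, so the stored row is not modified
    d = date.fromisoformat(row[DATE_FIELD])
    precision = row.get(PRECISION_FIELD)
    if precision == "year":
        d = d.replace(month=1, day=1)
    elif precision == "month":
        d = d.replace(day=1)
    # "day" (or a missing precision) leaves the date unchanged
    out[DATE_FIELD] = d.isoformat()
    return out


stored = {DATE_FIELD: "2020-02-18", PRECISION_FIELD: "month"}
exported = export_row(stored)
print(exported[DATE_FIELD])  # 2020-02-01
print(stored[DATE_FIELD])    # 2020-02-18
```

Because the conversion happens on a copy, picking the wrong precision in the editor no longer destroys the date the user entered.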


cmrn-rhi commented on July 17, 2024

Export Testing

I double-checked this (while testing the CanCOGen-vocabulary-fix branch), and sample collection date precision combined with sample received date behaved as you described when imported and exported.

Example

Import:

DH1315_CNPHI-Export_test-01-input (sample collection date precision)

Export (CNPHI):

DH1315_CNPHI-Export_test-01-output (sample collection date precision)

Attachments:
DH-Test_2021-02-21 (CNPHI Export - date precision).zip


cmrn-rhi commented on July 17, 2024

Date Auto-Update Concern

The program is invoking the auto-update on any date field, not just the date + unit field pairs.

The following fields have the auto-date formatting that ensures there are values for year, month, and day, but they don't have a paired date precision column to clarify that these are not actually dated "YYYY-01-01".

  • symptom onset date
  • vaccination date
  • most recent travel departure date
  • most recent travel return date
  • sequencing date

Attachments:
DH-Test_2021-02-21 (CanCOGeN vocabulary fix).zip


ddooley commented on July 17, 2024

The date auto-format function (which would be applied to all dates in a loaded spreadsheet on load) has been removed from date fields across the board, so malformed dates remain as is and are only highlighted when one presses "Validate".
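
The "flag, don't fix" behaviour can be sketched in Python as follows. This is only an illustration of the idea, not the DataHarmonizer's validator; the names `is_valid_date` and `find_invalid_dates` and the strict YYYY-MM-DD rule are assumptions.

```python
import re
from datetime import date

# Assumed rule: a valid date is a full, real calendar date written as YYYY-MM-DD.
ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def is_valid_date(value: str) -> bool:
    """Check the shape first, then check that the date exists on the calendar."""
    if not ISO_DATE.fullmatch(value):
        return False
    try:
        date.fromisoformat(value)
    except ValueError:
        return False
    return True


def find_invalid_dates(values):
    """Return (index, value) pairs to highlight; the values are never changed."""
    return [(i, v) for i, v in enumerate(values) if not is_valid_date(v)]


column = ["2020-02-18", "2020", "2020-13-01"]
print(find_invalid_dates(column))  # [(1, '2020'), (2, '2020-13-01')]
print(column)                      # unchanged: ['2020-02-18', '2020', '2020-13-01']
```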

