ukdataservice / qamd

QAMyData, a data quality assurance tool for SPSS, STATA, SAS and CSV files.

License: MIT License

Rust 98.27% Shell 0.29% Makefile 0.46% JavaScript 0.98%
spss stata data-quality qa readstat quality assurance

qamd's Introduction

QAMyData

QAMyData offers a free, easy-to-use tool that automatically detects some of the most common problems in survey and other numeric data and creates a ‘data health check’, assisting with the clean-up of data and providing assurance that the data is of high quality.

Getting Started

See the Wiki for more information!

Authors

  • Myles Offord

See also the list of contributors who participated in this project.

License

QAMyData is licensed under the MIT license. Copyright (c) 2019 University of Essex (except where otherwise noted).

See the file LICENCE for the full license.

Acknowledgments

  • WizardMac / Evan Miller for the amazing ReadStat C library

qamd's People

Contributors

dependabot[bot], lyrain

qamd's Issues

HTML Output Default

Make HTML the default output format. The output format option will still accept both HTML and JSON.
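One way this request could look in code is sketched below. This is only an illustration under stated assumptions: the `OutputFormat` enum and the `parse_format` helper are hypothetical names and are not taken from the qamd source. The idea is that a missing option falls back to HTML, while an explicit value of either HTML or JSON is still accepted.

```rust
use std::str::FromStr;

/// Hypothetical output format type (not taken from the qamd source).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum OutputFormat {
    Html,
    Json,
}

impl Default for OutputFormat {
    /// HTML is the format used when the user does not pick one.
    fn default() -> Self {
        OutputFormat::Html
    }
}

impl FromStr for OutputFormat {
    type Err = String;

    /// Accept "html" or "json" in any letter case; reject everything else.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.to_ascii_lowercase().as_str() {
            "html" => Ok(OutputFormat::Html),
            "json" => Ok(OutputFormat::Json),
            other => Err(format!("unsupported output format: {}", other)),
        }
    }
}

/// Hypothetical helper: resolve an optional command-line value to a format.
fn parse_format(arg: Option<&str>) -> Result<OutputFormat, String> {
    match arg {
        Some(value) => value.parse(),
        None => Ok(OutputFormat::default()),
    }
}
```

With this shape, `parse_format(None)` resolves to HTML, `parse_format(Some("json"))` resolves to JSON, and any other value is reported as an error rather than silently ignored.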

Cannot compile qamd

ns@bach ~/q/qamd> cargo build --release
Compiling qamd v1.0.0 (/home/ns/qamd/qamd)
error[E0502]: cannot borrow *context as mutable because context.variables is also borrowed as immutable
--> src/check/post.rs:188:16
|
174 | for variable in context.variables.iter()
| ----------------- immutable borrow occurs here
...
188 | dictionary(context, ValueLabelSpellcheck, &words, spellcheck_predicate);
| ^^^^^^^ mutable borrow occurs here
189 | }
| - immutable borrow ends here

error: aborting due to previous error
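For readers unfamiliar with E0502, the following is a minimal sketch of the same kind of conflict and one common way around it. It is not the code from src/check/post.rs: the `Context`, `Variable`, `record` and `check_labels` names are hypothetical stand-ins. The pattern is to finish the immutable pass over the variables first, collecting owned results, and only then call the function that needs the mutable borrow.

```rust
/// Hypothetical stand-ins for the types involved (not the qamd definitions).
struct Variable {
    label: String,
}

struct Context {
    variables: Vec<Variable>,
    warnings: Vec<String>,
}

/// Needs mutable access to the whole context, like the call in the error above.
fn record(context: &mut Context, message: String) {
    context.warnings.push(message);
}

// The shape below is rejected with E0502, because `context.variables` is
// still borrowed immutably while `context` is borrowed mutably:
//
//     for variable in context.variables.iter() {
//         record(context, variable.label.clone());
//     }

/// One possible fix: split the work into a read phase and a write phase.
fn check_labels(context: &mut Context, known: &[&str]) {
    // Read phase: the immutable borrow ends once the owned Vec is collected.
    let unknown: Vec<String> = context
        .variables
        .iter()
        .filter(|v| !known.contains(&v.label.as_str()))
        .map(|v| v.label.clone())
        .collect();

    // Write phase: no borrow of `context.variables` is alive any more.
    for label in unknown {
        record(context, format!("unknown label: {}", label));
    }
}
```

Cloning the labels costs some extra allocation, but it keeps the two borrows from overlapping.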

Sampling of cases prior to checking

Checks such as Regex can take some time to execute on larger files, especially on older or more resource-restricted machines. To mitigate this, a subset of the data could be checked by 'slicing' the file prior to running the checks. The user would be required to configure the size of the sample as a percentage.
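A minimal sketch of how a percentage-based sample might be selected is shown below. It assumes the number of rows is known up front and picks evenly spaced rows. Both helper names are hypothetical, and the issue itself does not say whether the sample should be the first rows, evenly spaced rows, or a random draw.

```rust
/// Hypothetical helper: how many rows to check for a given percentage.
/// Percentages above 100 are treated as 100.
fn sample_size(total_rows: usize, percent: u8) -> usize {
    let percent = percent.min(100) as usize;
    // Round up so a non-empty file with a non-zero percentage
    // always yields at least one row.
    (total_rows * percent + 99) / 100
}

/// Hypothetical helper: evenly spaced row indices covering `percent` of the rows.
fn sample_indices(total_rows: usize, percent: u8) -> Vec<usize> {
    let n = sample_size(total_rows, percent);
    // When n is 0 the range is empty, so the division below never runs.
    (0..n).map(|i| i * total_rows / n).collect()
}
```

For a 10-row file and a 50 percent sample, this selects rows 0, 2, 4, 6 and 8. Evenly spaced selection is deterministic, which makes repeated runs comparable. A random sample would need a random number source, which is outside the standard library.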

CSV support

Hello! I believe I first heard about QAMyData from @janetm in April and I just took another look after seeing QAMyData and Dataverse mentioned in the same slide at https://twitter.com/alina_danciu_/status/1177496046181634048

As a developer, I came to this GitHub repo first and saw from the description that QAMyData has support for three proprietary formats (SPSS, STATA and SAS):

[Screenshot: GitHub repository description]

I was a little sad that CSV isn't supported (I also checked the README and the wiki) but from looking at https://www.ukdataservice.ac.uk/about-us/our-rd/qamydata.aspx I see that CSV is supported! 🎉 🎉

[Screenshot: QAMyData page on the UK Data Service website]

I think the fix here in GitHub is pretty easy. I would suggest mentioning CSV (and all formats you support) in the description and the README.
