Missing values (readstat issue, closed)

wizardmac commented on June 18, 2024
Missing values

from readstat.

Comments (8)

evanmiller commented on June 18, 2024

Missing strings are returned as the zero-length string, but the treatment of missing numeric values is currently inconsistent. The DTA parser returns NULL, while the others return a NaN. The SAS parser returns the data representation found in the file (a NaN, but it might be a tagged NaN?), while the remaining parsers return a system NaN.
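
To make the "system NaN" versus "tagged NaN" distinction concrete, here is a minimal sketch in plain C. It only illustrates that an IEEE 754 double NaN can carry extra payload bits; the helper names `make_tagged_nan` and `nan_tag` are hypothetical, and the bit layout is an assumption chosen for illustration, not the encoding SAS or any other file format actually uses.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: build a quiet NaN whose low mantissa byte holds a tag.
 * The layout is illustrative only, not a real file-format encoding. */
static double make_tagged_nan(uint8_t tag) {
    uint64_t bits = UINT64_C(0x7FF8000000000000) | (uint64_t)tag;
    double d;
    memcpy(&d, &bits, sizeof d);   /* copy the bit pattern without aliasing issues */
    return d;
}

/* Hypothetical helper: read the tag back. Returns 0 for ordinary numbers
 * and for NaNs whose low byte is zero. */
static uint8_t nan_tag(double d) {
    uint64_t bits;
    if (!isnan(d)) {
        return 0;
    }
    memcpy(&bits, &d, sizeof bits);
    return (uint8_t)(bits & 0xFF);
}
```

Both kinds of value satisfy `isnan()`, so a client that only checks `isnan()` cannot tell them apart; it has to inspect the bits, which is part of why the inconsistency matters.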

hadley commented on June 18, 2024

What about integers? It also seems a bit dangerous to encode missing strings as zero-length strings.

evanmiller commented on June 18, 2024

Only the DTA parser returns integers, which is why it uses the NULL convention rather than NaNs.

I think using NULL in all cases might make sense. READSTAT_TYPE_MISSING loses information about the underlying type, which puts more of a burden on the client to keep track of the column types.

hadley commented on June 18, 2024

Using NULL seems reasonable to me. Another option would be to provide readstat_value_is_missing() or similar so the representation could be changed in the future.

evanmiller commented on June 18, 2024

In the meantime we could do both. A new readstat_value_t type could be a possibly NULL pointer, which we could change to a struct later.
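
A rough sketch of what "doing both" could look like: an opaque value type that is currently a possibly-NULL pointer, plus an accessor so callers never test for NULL directly. All names here (`sketch_value_t`, `sketch_value_is_missing`, `sketch_double_value`) are hypothetical stand-ins and not the library's actual API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical value type: for now just a possibly-NULL pointer to the data.
 * Because callers go through the accessors below, it could later become a
 * struct without changing client code. */
typedef const void *sketch_value_t;

/* Hypothetical accessor: the only place that knows missing == NULL. */
static bool sketch_value_is_missing(sketch_value_t value) {
    return value == NULL;
}

/* Hypothetical accessor: read a double, or return a fallback when missing. */
static double sketch_double_value(sketch_value_t value, double fallback) {
    if (sketch_value_is_missing(value)) {
        return fallback;
    }
    return *(const double *)value;
}
```

The design point is that the NULL check lives in one function, so the representation of "missing" can change later without touching every client.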

evanmiller commented on June 18, 2024

Ok, the callbacks now all receive NULL for missing numeric values:
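
For illustration, a client-side handler written against this convention might look like the sketch below. The signature of `handle_numeric_value` and the `column_stats_t` struct are assumptions made for the example; the real callback signature in the library may differ.

```c
#include <stddef.h>

/* Hypothetical per-column accumulator kept by the client. */
typedef struct {
    size_t seen;     /* total values delivered to the handler */
    size_t missing;  /* values delivered as NULL */
    double sum;      /* sum of the non-missing values */
} column_stats_t;

/* Hypothetical handler: a NULL pointer means "missing numeric value".
 * Returns 0 to signal that parsing should continue. */
static int handle_numeric_value(const double *value, void *ctx) {
    column_stats_t *stats = (column_stats_t *)ctx;
    stats->seen++;
    if (value == NULL) {
        stats->missing++;
        return 0;
    }
    stats->sum += *value;
    return 0;
}
```

With a single NULL convention, the same handler works regardless of which file format produced the value.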

737b816

Strings are trickier, since I believe none of these file formats distinguish between zero-length strings and missing strings. (RData might be an exception.)

hadley commented on June 18, 2024

Oh interesting. In that case, leaving as empty strings seems reasonable to me.
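
If strings stay as empty strings, the decision to treat them as missing falls to the client. A minimal sketch, assuming a hypothetical client-side helper named `string_is_missing`:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical client-side policy: treat both NULL and "" as missing,
 * since the file formats discussed here do not tell the two apart. */
static bool string_is_missing(const char *s) {
    return s == NULL || s[0] == '\0';
}
```

A client that needs to keep genuinely empty strings would have to use a different policy, because that information is not available from the file.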

evanmiller commented on June 18, 2024

Closing
