Comments (8)

eulerto commented on June 3, 2024

@DimCitus Hmm. I wasn't sold on the "losing precision" argument, but I hadn't considered the NaN and Infinity cases. I will review PR #255 soon.

from wal2json.

DimCitus commented on June 3, 2024

> Are you referring to all numeric data types?

I must admit that in the context of replication, a way to bypass parsing numbers entirely and instead have the guarantee that the number representation is the one that Postgres expects would be a tremendous feature. So yes, all numeric data types actually... so that I don't even have to think about it...

eulerto commented on June 3, 2024

It does make sense. I can't imagine enabling it only for numeric and not for bigint, because you can have the same issue there.

eulerto commented on June 3, 2024

The JSON spec says nothing about this. The exact number representation is the one provided by Postgres. You didn't show the data types here, but I bet that column "f_numeric" has a numeric data type. The data types real and double precision follow the IEEE 754 standard, so they provide a compact representation for such numbers.

postgres=# create table bar (a double precision);
CREATE TABLE
postgres=# insert into bar (a) values(1012345000999912345001230123445566.00000000);
INSERT 0 1
postgres=# select a from bar;
           a            
------------------------
 1.0123450009999124e+33
(1 row)

postgres=# create table baz (a numeric(50,8));
CREATE TABLE
postgres=# insert into baz (a) values(1012345000999912345001230123445566.00000000);
INSERT 0 1
postgres=# select a from baz;
                      a                      
---------------------------------------------
 1012345000999912345001230123445566.00000000
(1 row)

wal2json won't represent numbers as strings. That ship has sailed. Besides, the JSON spec (ECMA-404) defines number as a valid JSON value, so we should use it. What you are suggesting is that nobody will do math with numbers and that JSON libraries are not prepared for big numbers. For the former, you are wrong or you are using the wrong data type in your database. For the latter, it is a weak argument; if the issue is in the JSON library that you are using, you should complain to the JSON library developers and not to wal2json or Postgres. The JSON format provides a strongly typed system that avoids issues with typing rules and unpredictable or erroneous results.

makkarpov commented on June 3, 2024

> What you are suggesting is that nobody will do math with numbers

You can do any math with numbers, but you should do it after casting the number to some bignum/bigdec type, which is not native to most languages.

> For the latter, it is a weak argument; if the issue is in the JSON library that you are using, you should complain to the JSON library developers

Take JavaScript: JSON.parse coerces every number to a double. Python's json module does the same by default for any number with a fraction or exponent, even though the language has a native bignum type (but not a native bigdec). Sure, you could complain to the whole surrounding world: you could raise an issue with Node, with V8, with the Python devs. But complaining to everyone isn't a practical way to deal with issues like this.
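
To make the coercion concrete, here is a minimal Python sketch using only the standard library. It reuses the literal from the psql session earlier in the thread; the key name "a" and the payload shape are assumptions for illustration, not actual wal2json output.

```python
import json
from decimal import Decimal

# A numeric(50,8) value written as a bare JSON number (hypothetical payload).
payload = '{"a": 1012345000999912345001230123445566.00000000}'
exact = Decimal("1012345000999912345001230123445566.00000000")

# Default parsing: a number with a fraction or exponent becomes a Python
# float, i.e. an IEEE 754 double, so the low-order digits are silently lost.
as_float = json.loads(payload)["a"]
print(type(as_float).__name__, Decimal(as_float) == exact)   # float False

# Opt-in exact parsing: hand the raw digits to Decimal instead of float.
as_decimal = json.loads(payload, parse_float=Decimal)["a"]
print(type(as_decimal).__name__, as_decimal == exact)        # Decimal True

# Integer literals are the exception in Python: json.loads keeps them as
# arbitrary-precision int, so a bigint beyond 2**53 survives here.
big_int = json.loads("9007199254740993")
print(big_int)                                               # 9007199254740993
```

The exact parse only happens because the caller knew to pass parse_float; a generic consumer that calls json.loads with the defaults gets the lossy float without any warning.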

Even in strongly typed languages, most libraries will coerce a number to double if they have no type information telling them they shouldn't. That is because parsing everything as bignum would mean a huge performance hit on the 99.9999999% of JSON that doesn't contain such bignums. This means that such JSON will not survive being parsed into a generic JsonNode. If your solution does not work with most of the ecosystem, this is not an 'issue in the JSON library'. And this is why the RFC has such a paragraph.

> The JSON format provides a strongly typed system that avoids issues with typing rules and unpredictable or erroneous results.

This is why the issue exists: that 'strongly typed system' is unable to differentiate IEEE numbers from big numbers, and many JSON libraries will misinterpret the latter as the former.

You can make it opt-in via a slot option or via the next format version, but best practice is to explicitly emit big numbers as strings, because a lot of real-world software will misinterpret any number as an IEEE number, leading to silent precision loss and data corruption.

DimCitus commented on June 3, 2024

Hi all,

Please also consider NaN and Infinity values (positive and negative), which wal2json currently converts to NULL. See also dimitri/pgcopydb#127, where we're having trouble with the current number processing done in wal2json. I still need to dive into the situation more, but my first reaction is that an option in wal2json to send numbers as their Postgres string representation would be good.
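
As a rough illustration of why these values are awkward, the Python sketch below shows that strict JSON has no number token for NaN or Infinity, while a string carrying the text form round-trips cleanly. The payload shape and the keys "a" to "d" are hypothetical, and it assumes the producer would send the spellings NaN, Infinity and -Infinity.

```python
import json
from decimal import Decimal

# Strict JSON cannot express these values as numbers; with allow_nan=False
# Python's encoder refuses to serialize them instead of emitting a
# non-standard token.
try:
    json.dumps({"a": float("nan")}, allow_nan=False)
    strict_ok = True
except ValueError:
    strict_ok = False
print(strict_ok)                      # False

# Assumption: the producer sends the text form as a JSON string instead.
payload = '{"a": "NaN", "b": "Infinity", "c": "-Infinity", "d": "12.50"}'
row = {key: Decimal(value) for key, value in json.loads(payload).items()}

print(row["a"].is_nan())              # True
print(row["b"], row["c"])             # Infinity -Infinity
print(row["d"])                       # 12.50
```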

What do you think, @eulerto?

DimCitus commented on June 3, 2024

Also, given the variety of JSON parsers available in the wild, I must admit I would feel comfortable with an option that always outputs numeric values as JSON strings, using the exact string representation that Postgres itself would use in pg_dump and pg_restore. That way, my replication client using wal2json is known to be safe even against parsing bugs or incompleteness in whatever JSON lib I happen to have found with the right license and the right language...
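
A sketch of what such a client could look like, in Python. Everything about the message layout here is an assumption made for illustration (it is not actual wal2json output), and quote_ident and to_insert are hypothetical helpers. The point is only that the values are never parsed as numbers on the client; they travel as text from the JSON string to the SQL parameter, and the target server does the conversion.

```python
import json

# Hypothetical change record: every numeric value arrives as a JSON string
# holding its text representation. This is NOT actual wal2json output.
message = """
{"table": "baz",
 "columns": [
   {"name": "a", "value": "1012345000999912345001230123445566.00000000"},
   {"name": "b", "value": "NaN"}
 ]}
"""

def quote_ident(name):
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'

def to_insert(change):
    """Build a parameterized INSERT; values stay untouched strings."""
    cols = change["columns"]
    names = ", ".join(quote_ident(c["name"]) for c in cols)
    # DB-API "format" style placeholders; an assumption about the driver.
    marks = ", ".join(["%s"] * len(cols))
    sql = f"INSERT INTO {quote_ident(change['table'])} ({names}) VALUES ({marks})"
    return sql, [c["value"] for c in cols]

sql, params = to_insert(json.loads(message))
print(sql)      # INSERT INTO "baz" ("a", "b") VALUES (%s, %s)
print(params)
```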

eulerto commented on June 3, 2024

Are you referring to all numeric data types?
