Comments (8)
@DimCitus Hmm. I wasn't sold on the "losing precision" argument, but I hadn't considered the NaN and Infinity cases. I will review PR #255 soon.
from wal2json.
Are you referring to all numeric data types?
I must admit that in the context of replication, a way to bypass number parsing entirely, with the guarantee that the number representation is the one Postgres expects, would be a tremendous feature. So yes, all numeric data types actually... so that I don't even have to think about it...
It does make sense. I wouldn't imagine enabling it only for numeric but not bigint, because you can have the same issue.
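For illustration, a minimal Python sketch of why bigint is affected too. It assumes a consumer that maps every JSON number to an IEEE 754 double (as JSON.parse does) and simulates that with `float()`; the variable names are made up for the example.

```python
# Largest value a Postgres bigint can hold (2**63 - 1).
BIGINT_MAX = 9223372036854775807

# What a parser that maps every JSON number to a double would end up holding.
as_double = float(BIGINT_MAX)

# Converting back shows the value was silently rounded to the nearest double.
round_tripped = int(as_double)

# Every integer up to 2**53 is exact in a double; 2**53 + 1 is the first that is not.
first_unsafe_collides = float(2**53 + 1) == float(2**53)

print(round_tripped, round_tripped == BIGINT_MAX, first_unsafe_collides)
```

So the problem is not specific to numeric: any bigint above 2**53 in magnitude can change value in such a consumer.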
The JSON spec says nothing about this. The exact number representation is the one provided by Postgres. You didn't show the data types here, but I bet that column "f_numeric" has a numeric data type. The data types real and double precision follow the IEEE 754 standard, so they provide a compact representation for such numbers.
postgres=# create table bar (a double precision);
CREATE TABLE
postgres=# insert into bar (a) values(1012345000999912345001230123445566.00000000);
INSERT 0 1
postgres=# select a from bar;
a
------------------------
1.0123450009999124e+33
(1 row)
postgres=# create table baz (a numeric(50,8));
CREATE TABLE
postgres=# insert into baz (a) values(1012345000999912345001230123445566.00000000);
INSERT 0 1
postgres=# select a from baz;
a
---------------------------------------------
1012345000999912345001230123445566.00000000
(1 row)
wal2json won't represent numbers as strings. That boat has sailed. Besides that, the JSON spec (ECMA-404) defines a number as a valid JSON value, so we should use it. What you are suggesting is that nobody will do math with numbers and that JSON libraries are not prepared for big numbers. For the former, you are wrong or you are using the wrong data type in your database. For the latter, it is a weak argument; if the issue is in the JSON library that you are using, you should complain to the JSON library developers and not wal2json or Postgres. The JSON format provides a strongly typed system that avoids issues with typing rules and unpredictable or erroneous results.
What you are suggesting is that nobody will do math with numbers
You can do any math with numbers, but you should do it after casting the number to some bignum/bigdec type, which is not a native type in most languages.
For the latter, it is a weak argument; if the issue is in the JSON library that you are using, you should complain to the JSON library developers
E.g. JavaScript (JSON.parse coerces every number to a double). E.g. Python, whose json module does the same for any number with a fraction or exponent: integers are kept as arbitrary-precision int, but everything else becomes a float unless the caller explicitly passes parse_float=decimal.Decimal. Surely, you could complain about the whole surrounding world. You could raise an issue with node, you could raise an issue with V8, you could raise an issue with the Python devs, but complaining about everything isn't a practical way to deal with issues like that.
Even in strongly typed languages, most libraries will coerce the number to a double if they don't have type information telling them that they shouldn't. That is simply because parsing everything as a bignum would mean a huge performance hit on the 99.9999999% of JSON that doesn't contain such bignums. This means that such JSON will not survive being parsed into some generic JsonNode. If your solution does not work with most of the ecosystem, this is not an 'issue in the JSON library'. And this is why the RFC has such a paragraph.
The JSON format provides a strongly typed system that avoids issues with typing rules and unpredictable or erroneous results.
This is why the issue exists: that 'strongly typed system' is unable to differentiate IEEE numbers from big numbers, and many JSON libraries will misinterpret the latter as the former.
You can make it opt-in via a slot option or via the next format version, but the best practice is to explicitly emit big numbers as strings, because there is a lot of real-world software that will interpret any number as an IEEE number, leading to silent precision loss and data corruption.
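To make the precision loss concrete, here is a small Python sketch using the numeric(50,8) value from the psql session above. The JSON payload is a hand-written stand-in, not captured wal2json output.

```python
import json
from decimal import Decimal

# Hand-written stand-in for a change record carrying the numeric(50,8) value.
payload = '{"a": 1012345000999912345001230123445566.00000000}'
exact = Decimal("1012345000999912345001230123445566.00000000")

# Default parsing: a number with a fraction or exponent becomes a double.
as_float = json.loads(payload)["a"]

# Opt-in exact parsing: the raw digits are handed to Decimal instead.
as_decimal = json.loads(payload, parse_float=Decimal)["a"]

# The double no longer holds the value that Postgres stored.
lossy = Decimal(as_float) != exact

print(repr(as_float), as_decimal, lossy)
```

The exact path only works because the caller knew to ask for it; a generic consumer using the defaults gets the lossy value without any error.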
Hi all,
Please also consider NaN and Infinity values (positive, negative), which are converted to NULL by wal2json at the moment. See also dimitri/pgcopydb#127, where we're having trouble with the current number processing done in wal2json. I still need to dive into the situation more; my first reaction is that an option in wal2json to send numbers as their Postgres string representation would be good.
What do you think, @eulerto?
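A short Python sketch of the NaN/Infinity case. The string-valued payload is hypothetical (wal2json does not emit it today, per the comment above); it only shows that the Postgres spellings NaN, Infinity and -Infinity survive when carried as JSON strings, while strict JSON numbers cannot carry them at all.

```python
import json
from decimal import Decimal

# Strict JSON has no literal for NaN or Infinity, so a strict encoder refuses them.
try:
    json.dumps({"a": float("nan")}, allow_nan=False)
    strict_ok = True
except ValueError:
    strict_ok = False

# Hypothetical payload: the three special values sent as strings.
payload = '{"a": "NaN", "b": "Infinity", "c": "-Infinity"}'

# Decimal accepts the same spellings, so nothing has to collapse to null.
row = {name: Decimal(text) for name, text in json.loads(payload).items()}

print(strict_ok, row)
```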
Also, given the variety of JSON parsers available in the wild, I must admit I would feel comfortable with an option that always outputs numeric values as JSON strings, using the exact string representation that Postgres itself would use in pg_dump and pg_restore, so that my replication client using wal2json is known to be safe even against parsing bugs or incompleteness in the JSON lib I happen to have found with the right license and the right language...
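As a rough sketch of what a client could do with such an option: the message below is hypothetical (its shape is loosely modelled on a per-column list with name, type and value, and the string-valued numbers are exactly the option being requested here, not existing behavior). `CONVERTERS` and `decode_change` are illustrative names, not part of any library.

```python
import json
from decimal import Decimal

# Converters keyed by base Postgres type name; unlisted types pass through as-is.
CONVERTERS = {
    "smallint": int,
    "integer": int,
    "bigint": int,
    "numeric": Decimal,
    "real": float,
    "double precision": float,
}

def decode_change(message: str) -> dict:
    """Turn a change record into a dict, converting string-encoded numbers by type."""
    change = json.loads(message)
    row = {}
    for col in change["columns"]:
        # Drop a type modifier such as "(50,8)" before the lookup.
        base_type = col["type"].split("(", 1)[0].strip()
        convert = CONVERTERS.get(base_type)
        value = col["value"]
        # Only string values are converted; JSON null (SQL NULL) stays None.
        if convert is not None and isinstance(value, str):
            value = convert(value)
        row[col["name"]] = value
    return row

# Hypothetical message with numbers carried as strings.
message = json.dumps({
    "action": "I",
    "table": "baz",
    "columns": [
        {"name": "id", "type": "bigint", "value": "9223372036854775807"},
        {"name": "a", "type": "numeric(50,8)",
         "value": "1012345000999912345001230123445566.00000000"},
        {"name": "note", "type": "numeric", "value": None},
    ],
})

row = decode_change(message)
print(row)
```

The JSON parser never sees a number here, so its numeric handling cannot affect the result; the type name decides the conversion.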
Are you referring to all numeric data types?