Comments (4)

sinclairzx81 commented on May 26, 2024

@apancutt Hi, thanks for reporting :)

It's interesting this issue was submitted for issue 754 (as the precision loss here is a consequence of how IEEE 754 handles floating point arithmetic) :) I see you have submitted a fix for this, but I'm somewhat reluctant to take on the fix as it would produce results different from the IEEE 754 result (even if the result is wrong, it is aligned to the IEEE 754 specification).

The same is true in other languages. For example, the following gives the same results in Rust:

fn main() {
    // 32-bit
    println!("{:?}", 1.0_f32 % 0.1_f32); // 0.09999999
    
    // 64-bit
    println!("{:?}", 1.0_f64 % 0.1_f64); // 0.09999999999999995
}
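
For comparison, here is a minimal TypeScript sketch of the same arithmetic. It assumes only that JavaScript numbers are IEEE 754 binary64 values (the variable name is illustrative), so the remainder matches the 64-bit Rust result above.

```typescript
// JavaScript numbers are IEEE 754 binary64, so `%` behaves like Rust's f64 remainder.
const remainder: number = 1.0 % 0.1
console.log(remainder) // 0.09999999999999995

// 0.1 has no exact binary representation. The stored value is slightly larger
// than 0.1, so it fits into 1.0 only nine whole times, leaving almost 0.1 over.
console.log((0.1).toFixed(20)) // 0.10000000000000000555
```

The second line of output is why the remainder lands just under 0.1 rather than at 0.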

Specifications

As number is intended to represent a JSON number, it may be helpful to reference the JSON specification. I note that the spec is somewhat loose about how "the software" should interpret numeric values (as it can't assume the capabilities of the receiving software); however, it does suggest that it is reasonable to assume IEEE 754 (specifically IEEE 754-2008, 64-bit), and that software should operate within the precision permitted by that specification (if only for interoperability). I would assume this extends to operations performed on parsed JSON numerics.

https://datatracker.ietf.org/doc/html/rfc7159

Numbers

...

   This specification allows implementations to set limits on the range
   and precision of numbers accepted.  Since software that implements
   IEEE 754-2008 binary64 (double precision) numbers [IEEE754] is
   generally available and widely used, good interoperability can be
   achieved by implementations that expect no more precision or range
   than these provide, in the sense that implementations will
   approximate JSON numbers within the expected precision.  A JSON
   number such as 1E400 or 3.141592653589793238462643383279 may indicate
   potential interoperability problems, since it suggests that the
   software that created it expects receiving software to have greater
   capabilities for numeric magnitude and precision than is widely
   available.

   Note that when such software is used, numbers that are integers and
   are in the range [-(2**53)+1, (2**53)-1] are interoperable in the
   sense that implementations will agree exactly on their numeric
   values.
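
The two cases named in the RFC text can be observed directly. The following is a small sketch using only the built-in JSON.parse and Number APIs (the variable names are illustrative); it shows what a binary64-based implementation does with the example numbers and with integers beyond the stated range.

```typescript
// Values taken from the RFC 7159 text quoted above.
const tooLarge: number = JSON.parse('1E400')
const tooPrecise: number = JSON.parse('3.141592653589793238462643383279')

console.log(tooLarge)   // Infinity (magnitude exceeds binary64 range)
console.log(tooPrecise) // 3.141592653589793 (extra digits are dropped)

// Integers in [-(2**53)+1, (2**53)-1] are represented exactly.
const limit: number = 2 ** 53
console.log(Number.MAX_SAFE_INTEGER === limit - 1) // true
console.log(limit + 1 === limit)                   // true (beyond the range, neighbours collapse)
```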

I think the takeaway here is that TypeBox shouldn't implement additional "work" to fix numerical imprecision inherent in IEEE 754, as doing so would be working above and beyond any specification, and would require other software (like Rust) to implement the same fixes to be equivalent (which would be an interoperability concern).


Decimal Data Type

I think the correct way to fix this would be to introduce an actual Decimal type (with arithmetic applied to this type giving the correct result). For example, here is a C# implementation comparing various numeric types.

{ // IEEE-754 (32-bit)
  float a = 1.0f;
  float m = 0.1f;

  Console.WriteLine("float {0}", a % m); // 0.09999999
}

{ // IEEE-754 (64-bit)
  double a = 1.0d;
  double m = 0.1d;

  Console.WriteLine("double {0}", a % m); // 0.09999999999999995
}
{ // Decimal (base-10 representation)
  decimal a = 1.0m;
  decimal m = 0.1m;

  Console.WriteLine("decimal {0}", a % m); // 0.0
}

Such a type would need to be a non-standard JavaScript type, and be principally aligned to the implementation used by C# (and compared with other languages that support decimal numeric representation). However, the work to implement such a type would almost certainly be outside the scope of TypeBox (but could be expressed with a custom type).


So, I don't think I'll be able to go ahead with this PR based on the above. But will leave the issue open for a day or so in case you want to discuss prior art (would be interested to see alternative implementations of this)

Cheers!
S

from typebox.

apancutt commented on May 26, 2024

It's interesting this issue was submitted for issue 754 (as the precision loss here is a consequence of how IEEE 754 handles floating point arithmetic)

I've been waiting so patiently to raise this issue!

The spec doesn't specify that the modulus operator must be used, nor that the validation algorithm be IEEE 754 compliant. It only states a numeric instance is valid only if division by this keyword's value results in an integer.

TypeBox currently fails here, and I don't believe alignment with IEEE 754 is a valid justification for this.
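
Read literally, the wording quoted above describes a division check rather than a modulus check. A minimal sketch of that reading follows; the function name isMultipleByDivision is hypothetical and this is not TypeBox's code or the code in the PR. It passes the 1.0 / 0.1 case, but plain binary64 division still misclassifies other values, so this reading also needs some handling of precision.

```typescript
// Hypothetical helper: valid only if value / divisor is an integer.
function isMultipleByDivision(value: number, divisor: number): boolean {
  return Number.isInteger(value / divisor)
}

console.log(isMultipleByDivision(1.0, 0.1)) // true  (1.0 / 0.1 rounds to exactly 10)
console.log(isMultipleByDivision(0.3, 0.1)) // false (0.3 / 0.1 is 2.9999999999999996)
```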

Regardless, it turns out my PR doesn't work in all cases (e.g. value = 1.4, multipleOf = 0.1), which coincidentally fails in Ajv too, so I suspect they're using a similar algorithm. jsonschema (from an author of the JSON Schema spec) has solved it using this formula that shifts decimals to integers prior to performing the modulus comparison.

Would you be open to a similar implementation? EDIT: I've updated the PR, just in case.

Alternatively, is there a pluggable way to override the built-in validation functions so that we can implement a local workaround?

Thanks

sinclairzx81 commented on May 26, 2024

@apancutt Hi, thanks for the follow up.


TypeBox currently fails here, and I don't believe alignment with IEEE 754 is a valid justification for this.

Yeah, I mention IEEE-754 as this is the cause of the precision issue. My reluctance to update logic here mostly comes from TypeBox having to diverge from the result given using standard modulus arithmetic under JavaScript (which uses IEEE-754), and that multipleOf should ideally evaluate to whatever value % mod === 0 results in (even if the result is surprising, it's still consistent to the result given by JS arithmetic).

But I agree, it's not entirely ideal.


Regardless, it turns out my PR doesn't work in all cases (e.g. value = 1.4, multipleOf = 0.1), which coincidentally fails in Ajv too (ajv-validator/ajv#652), so I suspect they're using a similar algorithm. jsonschema (from an author of the JSON Schema spec) has solved it using this formula that shifts decimals to integers prior to performing the modulus comparison.

Yeah, it's tricky. I believe Ajv has a configurable epsilon precision value (which isn't ideal tbh). The jsonschema implementation looks better tho (it might be worth extracting this logic into a simple multipleOf(x, y): boolean function). Something that can be tested away from library specifics.
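
As a rough illustration of what such a standalone function could look like, here is a sketch of the "shift decimals to integers" idea. It is written from the description in this thread, not copied from jsonschema, and the helper name decimalPlaces is an assumption. It is also not arbitrary precision: if value * scale exceeds the safe integer range, the result is no longer reliable.

```typescript
// Hypothetical helper: number of decimal places in the shortest string form of n.
// Handles exponent notation such as "1e-7" (7 places) and "1.5e-7" (8 places).
function decimalPlaces(n: number): number {
  const match = /^-?\d+(?:\.(\d+))?(?:e([+-]?\d+))?$/i.exec(String(n))
  if (!match) return 0
  const fraction = match[1] ? match[1].length : 0
  const exponent = match[2] ? Number(match[2]) : 0
  return Math.max(0, fraction - exponent)
}

// Scale both operands by a power of ten so they become integers, then use
// the integer remainder, which is exact within the safe integer range.
function multipleOf(value: number, divisor: number): boolean {
  if (!Number.isFinite(value) || !Number.isFinite(divisor) || divisor === 0) return false
  const scale = Math.pow(10, Math.max(decimalPlaces(value), decimalPlaces(divisor)))
  return Math.round(value * scale) % Math.round(divisor * scale) === 0
}

console.log(multipleOf(1.0, 0.1))  // true
console.log(multipleOf(1.4, 0.1))  // true
console.log(multipleOf(1.45, 0.1)) // false
```

Math.round absorbs the small error introduced by the scaling multiplication (for example, 1.4 * 10 is not exactly 14 in binary64), which is what lets the 1.4 / 0.1 case pass.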


Alternatively, is there a pluggable way to override the built-in validation functions so that we can implement a local workaround?

Yes, you can implement a custom type for this. Here's a quick example that implements a custom Decimal type, and uses the decimal.js package to perform the modulus check.

import { Type, TypeRegistry, Kind, TSchema, NumberOptions, ValueGuard } from '@sinclair/typebox'
import { Value } from '@sinclair/typebox/value'
import { Decimal as _Decimal } from 'decimal.js'

// -----------------------------------------------------------------
// Type: Decimal
// -----------------------------------------------------------------
export interface TDecimal extends TSchema, NumberOptions {
  [Kind]: 'Decimal'
  type: 'number',
  static: number
}
export function Decimal(options: NumberOptions = {}): TDecimal {
  return { ...options, [Kind]: 'Decimal', type: 'number' } as never
}
TypeRegistry.Set<TDecimal>('Decimal', (schema, value) => {
  return (
    (ValueGuard.IsNumber(value)) &&
    (ValueGuard.IsNumber(schema.multipleOf) ? _Decimal.mod(value, schema.multipleOf).eq(new _Decimal(0)) : true) &&
    (ValueGuard.IsNumber(schema.exclusiveMaximum) ? value < schema.exclusiveMaximum : true) &&
    (ValueGuard.IsNumber(schema.exclusiveMinimum) ? value > schema.exclusiveMinimum : true) &&
    (ValueGuard.IsNumber(schema.maximum) ? value <= schema.maximum : true) &&
    (ValueGuard.IsNumber(schema.minimum) ? value >= schema.minimum : true)
  )
})

// -----------------------------------------------------------------
// Usage
// -----------------------------------------------------------------

const T = Type.Object({
  value: Decimal({ multipleOf: 0.1 })
})

const R = Value.Check(T, { value: 1.0 })

console.log(R)

The above would be compatible with the Value, TypeCompiler and Error modules. If you don't want the additional dependency on decimal.js, you can try implementing a multipleOf function that performs a similar check to the jsonschema package.

Does this help?


Um, let me give this a bit more consideration. If you can produce a concise and reliable multipleOf function (supporting arbitrary precision), I can possibly look at including this in a subsequent revision. It would need to be written as a standalone function (as the function would be used across multiple sub modules), and there would be some integration required (as the function would need to be emitted on the TypeCompiler)

It might be good to move this to a discussion thread in the interim tho.
Thoughts?

apancutt commented on May 26, 2024

Thanks for the custom type snippet - that solves our immediate problem.

Moved to discussion: #757
