Comments (39)

stephentyrone avatar stephentyrone commented on June 10, 2024 4

But why not also a BinaryInteger of non fixed width

It's just a separate issue.

from swift-numerics.

Lukasa avatar Lukasa commented on June 10, 2024 1

Sadly your simplification makes the program incorrect. Specifically, you cannot safely perform an unchecked addition of the carry bit in case that addition itself overflows. You can see this by running your code with these extensions:

var s1 = UInt128(high: 0, low: 1)
let s2 = UInt128(high: .max, low: .max)
s1.add(s2)
print("high: \(s1.high), low: \(s1.low)")

This will print `high: 0, low: 0`, which indicates that the addition overflowed. The program should have crashed at that point, but it did not. The Rust program correctly crashes in this case.
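For illustration, here is a minimal sketch of a checked 128-bit addition in which the carry addition is itself checked. The type name `Wide128` and its methods are hypothetical and are not the code from the linked example; only the standard library's `addingReportingOverflow(_:)` is assumed.

```swift
// Hypothetical two-word unsigned integer, used only for this sketch.
struct Wide128 {
    var high: UInt64
    var low: UInt64

    /// Returns the wrapped sum and a flag saying whether the 128-bit add overflowed.
    func addingReportingOverflow(_ other: Wide128) -> (partialValue: Wide128, overflow: Bool) {
        // Add the low words; `carry` is the carry out of the low half.
        let (low, carry) = self.low.addingReportingOverflow(other.low)
        // Add the high words, then add the carry, checking BOTH steps.
        let (high1, overflow1) = self.high.addingReportingOverflow(other.high)
        let (high2, overflow2) = high1.addingReportingOverflow(carry ? 1 : 0)
        return (Wide128(high: high2, low: low), overflow1 || overflow2)
    }

    /// Checked in-place addition: traps on overflow instead of wrapping.
    mutating func add(_ other: Wide128) {
        let (sum, overflow) = addingReportingOverflow(other)
        precondition(!overflow, "Wide128 addition overflowed")
        self = sum
    }
}
```

With this version, adding `Wide128(high: .max, low: .max)` to `Wide128(high: 0, low: 1)` reports overflow from the carry step, so `add` traps rather than silently producing zero. Whether the optimizer turns this into an add/adc pair is a separate question.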

from swift-numerics.

stephentyrone avatar stephentyrone commented on June 10, 2024 1

What are the odds that a generically written UInt128 will be optimised into use of the specialised instructions available on some supported platforms?

If for some reason this weren’t possible, we simply wouldn’t use a generic implementation. Part of the API contract for such a type should be that it is essentially optimal.

Good news, though: it is possible =)

from swift-numerics.

Lukasa avatar Lukasa commented on June 10, 2024 1

I want to stress here: I am satisfied that my desired outcome is possible. I really mean it when I say I believe you both! What I am trying to say here is that I do not see how to do it (a personal limitation, not something that limits either of you), and I haven’t seen it done yet, so I am currently operating entirely on faith, not evidence. 👍

from swift-numerics.

davezarzycki avatar davezarzycki commented on June 10, 2024 1

We're drifting way, way off topic. I looked into this. I think LLVM bugs 36243 and 39464 are the same bug, and the same bug as this one. In short, LLVM doesn't model status flags during generic (not target-specific) abstract optimization passes. This design tradeoff has pros and cons. The "pro" is that many programming languages (and some processors) don't expose status flags, so the compiler has to recognize idioms that are effectively checks of status flags. But the con is that LLVM easily "forgets" that a status flag might hold the answer it wants and then creates redundant "TEST" or "CMP" instructions.

from swift-numerics.

davezarzycki avatar davezarzycki commented on June 10, 2024 1

https://reviews.llvm.org/D70079

from swift-numerics.

stephentyrone avatar stephentyrone commented on June 10, 2024 1

@dabrahams There are a bunch of reasons not to use the DoubleWidth pattern. Most are minor, but two of them are significant:

  • Sizes that are not powers of two start to become interesting as you go past 128b. There are a bunch of real applications for e.g. 192 and 384b operations--rounding these up to 256 and 512b makes multiplication (the dominant operation in most bignum arithmetic usage) half as efficient as it could be.

  • For sizes past 256b, using the DoubleWidth pattern starts to put a lot more strain on the optimizer to re-organize code in multiplication to pull out optimal sequences. What you generally want to end up with is a loop over e.g. 1x8 or 2x8 tiles of partial products, doing mul-add or mul-mul-add-add chains with carry. Those natural tiles are spread across multiple leaves of the number representation in the DoubleWidth formulation--you can work around this by simply taking a pointer to self and reinterpreting as a buffer of words or similar, but then you're effectively not using the DoubleWidth structure.

My current inclination is to represent fixed-width storage in the type system using something double-width-ish, but represent the actual operations using the normal linear buffer algorithms. In small experiments I've run this seems to strike a good balance of simplicity and performance.

(The other advantage of using linear-buffer algorithms is that it makes it simpler to substitute in optimized bignum libcalls on platforms that have them, because that's closer to the representation that existing assembly and C libraries expect to traffic in.)
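As a rough illustration of what "normal linear buffer algorithms" means here, the following is a sketch of schoolbook multiplication over little-endian word arrays, where each inner step is a multiply-add with carry. The function name `multiplyWords` is hypothetical, and this uses plain `[UInt64]` rather than any fixed-width storage type; it is not taken from swift-numerics.

```swift
/// Schoolbook multiplication of two little-endian word buffers.
/// The result has `a.count + b.count` words.
func multiplyWords(_ a: [UInt64], _ b: [UInt64]) -> [UInt64] {
    var result = [UInt64](repeating: 0, count: a.count + b.count)
    for i in a.indices {
        var carry: UInt64 = 0
        for j in b.indices {
            // Full 128-bit partial product of one word pair.
            let product = a[i].multipliedFullWidth(by: b[j])
            // Accumulate the existing result word and the incoming carry.
            let (sum1, carry1) = product.low.addingReportingOverflow(result[i + j])
            let (sum2, carry2) = sum1.addingReportingOverflow(carry)
            result[i + j] = sum2
            // a*b + r + c is at most 2^128 - 1, so the high word cannot overflow.
            carry = product.high + (carry1 ? 1 : 0) + (carry2 ? 1 : 0)
        }
        // This word has not been written by earlier rows.
        result[i + b.count] = carry
    }
    return result
}
```

Because the loop bounds come from the buffer lengths, the same code handles 3-word (192b) and 6-word (384b) operands without rounding up to a power of two.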

from swift-numerics.

davezarzycki avatar davezarzycki commented on June 10, 2024 1

I don't think this is unique to Swift either. While C++ can easily solve the first bullet via non-type template parameters, it still struggles with the second. In particular, the x86 backend optimizer needs some serious work in order to generate ideal code (via MULX, ADCX and ADOX). I'm not sure about Arm, but I'd imagine that their backend struggles too.

I think this is just one of those [common] cases where generic code can race far ahead of what the optimizer is capable of handling.

from swift-numerics.

Sajjon avatar Sajjon commented on June 10, 2024

I cannot emphasize enough how great it would be to have Int256, UInt256, Int512, UInt512, etc.! But why not also a BinaryInteger of non fixed width? There is already a prototype by Apple here. So far I’ve been relying on the great attaswift/BigInt by @lorentey; it would be nice to have BigUInt as well!

from swift-numerics.

Lukasa avatar Lukasa commented on June 10, 2024

What are the odds that a generically written UInt128 will be optimised into use of the specialised instructions available on some supported platforms?

from swift-numerics.

davezarzycki avatar davezarzycki commented on June 10, 2024

What are the odds that a generically written UInt128 will be optimised into use of the specialised instructions available on some supported platforms?

Do you have an example? I'm not aware of any processor with hardware support for 128-bit integers (that being said, newer PPC processors have 128-bit floating point). Or do you merely mean "use the carry/borrow flag/bit" available in most processors?

from swift-numerics.

Lukasa avatar Lukasa commented on June 10, 2024

Or do you merely mean "use the carry/borrow flag/bit" available in most processors?

Sorry, I should have been clearer: yes, I mostly meant that, though in the event some weirder platforms have actually got 128-bit integer operations that would be nice too. As a concrete example, here is a naive-but-reasonable implementation of 128-bit unsigned integer checked addition (that might be used when writing this generically across FixedWidthInteger), and the associated x86_64 assembly is:

output.UInt128.add(output.UInt128) -> ():
        push    rbp
        mov     rbp, rsp
        add     qword ptr [r13], rdi
        mov     rax, qword ptr [r13 + 8]
        jae     .LBB1_2
        add     rax, 1
        mov     qword ptr [r13 + 8], rax
        jb      .LBB1_4
.LBB1_2:
        add     rax, rsi
        mov     qword ptr [r13 + 8], rax
        jb      .LBB1_3
        pop     rbp
        ret
.LBB1_3:
        ud2
.LBB1_4:
        ud2

Whereas a Rust version gives us:

example::add:
        push    rax
        mov     r8, rdx
        mov     rdx, rsi
        mov     rax, rdi
        add     rax, r8
        adc     rdx, rcx
        jb      .LBB0_1
        pop     rcx
        ret
.LBB0_1:
        lea     rdi, [rip + .L__unnamed_1]
        call    qword ptr [rip + core::panicking::panic@GOTPCREL]
        ud2

I'd like us to have the Rust version rather than the naive Swift version if we could.

from swift-numerics.

davezarzycki avatar davezarzycki commented on June 10, 2024

If your example is simplified, then the code gen is almost optimal:

https://swift.godbolt.org/z/ES6j7q

from swift-numerics.

Lukasa avatar Lukasa commented on June 10, 2024

Thanks for clarifying @stephentyrone, that's good to know, especially as this was a limitation with DoubleWidth back when that was floating around.

from swift-numerics.

davezarzycki avatar davezarzycki commented on June 10, 2024

@Lukasa – Whoops. You're right. I was too focused on getting the right ASM out. Nevertheless, both Rust and Swift use LLVM, and the standard library integers are "just" a wrapper around LLVM intrinsics. We just need an intrinsic or pattern of intrinsics that can reliably lower to an "ADC" on x86 (or similar instructions on other ISAs). Rust is either using such an intrinsic, or they're just using LLVM's i128 type. Swift can do the same. This isn't hard.

from swift-numerics.

Lukasa avatar Lukasa commented on June 10, 2024

Swift can do the same. This isn't hard.

I agree, unless the requirement is actually to use an intrinsic. If it is, writing the code generically gets pretty gnarly, as you require compile-time type checking to replace the generic slow path with the intrinsic fast path.

I also do not know how to use LLVM intrinsics safely from within a Swift package. If that’s straightforward then all is well, but if it isn’t then that raises further questions.

Hence the question: do we believe that LLVM can be convinced, either by the generic code or by means of LLVM intrinsic calls from a swift package? So far both you and @stephentyrone have said “yes”. You two are both more expert than I am, and so I believe you both, but I do think it is telling that so far no-one has actually convinced the Swift compiler to emit the pattern we want.

from swift-numerics.

davezarzycki avatar davezarzycki commented on June 10, 2024

Okay, more old-school, but it works this time. The compiler fails to elide a CMP instruction and a MOV instruction, but otherwise it's "perfect":

https://swift.godbolt.org/z/JHDVUq

from swift-numerics.

Lukasa avatar Lukasa commented on June 10, 2024

Yup, that looks a lot better. It’s a shame about the cmp but it seems like something that could be optimised away at least in principle.

from swift-numerics.

stephentyrone avatar stephentyrone commented on June 10, 2024

The compiler fails to elide a CMP instruction and a MOV instruction

This would be a good bug to file if you can spare a minute to write it up (if not, let me know and I'll do it). I've probably already reported an equivalent issue, but I'm not finding it right now.

from swift-numerics.

davezarzycki avatar davezarzycki commented on June 10, 2024

Looks like various carry chain related bugs exist:

https://bugs.llvm.org/buglist.cgi?quicksearch=addcarry

from swift-numerics.

stephentyrone avatar stephentyrone commented on June 10, 2024

A quick reading suggests that none of those is a great fit for this particular issue, which looks (to me) considerably easier to resolve.

from swift-numerics.

stephentyrone avatar stephentyrone commented on June 10, 2024

(Hopefully not going too much further into the weeds) I think they're broadly the same class of bug, I just think that this particular case may be easier to address without completely solving the general problem. I'll poke around at some nebulous future point and see if that actually turns out to be true.

from swift-numerics.

drodriguez avatar drodriguez commented on June 10, 2024

I don’t know if the code will be useful for this PR, but I tried to implement Int128/UInt128 and never had the time to put it out there. https://github.com/drodriguez/swift/tree/integers-128 has the code I had around 3 months ago, rebased onto a current apple/swift master. I haven't checked that it compiles or passes the tests, but it used to.

My interest in this was to implement Float128, which is needed for interacting with long double in some platforms (like Linux AArch64). The code for https://github.com/drodriguez/swift/tree/float-128 is even older (around 1 year?). I think I never got it to work. I don’t think I have seen any PR with interest for f128, but if anyone needs inspiration…

I don’t know if I would be able to work a lot on those, so if anyone wants to take them as a starting point, be my guest (if there's a lot of interest, I will try to find some time myself, but I cannot promise anything).

from swift-numerics.

dabrahams avatar dabrahams commented on June 10, 2024

@Lukasa DoubleWidth is still floating around, and I would still like it to form a basis for an (at least initial) implementation of this issue. @stephentyrone any reason why not?

from swift-numerics.

Lukasa avatar Lukasa commented on June 10, 2024

I think @davezarzycki convinced me it’s a decent basis for an initial implementation. I can always moan about it if I’m not happy with the codegen. 😉

from swift-numerics.

Lukasa avatar Lukasa commented on June 10, 2024

Also it’s hard to be too picky about work that others volunteer to do when you yourself are not volunteering to do it.

from swift-numerics.

davezarzycki avatar davezarzycki commented on June 10, 2024

Personally, I'd just update the integer GYB of the stdlib to export popular "big" sizes up to 4096, including popular half-step sizes too. In total, that's 11 new types: 128, 256, 512, 1024, 2048, and 4096 for the power-of-two sizes; and 192, 384, 768, 1536, and 3072 for the half-step sizes.

That'd cover the vast majority of the "big number" problem space, and with far, far less optimizer headaches. This would, of course, require some compiler-rt work, but most of that is semi-boilerplate.

from swift-numerics.

stephentyrone avatar stephentyrone commented on June 10, 2024

There's an inherent tradeoff; the "use LLVM large integers" approach benefits from all the existing integer optimizations in LLVM, but perversely locks away some optimizations that become available when the array-of-words view is exposed to the compiler. The compiler could be taught to perform these, of course, but it's a non-trivial pile of work, and you get them for free if you vend the array-of-words representation. Long-term, it's probably the right approach, however.

You can probably stop at 512, honestly. Beyond that point the returns for taking advantage of fixed-sized-ness diminish rapidly, and as more and more crypto has shifted out of RSA and into multi-dimensional structures defined over smaller base fields, the large integer sizes are becoming less and less interesting anyway (not that crypto should be implemented in a general-purpose bignum library, but it is still useful for experimentation).

This would, of course, require some compiler-rt work, but most of that is semi-boilerplate.

For stuff going into the standard library, this requires either availability restrictions or bundling the new compiler-rt entrypoints into a support library for back-deployment, which is somewhat unfortunate, but not a dealbreaker.

The main appeal of not doing this (or at least, not only doing this) is that implementing bignum in Swift gives a bunch of useful stress cases for the optimizer that are representative of a domain of programming that we would like to be able to handle in the compiler.

from swift-numerics.

dankogai avatar dankogai commented on June 10, 2024

FYI I've written a package called swift-int2x a couple of years ago.

https://github.com/dankogai/swift-int2x

Thanks to SE-0104, making your own integer types is easier than ever. This module makes use of it -- creating double-width integer from any given FixedWidthInteger.

import Int2X

typealias U128 = UInt2X<UInt64> // Yes.  That's it!
typealias I128 = Int2X<UInt64>  // ditto for signed integers

from swift-numerics.

dabrahams avatar dabrahams commented on June 10, 2024

@dankogai That's the DoubleWidth approach that was mentioned earlier. @stephentyrone it seems to me that this pattern is still useful in that it offers a path to easily implemented and tested functionality that may be needed in contexts where the width of inputs is generic and it's important not to overflow. Even when the binary tree algorithms become inefficient they can at least be used to cross-check other implementations in testing. Linearizing the algorithms can always be done as an optimization step, and if you're writing generic algorithms over arbitrary-width integers you'll still need something like DoubleWidth to handle the results of multiplications.
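To make the multiplication point concrete, here is a small sketch of a generic algorithm whose intermediate product needs twice the input width. It uses the standard library's `multipliedFullWidth(by:)` and `dividingFullWidth(_:)`, which pass the wide value around as a `(high, low)` pair; a DoubleWidth-style type would fill the same role as a first-class integer. The function name `mulDiv` is hypothetical.

```swift
/// Computes (x * y) / d without overflowing the intermediate product.
/// Traps if the quotient does not fit in T.
func mulDiv<T: FixedWidthInteger & UnsignedInteger>(_ x: T, _ y: T, dividedBy d: T) -> T {
    // The full product is twice as wide as T, returned as (high, low).
    let wide = x.multipliedFullWidth(by: y)
    // The quotient fits in T exactly when the high half is below the divisor.
    precondition(wide.high < d, "quotient does not fit in the result type")
    return d.dividingFullWidth(wide).quotient
}
```

For example, `mulDiv(UInt8(200), 100, dividedBy: 250)` works even though 200 * 100 does not fit in a UInt8, because the product is held at double width until the division brings it back down.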

from swift-numerics.

stephentyrone avatar stephentyrone commented on June 10, 2024

For an arbitrary-width integer type, DoubleWidth would be Self. But sure, I'm not saying binary-tree based approaches are useless, rather that they're generally suboptimal.

from swift-numerics.

dabrahams avatar dabrahams commented on June 10, 2024

@stephentyrone if you want to go all technical precision on me… when you hear me say “generic algorithms over arbitrary-width integers” please hear that I mean generic algorithms over integers that are arbitrarily wide, not algorithms (why would they need to be generic?) over integers that have dynamic width.

To be 100% clear, I'm saying that something like DoubleWidth fills a niche that is covered neither by integers of dynamic width nor by an expanded library of fixed integer sizes.

from swift-numerics.

stephentyrone avatar stephentyrone commented on June 10, 2024

@stephentyrone if you want to go all technical precision on me… when you hear me say “generic algorithms over arbitrary-width integers” please hear that I mean generic algorithms over integers that are arbitrarily wide, not algorithms (why would they need to be generic?) over integers that have dynamic width.

Apologies, I misinterpreted you; "arbitrary-width" is something of a term of art here, and usually refers to dynamic-sized integer arithmetic.

from swift-numerics.

dabrahams avatar dabrahams commented on June 10, 2024

Not a problem, Steve. I'm wondering, though, whether you still think DoubleWidth is a suboptimal avenue for expanding the integer widths available in the Swift standard library, and if so, why? As I've said, it seems like a useful and easily-implementable way to start that can always be optimized later and could eventually share a linearized implementation with a FixedSizeArray<Word>-based design that supports non-power-of-2 widths.

from swift-numerics.

davezarzycki avatar davezarzycki commented on June 10, 2024

Hi @dabrahams – I don't think there is anything wrong with DoubleWidth as a generic type. But generally speaking, I think this is one of those cases where it's really easy for the users of generics to race far ahead of what the back-end optimizations are able to pattern match. Said differently, you'll know the back-end is ready when we can replace UInt64 with DoubleWidth<DoubleWidth<DoubleWidth<UInt8>>>

from swift-numerics.

dabrahams avatar dabrahams commented on June 10, 2024

I have no illusions about the back-end's ability to optimize (it still leaves a lot to be desired), and in the case of DoubleWidth<DoubleWidth<DoubleWidth<UInt8>>>, either:

  • The implementation of operations is written in terms of the tree representation, and it could never hope to optimize perfectly because that's not the best algorithm for 8x width operations and I don't expect the optimizer to ever reconstruct a UInt64 division from the algorithm for division expressed in terms of that tree, or
  • The implementation of operations is linearized over Words, in which case there's every chance it might optimize perfectly today.

But whether the backend is ready for these types is totally beside the point. There are plenty of applications that just need the functionality of Int128 or Int512, no matter how poorly they optimize. And there are others that would benefit greatly in their development from being able to use a working DoubleWidth, and then dropping down to lower-level operations when they discover the need for better performance. Swift in general suffers from a lack of useful prepackaged abstractions, and withholding them from the standard library until they can be implemented optimally is counterproductive.

Imagine if we had done that with Codable! The results in this pitch show that performance could be far better than it is for many applications. Yet, people use Codable every day to get important work done.

from swift-numerics.

stephentyrone avatar stephentyrone commented on June 10, 2024

Update: @benrimmington has added DoubleWidth to _TestSupport. I think that we'll want to make some changes before exposing them in a public module, but they are now present in the package for people to experiment with or use in testing.

from swift-numerics.

mgriebling avatar mgriebling commented on June 10, 2024

Not sure if I should start a new thread but I have created a proposed UInt128 module.
Where should I upload the proposal? @stephentyrone

Here's the basic interface for discussion:

UInt128.doccarchive.zip

from swift-numerics.

stephentyrone avatar stephentyrone commented on June 10, 2024

[U]Int128 probably fits better in the standard library itself (there's already a partial implementation there to support Duration).

from swift-numerics.
