Comments (10)

headius commented on July 19, 2024

Reference jnr/jffi#34.

from jnr-ffi.

Spasi commented on July 19, 2024

Hey @headius,

I'm afraid my testing has shown that Critical Natives do not improve performance in primitive-only functions. They are really only a solution for efficiently passing array parameters.

I too was incredibly excited at first when I heard about it. In LWJGL we have hundreds of JNI functions and they are all primitive-only. Direct NIO buffers are used as pointers to data, but we only pass and return their addresses to functions, never instances (we get the addresses via Unsafe and do not call JNI's GetDirectBufferAddress). Based on the post on Stack Overflow I was expecting much lower overhead calling such functions, but unfortunately that is not the case:

  • Using Critical Natives for functions that accept arrays is indeed much faster, meaning that a function with Java arrays is as fast as the same function with NIO buffers (passed as addresses). This has the benefit that, overall, compute on array + critical native is slightly faster than compute on buffer + standard native (on Java 8 at least, buffers may catch up in 9). Obviously, arrays are also more convenient to use.
  • Using Critical Natives for primitive-only functions is not any faster than standard JNI. The biggest difference I could measure with JMH was sub-nanosecond (maybe 1-2 CPU cycles). Afaict, critical natives skip work that is already skipped in standard JNI when no jobject/synchronized is involved.

(please confirm this, I would love to be proven wrong)
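
For readers unfamiliar with the mechanism, here is a minimal sketch of how the two native entry points are named. It assumes HotSpot's convention, as far as I understand it, of looking up a second symbol with the `JavaCritical_` prefix next to the usual `Java_` one, using the standard JNI name escaping. `org.example.Native` and `sum` are hypothetical names, not LWJGL or jnr-ffi identifiers.

```java
// Sketch: derive the short-form native symbol names for a Java native method.
final class JniSymbols {
    /** Applies the JNI escape rules to a class or method name. */
    static String mangle(String name) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (c == '.' || c == '/') {
                sb.append('_');            // package separator
            } else if (c == '_') {
                sb.append("_1");           // literal underscore
            } else if (c == ';') {
                sb.append("_2");           // only appears in signatures
            } else if (c == '[') {
                sb.append("_3");           // only appears in signatures
            } else if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9')) {
                sb.append(c);
            } else {
                // Any other character becomes _0xxxx (lowercase hex).
                sb.append(String.format("_0%04x", (int) c));
            }
        }
        return sb.toString();
    }

    /** Symbol resolved for a regular JNI call. */
    static String standard(String className, String method) {
        return "Java_" + mangle(className) + "_" + mangle(method);
    }

    /** Symbol assumed to be resolved for the critical variant. */
    static String critical(String className, String method) {
        return "JavaCritical_" + mangle(className) + "_" + mangle(method);
    }

    public static void main(String[] args) {
        // Hypothetical class and method names.
        System.out.println(standard("org.example.Native", "sum"));
        System.out.println(critical("org.example.Native", "sum"));
    }
}
```

On the C side, to my knowledge, the critical entry point drops the `JNIEnv*` and `jclass` parameters and receives each primitive array as a length plus a raw pointer, which would explain why only array-taking functions benefit. The regular `Java_` function still has to exist as a fallback.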

FWIW, I think there's some room for improvement. One experiment I did was to create a custom JDK 9 build that had a hacked version of Critical Natives. Basically, I (naively/dangerously) removed everything that didn't seem absolutely necessary for calling a primitive-only function. For example, it wasn't changing the thread state from Java to native and back. The build worked and I could measure a significant reduction in overhead, almost 40% (from ~9ns to ~5ns for a no-arg function).

Anyway, Critical Natives is a nice trick for arrays. It would be great to magically get better performance for primitive-only calls (and native-to-Java upcalls, they're horrible) in Java 8u/9, but it would be hard to justify the engineering cost with Project Panama on the way.

headius commented on July 19, 2024

@Spasi Wow, ok...lots here. I'll address what I can at 2AM :-)

My interesting cases for using JavaCritical are probably different from LWJGL's: I want trivial functions like getpid to be closer to their raw C cost; I want to bounce back and forth across that boundary manipulating native structs/pointers with minimal cost; I want to efficiently implement library wrappers that are entirely non-blocking but which depend on rich native structures. Most of the operations I expect to see benefit from this are nearly trivial...JNI overhead is by far the lion's share.

Arrays will be a great unexpected bonus. I did not realize that object pinning was a reality in current HotSpot at all, and the ability to actually directly access arrays of primitives will serve us extremely well.

Another point of difference is that on your C side, you're calling normal functions in a normal C way. My interest is JNR...using the same endpoint to call an arbitrary number of C functions. Anything I can do to allow users to reduce overhead on what is essentially reflective calls will have an impact.

I also have no idea how much chatter LWJGL has across that JNI boundary, but JRuby (and JRuby+Truffle) is moving rapidly toward having many key, core operations implemented entirely atop native functions: IO, filesystem access, potentially crypto and more.

I'm definitely aware of what Panama could provide us, and my other I-have-no-time-for-it pet project is to do a Panama backend for jnr-ffi. But Panama may be difficult or impossible to access in Java 9, and there's a whole EG+JSR process needed to even consider it as a public API in 10. We need better options now.

FWIW, I'd really love to find some ways to share efforts between LWJGL and JNR. Any incarnation of Panama will require thoughtful consideration of API structure, and us collaborating more would be a great way to figure out what that API should look like for both a real-world project and a low-level tool other projects are built upon.

I hope I will have time to hack some critical calls into jffi+jnr-ffi in the near term, but time is a hard stallion to break. I will say that I'm very excited about the possibilities.

headius commented on July 19, 2024

Oh, I forgot an interesting use case we still dream about: implementing the Ruby C extension API without so much overhead from the JNI interface. Those would be more "normal" JNI calls, but then we could at least have a fighting chance of running those extensions at a similar speed to the fast-and-loose C Ruby.

Spasi commented on July 19, 2024

> My interesting cases for using JavaCritical are probably different from LWJGL's: I want trivial functions like getpid to be closer to their raw C cost; I want to bounce back and forth across that boundary manipulating native structs/pointers with minimal cost; I want to efficiently implement library wrappers that are entirely non-blocking but which depend on rich native structures. Most of the operations I expect to see benefit from this are nearly trivial...JNI overhead is by far the lion's share.

What we have seen is that Critical Natives do not lower the overhead of simple functions like getpid. In fact, we tested functions that do absolutely nothing and there was no real difference between critical and standard JNI.

The above perfectly describes what LWJGL does and JNI overhead is a pain for us too. Not in all bindings, but some APIs require frequent, low-complexity calls and any overhead hurts. For example, Vulkan is a much more verbose API than OpenGL.

> Another point of difference is that on your C side, you're calling normal functions in a normal C way. My interest is JNR...using the same endpoint to call an arbitrary number of C functions. Anything I can do to allow users to reduce overhead on what is essentially reflective calls will have an impact.

There are two cases in LWJGL:

  1. Libraries that are bundled with LWJGL as static binaries (e.g. lmdb) are called using normal JNI code.
  2. Libraries loaded dynamically are called using deduplicated JNI methods. Otherwise our native binaries would be massive.

The major difference is that JNR does 2 dynamically and in LWJGL it's generated statically, based on a fixed set of supported APIs.
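
To illustrate what deduplication by signature can look like, here is a small sketch. It is not LWJGL's actual generator or API: the grouping helper and the descriptor strings are assumptions made for illustration, and the sample functions are a handful of OpenGL entry points whose parameters all map to `int` at the JNI level. The idea is that every dynamically loaded function with the same JNI-level signature can share one native stub, which would take the function address as an extra argument.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch: group native functions by JNI-level signature so that each
// distinct signature needs only one shared stub (hypothetical helper).
final class StubDedupSketch {
    /** Each entry is {functionName, jvmStyleDescriptor}. */
    static String[][] sample() {
        return new String[][] {
            {"glClear", "(I)V"},
            {"glEnable", "(I)V"},
            {"glDisable", "(I)V"},
            {"glGetError", "()I"},
            {"glFinish", "()V"},
            {"glFlush", "()V"},
        };
    }

    /** Maps each distinct signature to the functions that would share its stub. */
    static Map<String, List<String>> groupBySignature(String[][] functions) {
        Map<String, List<String>> stubs = new LinkedHashMap<>();
        for (String[] f : functions) {
            stubs.computeIfAbsent(f[1], k -> new ArrayList<>()).add(f[0]);
        }
        return stubs;
    }

    public static void main(String[] args) {
        Map<String, List<String>> stubs = groupBySignature(sample());
        // Six functions collapse into three shared stubs.
        for (Map.Entry<String, List<String>> e : stubs.entrySet()) {
            System.out.println(e.getKey() + " <- " + e.getValue());
        }
    }
}
```

Whether the grouping happens at build time (as described for LWJGL) or at run time (as described for JNR) does not change the principle, only when the stub set is fixed.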

> I also have no idea how much chatter LWJGL has across that JNI boundary, but JRuby (and JRuby+Truffle) is moving rapidly toward having many key, core operations implemented entirely atop native functions: IO, filesystem access, potentially crypto and more.

This is the list of bindings we currently support and this is the plan for future bindings. We avoid C++ APIs and C APIs that are heavy on callbacks (too much overhead, Cliff Click mentioned that they're always interpreted?).

> We need better options now.

Agreed.

> FWIW, I'd really love to find some ways to share efforts between LWJGL and JNR. Any incarnation of Panama will require thoughtful consideration of API structure, and us collaborating more would be a great way to figure out what that API should look like for both a real-world project and a low-level tool other projects are built upon.

That'd be great. The LWJGL design has been driven by what JVMs can do right now. Everything's going to change with Panama (implementation-wise) and Valhalla (API-wise, major type-safety wins with value types and some simplifications with generic specialization). But yes, I'd be glad to share our experience with various native APIs and how to best approach usability and safety issues.

DemiMarie commented on July 19, 2024

@headius Struct and pointer operations can be done using Unsafe, without entering native code at all. The methods of Unsafe are marked as native, but are actually intrinsics that compile to the same code you would get from a C compiler.
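
A minimal sketch of that approach, assuming a HotSpot-based JDK where `sun.misc.Unsafe` is still reachable through reflection (newer JDKs deprecate these memory-access methods and may print warnings). The two-field `point` layout and the offsets are hypothetical; a real binding would take them from the native struct definition.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Sketch: read and write a native struct from Java without a JNI call.
final class UnsafeStructDemo {
    // Hypothetical layout: struct point { int32_t x; int32_t y; };
    static final long X_OFFSET = 0;
    static final long Y_OFFSET = 4;
    static final long SIZEOF = 8;

    /** Obtains the Unsafe singleton via its private static field. */
    static Unsafe unsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    /** Stores x and y in off-heap memory, reads them back, returns x + y. */
    static int roundTrip(int x, int y) {
        Unsafe u = unsafe();
        long p = u.allocateMemory(SIZEOF);   // raw pointer, like malloc
        try {
            u.putInt(p + X_OFFSET, x);       // point->x = x
            u.putInt(p + Y_OFFSET, y);       // point->y = y
            return u.getInt(p + X_OFFSET) + u.getInt(p + Y_OFFSET);
        } finally {
            u.freeMemory(p);                 // like free
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(3, 4));
    }
}
```

This only covers memory access. Calling a native function such as getpid still needs a JNI (or similar) transition, so it complements rather than replaces the options discussed above.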

This is a case where the GPLv2 (with no linking exception) licensing of Java 9's compilation interface is a problem. If it could be changed by Oracle that would be awesome (they did that for Truffle), but that seems unlikely.

Spasi commented on July 19, 2024

I've been doing a lot of testing lately and have a few things to report.

First, we encountered two bugs related to critical natives and have reported them (with corresponding fixes):

  • JDK-8167408
  • JDK-8167409

Second, I took the opportunity to weigh some of the overhead in the JNI wrappers. The parts that, by removing them, make a measurable difference:

  • GC check before the call, ~0.7ns
  • DTrace method probes (e.g. this) before and after the call, ~1.5ns
  • Thread state transitions around the call and safepoint check after the call, ~1.85ns

Removing a few more things (ic check on entry and restoring of CPU control state after the call), brings the total overhead reduction to ~4.66ns. That means a function like getpid could go from ~8.1ns to ~3.5ns. All tests were performed on a Sandy Bridge 3.1GHz (so YMMV) with a fresh build of JDK 9.
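
The arithmetic behind those figures, as a quick sketch. The constants are the measurements quoted in this comment, stored as hundredths of a nanosecond to keep the sums exact; the class and method names are made up for illustration.

```java
// Sketch: re-derive the overhead budget from the figures quoted above.
final class JniOverheadBudget {
    // Hundredths of a nanosecond.
    static final int GC_CHECK = 70;                     // ~0.7 ns
    static final int DTRACE_PROBES = 150;               // ~1.5 ns
    static final int THREAD_STATE_AND_SAFEPOINT = 185;  // ~1.85 ns
    static final int TOTAL_REDUCTION = 466;             // ~4.66 ns
    static final int BASELINE_CALL = 810;               // ~8.1 ns (getpid-like)
    static final double CLOCK_GHZ = 3.1;                // Sandy Bridge test machine

    /** Sum of the three itemized parts. */
    static int itemized() {
        return GC_CHECK + DTRACE_PROBES + THREAD_STATE_AND_SAFEPOINT;
    }

    /** Share left for the ic check and the CPU control state restore. */
    static int remainder() {
        return TOTAL_REDUCTION - itemized();
    }

    /** Call cost once everything is removed. */
    static int afterRemoval() {
        return BASELINE_CALL - TOTAL_REDUCTION;
    }

    /** Converts hundredths of a nanosecond to CPU cycles at CLOCK_GHZ. */
    static double cycles(int hundredthsOfNs) {
        return hundredthsOfNs / 100.0 * CLOCK_GHZ;
    }

    public static void main(String[] args) {
        System.out.println("itemized (ns):      " + itemized() / 100.0);
        System.out.println("remainder (ns):     " + remainder() / 100.0);
        System.out.println("after removal (ns): " + afterRemoval() / 100.0);
        System.out.println("saved cycles:       " + cycles(TOTAL_REDUCTION));
    }
}
```

The three itemized parts add up to 4.05 ns, which leaves about 0.61 ns for the ic check and the control state restore, and 8.1 ns minus 4.66 ns gives 3.44 ns, in line with the ~3.5 ns quoted. At 3.1 GHz the 4.66 ns saving is roughly 14 cycles.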

Some of the above are scary, others are just annoying (sigh... the DTrace probes). FWIW, I tested a build that removed the above only for JNI functions that were primitive-only and was able to complete the entire LWJGL test and demo suite without any issue.

 avatar commented on July 19, 2024

@Spasi It doesn't look like the two bugs you filed will be fixed any time soon. Will that be a concern when using this "feature"? The two bug reports indicate that it's only used on Solaris (for the JDK), so they were treated as a non-issue and deferred.

Spasi commented on July 19, 2024

We have implemented workarounds in LWJGL for both:

  • For JDK-8167408: LWJGL/lwjgl3@234f169 (exports functions without __stdcall decorations on Windows x86)
  • For JDK-8167409: LWJGL/lwjgl3@ee39d2a (disables Critical Natives on problematic function signatures only, on Linux & macOS only)

chrisvest commented on July 19, 2024

Just checked. JDK-8167408 and JDK-8167409 are now marked as resolved/fixed in Java 10.
