GithubHelp home page GithubHelp logo

Comments (10)

jrouwe avatar jrouwe commented on June 24, 2024

Thanks for the report, I'll investigate.

from joltphysics.

jrouwe avatar jrouwe commented on June 24, 2024

Hello,

The errors in the referenced ticket don't seem to be related to enabling sse:

/opt/homebrew/Cellar/mingw-w64/12.0.0/toolchain-i686/lib/gcc/i686-w64-mingw32/14.1.0/include/xmmintrin.h: In member function 'float JPH::Vec3::GetX() const':
/opt/homebrew/Cellar/mingw-w64/12.0.0/toolchain-i686/lib/gcc/i686-w64-mingw32/14.1.0/include/xmmintrin.h:973:1: error: inlining failed in call to 'always_inline' 'float _mm_cvtss_f32(__m128)': target specific option mismatch
  973 | _mm_cvtss_f32 (__m128 __A)
      | ^~~~~~~~~~~~~

It's a warning about inlining which I already fixed in 4a08d8d (which is not in an official release yet).

The -mfpmath=sse flag seems to be sufficient for GCC to start creating SSE instructions. I've created a sample in godbolt. You can see that the generated assembly uses SSE instructions. If you remove the -mfpmath=sse flag, or replace it with -msse, it starts creating old style x87 instructions.

If this is the 64-bit section you're referring to:

elseif ("${CMAKE_SYSTEM_PROCESSOR}" STREQUAL "x86_64" OR "${CMAKE_SYSTEM_PROCESSOR}" STREQUAL "AMD64")

then what it is checking is CMAKE_SYSTEM_PROCESSOR, which in my experience for x86 is the processor that is compiling the code, not what architecture it is targetting. I'm hoping you're not compiling on a 32-bit OS, but if you are, or if I'm wrong about the value, we can add an extra check for that.

from joltphysics.

SirLynix avatar SirLynix commented on June 24, 2024

Thanks for all the details!

I will use your inline fix to try and see if it fixes the build on MinGW on macOS.

If this is the 64-bit section you're referring to:

elseif ("${CMAKE_SYSTEM_PROCESSOR}" STREQUAL "x86_64" OR "${CMAKE_SYSTEM_PROCESSOR}" STREQUAL "AMD64")

then what it is checking is CMAKE_SYSTEM_PROCESSOR, which in my experience for x86 is the processor that is compiling the code, not what architecture it is targetting. I'm hoping you're not compiling on a 32-bit OS, but if you are, or if I'm wrong about the value, we can add an extra check for that.

However I find this confusing, why is the processor compiling the code any relevant? Compilers running on 32bits system can produce 64bits binaries (and compilers running on 64bits system can produce 32bits binaries as well). Only the target architecture should matter.

I find the CMake doc a bit confusing on that as well

from joltphysics.

jrouwe avatar jrouwe commented on June 24, 2024

Yes, some of the cmake variables are unintuitive. How are you enabling the 32-bit build from cmake? If I simply add -m32 to the CMAKE_CXX_FLAGS then CMAKE_SYSTEM_PROCESSOR remains x86_64, which to me is logical since otherwise it would somehow need to parse the compiler options and change the value of CMAKE_SYSTEM_PROCESSOR on the fly based on this option.

from joltphysics.

SirLynix avatar SirLynix commented on June 24, 2024

yes from what I see xmake passes -m32 and set the compiler path.

/Users/runner/.xmake/packages/c/cmake/3.29.2/c83e9763d2e54cf3bc634700a848c33b/bin/cmake -DENABLE_ALL_WARNINGS=OFF -DINTERPROCEDURAL_OPTIMIZATION=OFF -DTARGET_UNIT_TESTS=OFF -DTARGET_HELLO_WORLD=OFF -DTARGET_PERFORMANCE_TEST=OFF -DTARGET_SAMPLES=OFF -DTARGET_VIEWER=OFF -DUSE_STATIC_MSVC_RUNTIME_LIBRARY=OFF -DOVERRIDE_CXX_FLAGS=OFF -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCROSS_PLATFORM_DETERMINISTIC=OFF -DDOUBLE_PRECISION=OFF -DGENERATE_DEBUG_SYMBOLS=OFF -DOBJECT_LAYER_BITS=16 -DUSE_AVX=OFF -DUSE_AVX2=OFF -DUSE_AVX512=OFF -DUSE_F16C=OFF -DUSE_FMADD=OFF -DUSE_LZCNT=OFF -DUSE_SSE4_1=OFF -DUSE_SSE4_2=OFF -DUSE_TZCNT=OFF -DCMAKE_INSTALL_PREFIX=/Users/runner/.xmake/packages/j/joltphysics/v5.0.0/d18c2625a84a4ebe80128018d7fb713c -DCMAKE_INSTALL_LIBDIR:PATH=lib -G "Unix Makefiles" -DCMAKE_C_COMPILER=/opt/homebrew/opt/mingw-w64/bin/i686-w64-mingw32-gcc -DCMAKE_OSX_SYSROOT= -DCMAKE_ASM_FLAGS=-m32 -DCMAKE_RC_COMPILER=/opt/homebrew/opt/mingw-w64/bin/i686-w64-mingw32-windres -DCMAKE_STATIC_LINKER_FLAGS= -DCMAKE_FIND_ROOT_PATH_MODE_PACKAGE=BOTH -DCMAKE_SHARED_LINKER_FLAGS=-m32 -DCMAKE_SYSTEM_NAME=Windows -DCMAKE_FIND_ROOT_PATH_MODE_INCLUDE=BOTH -DCMAKE_ASM_COMPILER=/opt/homebrew/opt/mingw-w64/bin/i686-w64-mingw32-gcc -DCMAKE_CXX_FLAGS=-m32 -DCMAKE_RANLIB=/opt/homebrew/opt/mingw-w64/bin/i686-w64-mingw32-ranlib -DHAVE_FLAG_SEARCH_PATHS_FIRST=0 -DCMAKE_FIND_ROOT_PATH_MODE_PROGRAM=NEVER -DCMAKE_FIND_ROOT_PATH=/opt/homebrew/opt/mingw-w64 -DCMAKE_FIND_ROOT_PATH_MODE_LIBRARY=BOTH -DCMAKE_CXX_COMPILER=/opt/homebrew/opt/mingw-w64/bin/i686-w64-mingw32-g++ -DCMAKE_C_FLAGS=-m32 -DCMAKE_AR=/opt/homebrew/opt/mingw-w64/bin/i686-w64-mingw32-ar -DCMAKE_EXE_LINKER_FLAGS=-m32 "-DCMAKE_CXX_FLAGS_RELEASE=-m32 -O3 -DNDEBUG" "-DCMAKE_C_FLAGS_RELEASE=-m32 -O3 -DNDEBUG" -DCMAKE_EXE_LINKER_FLAGS_RELEASE=-m32 -DCMAKE_SHARED_LINKER_FLAGS_RELEASE=-m32 -DCMAKE_STATIC_LINKER_FLAGS_RELEASE= /Users/runner/.xmake/cache/packages/2406/j/joltphysics/v5.0.0/source/Build

however, from the failed build log, there are also warnings about sse not being enabled

/Users/runner/.xmake/cache/packages/2406/j/joltphysics/v5.0.0/source/Build/../Jolt/Core/Profiler.inl:71:50: warning: SSE vector return without SSE enabled changes the ABI [-Wpsabi]
   71 |                 __m128i val = _mm_loadu_si128(src);
      |                                                  ^

notice also that no SSE option is passed to the compiler:

/opt/homebrew/opt/mingw-w64/bin/i686-w64-mingw32-g++ -DJPH_BUILD_SHARED_LIBRARY -DJPH_DEBUG_RENDERER -DJPH_OBJECT_LAYER_BITS=16 -DJPH_PROFILE_ENABLED -DJPH_SHARED_LIBRARY -DJolt_EXPORTS -DNDEBUG @CMakeFiles/Jolt.dir/includes_CXX.rsp -m32 -Wno-stringop-overflow -ffp-contract=off -m32 -O3 -DNDEBUG -std=c++17 -Winvalid-pch -include /Users/runner/.xmake/cache/packages/2406/j/joltphysics/v5.0.0/source/Build/build_d18c2625/CMakeFiles/Jolt.dir/cmake_pch.hxx -MD -MT CMakeFiles/Jolt.dir/Users/runner/.xmake/cache/packages/2406/j/joltphysics/v5.0.0/source/Jolt/Core/JobSystemThreadPool.cpp.obj -MF CMakeFiles/Jolt.dir/Users/runner/.xmake/cache/packages/2406/j/joltphysics/v5.0.0/source/Jolt/Core/JobSystemThreadPool.cpp.obj.d -o CMakeFiles/Jolt.dir/Users/runner/.xmake/cache/packages/2406/j/joltphysics/v5.0.0/source/Jolt/Core/JobSystemThreadPool.cpp.obj -c /Users/runner/.xmake/cache/packages/2406/j/joltphysics/v5.0.0/source/Jolt/Core/JobSystemThreadPool.cpp

Probably because in that case, CMAKE_SYSTEM_PROCESSOR is aarch64 (Apple M1), but the target arch is x86.

it looks like detecting the target arch is not easy with cmake: https://stackoverflow.com/questions/11944060/how-to-detect-target-architecture-using-cmake

Edit : I see your fix targets only debug compilation, but here it's compiling using release

from joltphysics.

jrouwe avatar jrouwe commented on June 24, 2024

Cross compiling is indeed difficult using cmake.

Can you place:

MESSAGE(${CMAKE_SYSTEM_PROCESSOR})

at the end of CMakeLists.txt to see what it is reporting? I'm hoping it's not aarch64, that would be really bad.

from joltphysics.

SirLynix avatar SirLynix commented on June 24, 2024

Sure.

I added

MESSAGE(CMAKE_SYSTEM_PROCESSOR="${CMAKE_SYSTEM_PROCESSOR}")

and the output is:

CMAKE_SYSTEM_PROCESSOR=""

😓

https://github.com/xmake-io/xmake-repo/actions/runs/9489680619/job/26151521672?pr=4330#step:5:478

from joltphysics.

jrouwe avatar jrouwe commented on June 24, 2024

Ok, that was unexpected! I'm not really sure how to proceed now.

I can probably blanket disable the always_inline warning on gcc, but I can't really add -msse if I don't know what processor we're compiling for.

Do we really need to cross compile from Apple M1 to a 32-bit x86 platform? (seems like a super fringe case)

from joltphysics.

SirLynix avatar SirLynix commented on June 24, 2024

Do we really need to cross compile from Apple M1 to a 32-bit x86 platform? (seems like a super fringe case)

The problem is not 32-bit but target architecture detection, as CMAKE_SYSTEM_PROCESSOR is empty on MinGW 64bits on macOS too, and on MinGW 64bits on Windows too (I'm not sure why the error doesn't trigger).

Notice that MinGW has no SSE option even on Windows when targeting 64bits

/D/a/_temp/msys64/mingw64/bin/x86_64-w64-mingw32-g++.exe -DJPH_BUILD_SHARED_LIBRARY -DJPH_DEBUG_RENDERER -DJPH_OBJECT_LAYER_BITS=16 -DJPH_PROFILE_ENABLED -DJPH_SHARED_LIBRARY -DJolt_EXPORTS -DNDEBUG @CMakeFiles/Jolt.dir/includes_CXX.rsp -m64 -Wno-stringop-overflow -ffp-contract=off -m64 -O3 -DNDEBUG -std=c++17 -Winvalid-pch -include C:/Users/runneradmin/AppData/Local/.xmake/cache/packages/2406/j/joltphysics/v5.0.0/source/Build/build_07cbf58e/CMakeFiles/Jolt.dir/cmake_pch.hxx -MD -MT CMakeFiles/Jolt.dir/4560404b44facb1d4119b9286f48a459/packages/2406/j/joltphysics/v5.0.0/source/Jolt/Physics/Body/MotionProperties.cpp.obj -MF CMakeFiles/Jolt.dir/4560404b44facb1d4119b9286f48a459/packages/2406/j/joltphysics/v5.0.0/source/Jolt/Physics/Body/MotionProperties.cpp.obj.d -o CMakeFiles/Jolt.dir/4560404b44facb1d4119b9286f48a459/packages/2406/j/joltphysics/v5.0.0/source/Jolt/Physics/Body/MotionProperties.cpp.obj -c /C/Users/runneradmin/AppData/Local/.xmake/cache/packages/2406/j/joltphysics/v5.0.0/source/Jolt/Physics/Body/MotionProperties.cpp

I suppose the best option would be to detect arch from compiler, or to expect the user to set CMAKE_SYSTEM_PROCESSOR from the command-line.

I can submit a pull request to xmake to set CMAKE_SYSTEM_PROCESSOR quite easily, but that won't fix 32bits compilation as sse options are only added in the x86_64 check

from joltphysics.

SirLynix avatar SirLynix commented on June 24, 2024

After some more research, it appears CMAKE_SYSTEM_PROCESSOR is empty because we set CMAKE_SYSTEM_NAME, which enables cross-compilation. It's our duty to set CMAKE_SYSTEM_PROCESSOR.
(edit: interesting read)

I will fix that on xmake side, and I think a fix is needed here:

JoltPhysics/Jolt/Jolt.cmake

Lines 619 to 655 in 4cd5205

elseif ("${CMAKE_SYSTEM_PROCESSOR}" STREQUAL "x86_64" OR "${CMAKE_SYSTEM_PROCESSOR}" STREQUAL "AMD64")
# x64
if (USE_AVX512)
target_compile_options(Jolt PUBLIC -mavx512f -mavx512vl -mavx512dq -mavx2 -mbmi -mpopcnt -mlzcnt -mf16c)
elseif (USE_AVX2)
target_compile_options(Jolt PUBLIC -mavx2 -mbmi -mpopcnt -mlzcnt -mf16c)
elseif (USE_AVX)
target_compile_options(Jolt PUBLIC -mavx -mpopcnt)
elseif (USE_SSE4_2)
target_compile_options(Jolt PUBLIC -msse4.2 -mpopcnt)
elseif (USE_SSE4_1)
target_compile_options(Jolt PUBLIC -msse4.1)
else()
target_compile_options(Jolt PUBLIC -msse2)
endif()
if (USE_LZCNT)
target_compile_options(Jolt PUBLIC -mlzcnt)
endif()
if (USE_TZCNT)
target_compile_options(Jolt PUBLIC -mbmi)
endif()
if (USE_F16C)
target_compile_options(Jolt PUBLIC -mf16c)
endif()
if (USE_FMADD AND NOT CROSS_PLATFORM_DETERMINISTIC)
target_compile_options(Jolt PUBLIC -mfma)
endif()
# On 32-bit builds we need to default to using SSE instructions, the x87 FPU instructions have higher intermediate precision
# which will cause problems in the collision detection code (the effect is similar to leaving FMA on, search for
# JPH_PRECISE_MATH_ON for the locations where this is a problem).
if (NOT MSVC)
target_compile_options(Jolt PUBLIC -mfpmath=sse)
endif()
EMIT_X86_INSTRUCTION_SET_DEFINITIONS()
endif()

maybe just check if CMAKE_SYSTEM_PROCESSOR equals x86 too (and update the 32bits comment), as I don't think any of this is specific for 64bits.

What do you think?

from joltphysics.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.