
Comments (12)

PidgeyBE avatar PidgeyBE commented on July 20, 2024

Could it be related to this?
https://github.com/mitsuba-renderer/mitsuba2/blob/dbcecba782a228fcda134558f9ae57fa91033967/resources/ptx/Makefile#L4

This seems to specify the compute capability of the GPU and 61 matches the P4 indeed... https://en.wikipedia.org/wiki/CUDA

Would it be possible to support more types?
http://arnon.dk/matching-sm-architectures-arch-and-gencode-for-various-nvidia-cards/

from enoki.

wjakob avatar wjakob commented on July 20, 2024

Hi,

you need a Maxwell-class GPU or newer to run Mitsuba's inverse renderer (e.g. GeForce 1080). The bottleneck here is actually not Enoki (it could likely compile and run with a much lower compute capability), but OptiX which requires Maxwell/Turing for the "RTGeometryTriangles" primitive that we depend on.

Best,
Wenzel


PidgeyBE avatar PidgeyBE commented on July 20, 2024

Hi Wenzel

Glad to receive a reaction from the legend himself!
I was reading this page: http://arnon.dk/matching-sm-architectures-arch-and-gencode-for-various-nvidia-cards/

"When you compile CUDA code, you should always compile only one '-arch' flag that matches your most used GPU cards. This will enable faster runtime, because code generation will occur during compilation.
If you only mention '-gencode', but omit the '-arch' flag, the GPU code generation will occur on the JIT compiler by the CUDA driver."

and was wondering if this script: https://github.com/mitsuba-renderer/mitsuba2/blob/dbcecba782a228fcda134558f9ae57fa91033967/resources/ptx/Makefile#L4 shouldn't contain the -arch flag to make it more performant?


wjakob avatar wjakob commented on July 20, 2024

We compile to PTX (compute_..) instead of specific device code (sm_..) because this enables the resulting shared library to be moved between systems that potentially have different GPUs. The downside is minimal: a few hundred milliseconds for JIT compilation the first time that Enoki is used. (The resulting native code is cached in ~/.nv in your home directory.)
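
As a rough illustration of the distinction above, the sketch below prints the two kinds of nvcc invocation being contrasted. The file name kernel.cu and both function names are placeholders, not taken from the Mitsuba or Enoki build files; the functions only print the commands, so they can be inspected without a CUDA toolkit installed.

```shell
# Sketch only: kernel.cu, ptx_cmd and cubin_cmd are hypothetical names.

# Print an nvcc command that emits PTX for a virtual architecture (compute_XX).
# The CUDA driver JIT-compiles this PTX for the GPU present at load time.
ptx_cmd() {
    cc="$1"   # compute capability without the dot, e.g. 50 or 61
    printf 'nvcc -ptx -arch=compute_%s kernel.cu -o kernel.ptx\n' "$cc"
}

# Print an nvcc command that emits native code for a real architecture (sm_XX).
# No JIT step is needed, but the result is not portable across GPU generations.
cubin_cmd() {
    cc="$1"
    printf 'nvcc -cubin -arch=sm_%s kernel.cu -o kernel.cubin\n' "$cc"
}

ptx_cmd 50
cubin_cmd 61
```

The first form trades a short one-time JIT delay for portability; the second avoids the JIT step at the cost of tying the output to one architecture.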


PidgeyBE avatar PidgeyBE commented on July 20, 2024

Aha ok, thanks for the explanation! You say a Maxwell-class GPU (or newer) is needed for OptiX, but all the GPUs I tested (except the K80) are Maxwell or better...

I have the feeling that the code only runs on a GPU with compute capability 61, as the P4 or 1080...


wjakob avatar wjakob commented on July 20, 2024

Aha, that could well be -- maybe we're setting the flag too strictly. Can you try just manually setting it to something smaller? What is the C.C. of the other Maxwell devices?


PidgeyBE avatar PidgeyBE commented on July 20, 2024

I've updated the table in my initial post with the compute capabilities.
I've changed 61 to 50 in https://github.com/mitsuba-renderer/mitsuba2/blob/dbcecba782a228fcda134558f9ae57fa91033967/resources/ptx/Makefile and ran:

make clean
make all

Now I'm rebuilding mitsuba2 on my laptop (GeForce 940M, Maxwell, with CC=5.0) 🤞
I know this hardware is not ideal, but I just want to do some basic tests at this point...


PidgeyBE avatar PidgeyBE commented on July 20, 2024

Still exactly the same error...

PTX linker error:
ptxas fatal : SM version specified by .target is higher than default SM version assumed
cuda_check(): driver API error = 0400 "CUDA_ERROR_INVALID_HANDLE" in ../ext/enoki/src/cuda/jit.cu:253.

I'm wondering where the "SM version" is still defined....
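
One way to narrow this down is to look at the .target directive inside the PTX that is actually being loaded and compare it with the device's compute capability. The sketch below assumes the PTX is available as a text file; the function names and the throwaway demo file are made up for illustration and are not part of Enoki or Mitsuba.

```shell
# Sketch only: ptx_target and ptx_fits_device are hypothetical helpers.

# Print the number from the first ".target sm_XX" directive of a PTX file.
ptx_target() {
    sed -n 's/^\.target[[:space:]]*sm_\([0-9][0-9]*\).*/\1/p' "$1" | head -n 1
}

# Succeed if PTX targeting $1 is not newer than a device of capability $2
# (both written without the dot, e.g. 61 and 50).
ptx_fits_device() {
    [ "$1" -le "$2" ]
}

# Demo on a throwaway file that mimics a PTX header built for sm_61.
tmp=$(mktemp)
printf '.version 6.3\n.target sm_61\n.address_size 64\n' > "$tmp"
target=$(ptx_target "$tmp")
rm -f "$tmp"

if ptx_fits_device "$target" 50; then
    echo "sm_$target PTX can be loaded on a 5.0 device"
else
    echo "sm_$target PTX is too new for a 5.0 device"
fi
```

If a file still reports sm_61 after the Makefile change, the value is probably being set somewhere else in the build.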


merlinND avatar merlinND commented on July 20, 2024

In CMake, there's a variable called ENOKI_CUDA_COMPUTE_CAPABILITY which you could try and set to match your change in the Makefile you pointed out:

set(ENOKI_CUDA_COMPUTE_CAPABILITY "61" CACHE STRING "Compute capability as specified by https://developer.nvidia.com/cuda-gpus")
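
Since the variable is a CMake cache entry, it can presumably be overridden at configure time with -D instead of editing the CMakeLists file. The helper below is a hypothetical convenience, not part of the project: it turns a dotted compute capability such as 5.0 into the undotted form the variable uses.

```shell
# Sketch only: cc_to_cmake_flag is a hypothetical helper.

# Turn a compute capability like "5.0" (or "50") into a -D argument for cmake.
cc_to_cmake_flag() {
    cc=$(printf '%s' "$1" | tr -d '.')
    printf '%s%s\n' '-DENOKI_CUDA_COMPUTE_CAPABILITY=' "$cc"
}

cc_to_cmake_flag 5.0

# Assumed usage from an out-of-source build directory:
#   cmake "$(cc_to_cmake_flag 5.0)" ..
```

A value already stored in an existing build directory's cache stays in effect until it is overridden this way or the cache is cleared.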


PidgeyBE avatar PidgeyBE commented on July 20, 2024

Hey Merlin,
Thanks! That did the job (together with the other fix)!
I can now run the cbox example on my laptop's GeForce 940M! :)

Should I create a PR for this, or do you guys prefer keeping it as is?


wjakob avatar wjakob commented on July 20, 2024

Yes, that would be great -- please make a PR that downgrades the compute capabilities to the minimal version known to work.


PidgeyBE avatar PidgeyBE commented on July 20, 2024

mitsuba-renderer/mitsuba2#53
#71

