Comments (3)

cheshire commented on August 16, 2024

> T4 GPU doesn't support BF16 matmul

It actually does, but it doesn't use TensorCores there and is incredibly slow.

> XLA switches BF16 matmul to F32 matmul on T4

This is a fairly recent change I made; you could try to find the commit for it. Without that change, matmuls are more than 4x slower from what I recall (depending on shape).

> If I understand correctly, "BF16" matmul should be the same performance as F32.

Why would it? T4 has neither vector nor TensorCore support for BF16, so it has to emulate it, slowly.

Or do you mean on T4? On T4, you can look at the GPU profile.

The problem here is that we use Triton for fusions, and Triton recently dropped support for pre-Ampere GPUs (or at least they aren't officially supported). Without fusions, we need to run an extra kernel to cast from BF16 to F32, which can be as expensive as the matmul itself.
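To make the "extra kernel" point concrete, here is a rough pure-Python sketch of the two strategies. It is only an illustration, not XLA's or Triton's actual code: the function names are made up for this example, BF16 values are modelled as 16-bit integer patterns (BF16 is the top 16 bits of an IEEE-754 F32), and the arithmetic runs in Python doubles rather than real F32.

```python
import struct

def f32_to_bf16_bits(x):
    """Round x to bfloat16 (round-to-nearest-even) and return the 16-bit pattern.
    NaN, infinity, and overflow are not handled in this sketch."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + bias) >> 16) & 0xFFFF

def bf16_bits_to_f32(b):
    """Upcast BF16 to F32. This is exact: the 16 bits become the top half of an F32."""
    return struct.unpack("<f", struct.pack("<I", b << 16))[0]

def unfused_matmul(a_bits, b_bits):
    # Pass 1 (the extra "kernel"): materialize upcast copies of both operands.
    a = [[bf16_bits_to_f32(v) for v in row] for row in a_bits]
    b = [[bf16_bits_to_f32(v) for v in row] for row in b_bits]
    # Pass 2: plain matmul over the upcast buffers.
    # Note: Python floats are doubles, so accumulation here is not true F32.
    return [[sum(a[i][p] * b[p][j] for p in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def fused_matmul(a_bits, b_bits):
    # Single pass: upcast each element as it is loaded, with no intermediate buffers.
    return [[sum(bf16_bits_to_f32(a_bits[i][p]) * bf16_bits_to_f32(b_bits[p][j])
                 for p in range(len(b_bits)))
             for j in range(len(b_bits[0]))] for i in range(len(a_bits))]

# Small example with values that are exactly representable in BF16.
a_bits = [[f32_to_bf16_bits(v) for v in row] for row in [[1.0, 2.5], [-3.0, 0.15625]]]
b_bits = [[f32_to_bf16_bits(v) for v in row] for row in [[0.5, 4.0], [2.0, -1.0]]]
assert unfused_matmul(a_bits, b_bits) == fused_matmul(a_bits, b_bits)
```

Both versions compute the same result. The difference is that the unfused one writes and re-reads a full upcast copy of each operand, and that extra memory traffic is the cost a fusion would avoid.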

from xla.

sagelywizard commented on August 16, 2024

> Why would it?

Sorry, misspoke a bit. I meant that I'd expect that the emulation on T4 would be in the ballpark of (or at least not slower than) F32. But it sounds like it could be slower than F32 because of the extra cast?

cheshire commented on August 16, 2024

Yes. Since we support CUTLASS fusions, I might look into supporting that fusion (folding the cast into the matmul) via CUTLASS.
