Comments (2)
@jenkspt hey Penn! thanks for the benchmarks
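i think Flash Attention takes the online softmax a step further: it is really an online softmax-weighted sum of the values. the running max and normalizer are carried along with a rescaled accumulator, so the full attention matrix never has to be materialized. it makes the most sense in the context of CUDA, where you can control HBM access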
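To make the "online softmax weighted sum" idea concrete, here is a minimal sketch for a single query, written in plain Python rather than JAX or CUDA so the bookkeeping stays visible. The function name, `chunk_size`, and the list-based inputs are assumptions made for illustration; this is not code from this repo or from the Flash Attention kernel.

```python
import math

def online_softmax_weighted_sum(scores, values, chunk_size=2):
    """Return sum_i softmax(scores)[i] * values[i], streaming over chunks.

    scores: list of floats (one attention logit per key, for one query)
    values: list of equal-length float lists (one value vector per key)
    Hypothetical helper: only a running max, a running normalizer and a
    running accumulator are kept, never the full softmax vector.
    """
    running_max = -math.inf          # largest logit seen so far
    denom = 0.0                      # sum of exp(score - running_max)
    acc = [0.0] * len(values[0])     # unnormalized weighted sum of values

    for start in range(0, len(scores), chunk_size):
        s_chunk = scores[start:start + chunk_size]
        v_chunk = values[start:start + chunk_size]

        new_max = max(running_max, max(s_chunk))
        # Rescale what was accumulated under the old max.
        # On the first chunk this is exp(-inf) == 0.0, which is harmless.
        scale = math.exp(running_max - new_max)
        denom *= scale
        acc = [a * scale for a in acc]

        for s, v in zip(s_chunk, v_chunk):
            w = math.exp(s - new_max)
            denom += w
            acc = [a + w * x for a, x in zip(acc, v)]

        running_max = new_max

    # Normalize once at the end.
    return [a / denom for a in acc]


if __name__ == "__main__":
    scores = [0.5, -1.0, 2.0, 0.0, 3.0]
    values = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [-1.0, 0.5], [0.5, -0.5]]
    print(online_softmax_weighted_sum(scores, values, chunk_size=2))
```

The result should match an ordinary two-pass softmax followed by a weighted sum, up to floating-point rounding. The difference is that each chunk of scores and values is touched once, which is the property that matters when the chunks have to be pulled from HBM.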
are you working on speeding up attention for work or is this a side project?
from flash-attention-jax.
I was just messing around with it as a side project -- and thought I'd share when I saw this repo. Great work btw!