Comments (3)
@DerEchteFeuerpfeil hey Moritz! so i had the same confusion as you starting out in the field with how the mask is represented
so i decided i would not perpetuate this confusion. for all my repos, without exception, masking is always `True` for attend and `False` for not. this makes sense to me because if one were to cast it to a float, one can also multiply it with your input for effectively masking in non-attention scenarios
in your example, it should work if you just invert your mask with a `~`, but let me know if it does not
from x-transformers.
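A minimal sketch of the convention described in the comment above, using NumPy boolean arrays as a stand-in for tensors. The names `pad_mask`, `attend_mask`, and `x` are hypothetical and do not come from the thread. The sketch assumes the starting mask uses the opposite convention (`True` = masked out), which is the case where inverting with `~` applies.

```python
import numpy as np

# Hypothetical padding mask in the "True = masked out" convention,
# i.e. the opposite of the convention described in the comment.
pad_mask = np.array([[False, False, True, True]])

# Invert with ~ so that True means "attend" and False means "do not attend".
attend_mask = ~pad_mask

# Cast the boolean mask to float and multiply it with the input to zero out
# masked positions in a non-attention setting (here, before pooling).
x = np.array([[1.0, 2.0, 3.0, 4.0]])
masked_x = x * attend_mask.astype(x.dtype)

# Masked mean over the kept positions only.
masked_mean = masked_x.sum(axis=-1) / attend_mask.sum(axis=-1)
```

The multiply-by-mask step is only possible because `True` marks the positions to keep. With the opposite convention, the mask would have to be inverted before every such use.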
Hey @lucidrains , thanks for the quick reply!
Looks like I was so close to getting it right 😉 This implementation also makes the most sense to me.
Works like a charm now 👍
@DerEchteFeuerpfeil nice! go train something amazing :)