Comments (6)
Thanks for the quick fix!
I'll be honest, I implemented this part of my research a while back and I'm struggling to remember/find my notes on the reason why I'm using the ContinuousTransformerWrapper over another. I implemented this before 1.23.5.
I think it's primarily because I've embedded things outside and I needed to provide multiple things through into the transformer that's instantiated with custom_layers within the attn_layers?
from x-transformers.
Only a little for my needs (through wrappers and such). But being able to create modules and stacks and combine the various possible features into some frankenstein through compositionality would be brilliant and is something I have personally wished for! But I get that refactoring this repo into that would probably be a massive undertaking.
If you're ever open to going that far, I'd be happy to help contribute!
@amitkparekh no not too hard, i've been thinking about it for a while, and it may take less than 100 loc to execute for the 80% use case
will keep you updated!
Let me know if you need a second pair of hands! I feel like this repo is like allennlp but for transformer architectures, so anything that makes it easier to build custom but robust models would be amazing 🤩
@amitkparekh hey Amit! yes you are right and i put in a fix
that's really awesome you are using an undocumented feature! are you doing a weight tied transformer?
@amitkparekh no problem!
oh i see, you are using a custom ordering of the layers, not the weight tying feature
did you hack the code so that it accepts custom modules from outside the repo? i was about to get around to that
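For readers following along, the distinction between a custom layer ordering and weight tying can be sketched in plain Python. This is only an illustration, not the x-transformers implementation: `Layer`, `build_custom_order`, and `build_weight_tied` are hypothetical names, and the single-letter codes ('a' for self-attention, 'c' for cross-attention, 'f' for feedforward) are assumed to mirror the convention that custom_layers uses.

```python
from dataclasses import dataclass, field
from itertools import count

_ids = count()  # hands out a unique id per constructed layer


@dataclass
class Layer:
    """Hypothetical stand-in for a real module; holds only its type code."""
    kind: str  # assumed codes: 'a' self-attention, 'c' cross-attention, 'f' feedforward
    uid: int = field(default_factory=lambda: next(_ids))


def build_custom_order(layer_types):
    """Custom ordering: one fresh layer per code, in exactly the order given."""
    unknown = set(layer_types) - {"a", "c", "f"}
    if unknown:
        raise ValueError(f"unknown layer codes: {sorted(unknown)}")
    return [Layer(kind) for kind in layer_types]


def build_weight_tied(block, repeats):
    """Weight tying: build the block once, then reuse the same objects each repeat."""
    shared = build_custom_order(block)
    return shared * repeats  # list repetition copies references, not layers


custom = build_custom_order(("a", "a", "f", "a", "f"))
tied = build_weight_tied(("a", "f"), repeats=3)

print([layer.kind for layer in custom])   # order follows the tuple
print(len({id(layer) for layer in tied}))  # distinct objects in the tied stack
```

In the custom-order stack every position has its own parameters; in the tied stack the six positions are backed by only two layer objects, which is what sharing weights across depth amounts to.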