Comments (3)
Hello,
Thank you!
Yes, the discretization used here is the same as the one in the official Mamba implementation, and it differs from what the paper describes: the expression from the paper is replaced by a simpler one (a first-order approximation).
More details can be found in this issue and this one.
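To make the difference concrete, here is a minimal sketch for a single scalar state (the diagonal case, where everything is elementwise). It assumes the paper's zero-order hold formula for the discretized B and the simplified `delta * B` form discussed above. The function names are hypothetical and not taken from either codebase.

```python
import math

def discretize_zoh(delta, a, b):
    """Zero-order hold for a scalar state, as written in the paper.

    A_bar = exp(delta * a)
    B_bar = (delta * a)^-1 * (exp(delta * a) - 1) * (delta * b)
    Assumes a != 0 (in Mamba, A is kept negative).
    """
    a_bar = math.exp(delta * a)
    # expm1(x) = exp(x) - 1, computed accurately for small x.
    b_bar = math.expm1(delta * a) / a * b
    return a_bar, b_bar

def discretize_simplified(delta, a, b):
    """Simplified form: A_bar is unchanged, B_bar is a first-order approximation."""
    a_bar = math.exp(delta * a)
    b_bar = delta * b
    return a_bar, b_bar

if __name__ == "__main__":
    # Illustrative values only: the two B_bar values agree for small delta
    # and drift apart as delta * |a| grows.
    for delta in (0.001, 0.01, 0.1, 1.0):
        _, b_zoh = discretize_zoh(delta, a=-1.0, b=2.0)
        _, b_simple = discretize_simplified(delta, a=-1.0, b=2.0)
        print(f"delta={delta}: zoh={b_zoh:.6f} simplified={b_simple:.6f}")
```

Since `exp(x) - 1 ≈ x` for small `x`, the zero-order hold expression reduces to `delta * b` at first order, which is why the two versions behave similarly when `delta * a` is small.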
As for getting rid of the discretization step, it is possible, but I guess the authors wanted to stick to what is traditional in the SSM setup, so they kept it. That way they could, for example, reuse the same initialization used in other SSMs.
Hope this helps
from mamba.py.
Hi @alxndrTL ,
Thanks a lot. That's very helpful. I think the authors should clarify this somewhere, since some people seem to be confused by it.
from mamba.py.
Hi @alxndrTL ,
Thanks a lot. That's very helpful. I think the authors should clarify this somewhere, since some people seem to be confused by it.
I was confused by this as well. Thanks for clarifying.
from mamba.py.