muditbhargava66 / pyxlstm

Efficient Python library for Extended LSTM with exponential gating, memory mixing, and matrix memory for superior sequence modeling.

Home Page: https://pyxlstm.readthedocs.io/

License: MIT License

Python 100.00%
language-modeling lstm sequence-modeling xlstm

pyxlstm's Introduction

Hi there, I'm Mudit Bhargava! 👋

Connect with Me

LinkedIn | Twitter | Personal Website | Email

Feel free to explore my repositories and contributions below!

🔬 My Expertise

(Originally a Venn-style ASCII diagram covering:)

- Computer Architecture & Hardware Design
- High-Performance Computing & Optimization
- Communication Systems & Protocols
- Machine Learning & AI
🌞 Morning     0 tasks    0%
🌆 Daytime    20 tasks   20%
🌃 Evening    40 tasks   40%
🌙 Night      40 tasks   40%

pyxlstm's People

Contributors

alifa98, muditbhargava66


pyxlstm's Issues

Fix sLSTM and mLSTM to work with 1 layer

sLSTM and mLSTM create one fewer dropout layer than LSTM layers:

self.dropout_layers = nn.ModuleList([nn.Dropout(dropout) for _ in range(num_layers - 1)])

So if you try to run a model with only one layer, no LSTM layer is applied to the input at all, because `zip` truncates to its shortest argument in this loop:

for i, (lstm, dropout, f_gate, i_gate) in enumerate(zip(self.lstms, self.dropout_layers, self.exp_forget_gates, self.exp_input_gates))
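The truncation is reproducible in plain Python: `zip` stops at its shortest argument, so with `num_layers = 1` the loop body never runs. A sketch of one possible fix (an assumption, not the maintainer's actual patch) using `itertools.zip_longest`:

```python
from itertools import zip_longest

num_layers = 1
lstms = [f"lstm{i}" for i in range(num_layers)]         # stand-ins for nn.ModuleList entries
dropouts = [f"drop{i}" for i in range(num_layers - 1)]  # one fewer, as in the snippet above

# Buggy pattern: zip truncates to the shortest sequence, so with one
# layer the single LSTM is silently skipped.
skipped = list(zip(lstms, dropouts))        # empty list

# Sketch of a fix: pad the shorter list and apply dropout only when present.
applied = []
for lstm, dropout in zip_longest(lstms, dropouts):
    out = lstm                              # placeholder for lstm(x, state)
    if dropout is not None:
        out = f"{dropout}({out})"           # placeholder for dropout(out)
    applied.append(out)
```

The same `zip_longest` pattern works unchanged for `num_layers > 1`, where only the last layer gets no dropout.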

unit tests as per README fail

The unit tests as per README.md fail with this installation procedure:

$ cat environment.yml 
name: xlstm
channels:
  - pytorch
  - nvidia
  - conda-forge
  - defaults
dependencies:
  - python
  - pip
$ mamba env create -f environment.yml
$ pip install -r requirements.txt

On Ubuntu 22.04.4 LTS
With NVIDIA Server Driver metapackage from nvidia-driver-535-server (proprietary)

Then, as per this closed issue regarding setup.py:

$ pip install .
Successfully installed PyxLSTM-1.0.1
$ python -m unittest discover tests
EEEEEEEEE
======================================================================
ERROR: test_backward_pass (test_block.TestXLSTMBlock.test_backward_pass)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jabowery/devel/xlstm/tests/test_block.py", line 37, in test_backward_pass
    output_seq, _ = xlstm_block(input_seq)
                    ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jabowery/mambaforge/envs/xlstm/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jabowery/mambaforge/envs/xlstm/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jabowery/devel/xlstm/xLSTM/block.py", line 55, in forward
    lstm_output, hidden_state = self.lstm(input_seq, hidden_state)
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jabowery/mambaforge/envs/xlstm/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jabowery/mambaforge/envs/xlstm/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jabowery/devel/xlstm/xLSTM/slstm.py", line 53, in forward
    if i < self.num_layers - 1:
       ^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Boolean value of Tensor with more than one value is ambiguous

======================================================================
ERROR: test_forward_pass_mlstm (test_block.TestXLSTMBlock.test_forward_pass_mlstm)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jabowery/devel/xlstm/tests/test_block.py", line 27, in test_forward_pass_mlstm
    output_seq, hidden_state = xlstm_block(input_seq)
                               ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jabowery/mambaforge/envs/xlstm/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jabowery/mambaforge/envs/xlstm/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jabowery/devel/xlstm/xLSTM/block.py", line 55, in forward
    lstm_output, hidden_state = self.lstm(input_seq, hidden_state)
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jabowery/mambaforge/envs/xlstm/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jabowery/mambaforge/envs/xlstm/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jabowery/devel/xlstm/xLSTM/mlstm.py", line 68, in forward
    C_t = f * C + i * torch.matmul(values.unsqueeze(2), keys.unsqueeze(1)).squeeze(1)
          ~~^~~
RuntimeError: The size of tensor a (4) must match the size of tensor b (64) at non-singleton dimension 1

======================================================================
[The remaining seven errors repeat the same two exceptions across test_block, test_mlstm, test_model, and test_slstm:

RuntimeError: Boolean value of Tensor with more than one value is ambiguous
  (raised at xLSTM/slstm.py line 53 in the sLSTM and model tests)

RuntimeError: The size of tensor a (4) must match the size of tensor b (64) at non-singleton dimension 1
  (raised at xLSTM/mlstm.py line 68 in the mLSTM tests)]

----------------------------------------------------------------------
Ran 9 tests in 0.044s

FAILED (errors=9)

RuntimeError: Boolean value of Tensor with more than one value is ambiguous in xLSTM/slstm.py


Issue Description

I'm encountering a RuntimeError when attempting to execute a backward pass in my xLSTM model. The error message is:

RuntimeError: Boolean value of Tensor with more than one value is ambiguous

This issue occurs in the slstm.py module, specifically at line 53 during a forward pass. Here's the relevant code snippet:

if i < self.num_layers - 1:
    # Further operations

It seems the variable i, which is expected to be an integer index, is somehow being interpreted as a tensor, leading to an ambiguous boolean condition in the if statement. This happens during the operation lstm_output, hidden_state = self.lstm(input_seq, hidden_state), suggesting that there might be an issue with how the LSTM output or hidden state is handled or initialized.

Request for Help
Could someone help clarify what might be going wrong here or suggest modifications to avoid this issue? Any guidance would be greatly appreciated.

Thank you!
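One plausible cause (inferred from the traceback, not verified against the source) is name shadowing: if the input-gate activation inside the layer loop is assigned to `i`, it clobbers the integer layer index, and `i < self.num_layers - 1` then compares a tensor to an int. A pure-Python sketch of the pattern and the rename that fixes it:

```python
def forward_buggy(num_layers):
    """Reproduces the shadowing pattern; a list stands in for a gate tensor."""
    results = []
    for i in range(num_layers):
        i = [0.4, 0.7]                     # BUG: gate activation clobbers the index
        try:
            if i < num_layers - 1:         # sequence < int -> TypeError here;
                results.append("dropout")  # with a torch.Tensor this is the
        except TypeError as exc:           # "Boolean value ... ambiguous" error
            results.append(str(exc))
    return results

def forward_fixed(num_layers):
    """Give the gate its own name so the index comparison stays well-defined."""
    results = []
    for layer_idx in range(num_layers):
        i_t = [0.4, 0.7]                   # gate activation under a distinct name
        if layer_idx < num_layers - 1:
            results.append("dropout")
    return results
```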

ModuleList

for gate in self.exp_forget_gates + self.exp_input_gates:
TypeError: unsupported operand type(s) for +: 'ModuleList' and 'ModuleList'
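`nn.ModuleList` does not implement `+`, but it is iterable, so the two gate lists can be walked with `itertools.chain` (or concatenated after converting each to a plain `list`). A sketch with list stand-ins for the modules:

```python
from itertools import chain

exp_forget_gates = ["f_gate0", "f_gate1"]  # stand-ins for nn.ModuleList entries
exp_input_gates = ["i_gate0", "i_gate1"]

# Instead of `self.exp_forget_gates + self.exp_input_gates` (TypeError):
gates = [gate for gate in chain(exp_forget_gates, exp_input_gates)]
```

With real modules, `list(self.exp_forget_gates) + list(self.exp_input_gates)` is an equivalent eager form.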

Hidden layer state output error found when utilizing mlstm

I was running xLSTM_shape_verification.py with lstm_type changed to mlstm and found that printing the hidden-state shapes fails:

AttributeError: 'tuple' object has no attribute 'shape'
Output sequence shape: torch.Size([4, 10, 10000])
Hidden states shapes:

Also, when running language_modeling.py with an xLSTM composed of mLSTM blocks, the loss is almost always 0.

Possibly incorrect code

C = torch.zeros(batch_size, self.hidden_size, self.hidden_size, device=lstm.weight_ih.device)

This initialization of C causes a shape mismatch in:

C_t = f * C + i * torch.matmul(values.unsqueeze(2), keys.unsqueeze(1)).squeeze(1)

RuntimeError: The size of tensor a (2534) must match the size of tensor b (256) at non-singleton dimension 1

Can you please check it?
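For reference, the mLSTM matrix-memory update in the xLSTM paper accumulates an outer product, which constrains the shapes (this is a reading of the error message, not a verified diagnosis of this code):

```latex
C_t = f_t \odot C_{t-1} + i_t \odot \left( v_t \, k_t^{\top} \right),
\qquad C_t \in \mathbb{R}^{d \times d}, \quad v_t, k_t \in \mathbb{R}^{d}
```

If `C` is allocated as `(batch, hidden_size, hidden_size)` while `f` and `i` are produced with a different second dimension, the elementwise products cannot broadcast, which matches the reported size mismatch at dimension 1.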

A bug in mLSTM.py

Hi,
I appreciate that you provided a good repo for my research.

But there is a bug in xLSTM/mLSTM.py: the variable `i` defined at line 59 conflicts with the one at line 65.

Best regards

NaNs in testing

Issue

Getting NaNs during backpropagation.

Trace

this was the call1
this was the call12
Batch loss was for batch number {batch_idx}: tensor(9.6656, device='cuda:0', grad_fn=)
this was the call13
NaN detected in gradient of embedding.weight
NaN detected in parameter embedding.weight after update
this was the call1
this was the call12
NaN detected in model output
Epoch 1/1, Average Loss: 0.0863
Training completed! Total time: 2.32 seconds

Code

    print("this was the call1")
    if check_nan(input_seq, "input_seq"):
        break
    
    output, _ = model(input_seq)
    
    print("this was the call12")
    if check_nan(output, "model output"):
        break
    
    output = output.contiguous().view(-1, len(loader_object.vocab))
    target_seq = target_seq.contiguous().view(-1)
    
    loss = criterion(output, target_seq)
    print(f"Batch loss was for batch number {batch_idx}: ", loss)

    
    print("this was the call13")
    if check_nan(loss, "loss"):
        break

Dataset Used

gwlms/dewiki-20230701-flair-corpus

Possible things to look at

Note: I am using the language_model.py code provided in the repo with the only change being the dataset I am using.

  1. I suspect I might be dealing with exploding gradients. The error "NaN detected in gradient of embedding.weight" is a big clue here. I'm thinking my gradients are probably getting too large during backpropagation.
  2. I'm a bit concerned about my loss function. That loss value of 9.6656 seems pretty high to me (the starting loss with the random dataset you provided was around 7).
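If exploding gradients are the cause, the standard mitigation is to clip the global gradient norm before each optimizer step (PyTorch ships this as `torch.nn.utils.clip_grad_norm_`). The rule it applies can be sketched framework-free:

```python
import math

def clip_grad_norm(grads, max_norm):
    """Scale gradients so their global L2 norm is at most max_norm
    (the same rescaling torch.nn.utils.clip_grad_norm_ performs in place)."""
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm > max_norm:
        scale = max_norm / total_norm
        grads = [g * scale for g in grads]
    return grads

clipped = clip_grad_norm([3.0, 4.0], max_norm=1.0)  # norm 5.0 scaled down to 1.0
```

In the training loop this would sit between `loss.backward()` and `optimizer.step()`.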

Purpose of Ticket

Ruling out the possibility that there is anything wrong with the way xLSTM has been implemented; additionally, I would love to collaborate on getting your implementation to work with a large dataset.

Is pip correct?

Hi,

Sorry, but did you publish the package on PyPI? It seems no such distribution exists 😞

I wanted to try this but:

ERROR: Could not find a version that satisfies the requirement PyxLSTM (from versions: none)
ERROR: No matching distribution found for PyxLSTM

Problem with input gates in sLSTM

In the sLSTM implementation, the cell state $c_t$ is updated like this:

c = f * c + i * lstm.weight_hh.new_zeros(batch_size, self.hidden_size)

Because of `.new_zeros()`, the input gate has no effect: it always multiplies a tensor of zeros.
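Numerically the complaint is straightforward: the `.new_zeros(...)` factor makes the input-gate term vanish, so the update degenerates to pure forgetting. A scalar sketch of the intended update $c_t = f \cdot c + i \cdot z$ (with `z` a hypothetical candidate-input activation, a name assumed here):

```python
f, i, c_prev, z = 0.9, 0.5, 1.0, 0.3  # stand-in scalars for one cell

buggy = f * c_prev + i * 0.0  # the new_zeros term: the input gate is multiplied away
fixed = f * c_prev + i * z    # candidate input actually modulated by the gate
```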

RuntimeError: input has inconsistent input_size: got 8 expected 16

I was testing the module using this code:

from xLSTM.model import xLSTM
import torch
model = xLSTM(5, 8, 16, 5, 2, 0.1, True, 'slstm')
inputs = torch.randint(low=0, high=5,size=(12,15000))
outputs = model(inputs)

When I set the number of heads > 1, I get this error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/bkffadia/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/bkffadia/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/d/pyxLSTM/xLSTM/model.py", line 31, in forward
    output_seq, hidden_state = block(output_seq, hidden_states[i])
  File "/home/bkffadia/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/bkffadia/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/d/pyxLSTM/xLSTM/block.py", line 55, in forward
    lstm_output, hidden_state = self.lstm(input_seq, hidden_state)
  File "/home/bkffadia/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/bkffadia/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/d/pyxLSTM/xLSTM/slstm.py", line 47, in forward
    h, c = lstm(x, (hidden_state[a][0], hidden_state[a][1]))
  File "/home/bkffadia/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/bkffadia/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/bkffadia/.local/lib/python3.10/site-packages/torch/nn/modules/rnn.py", line 1347, in forward
    ret = _VF.lstm_cell(
RuntimeError: input has inconsistent input_size: got 8 expected 16

Unable to find the get_device implementation

The example given in the README.md file had an import statement that said:

from xLSTM.utils import load_config, set_seed, get_device

I am unable to find the implementation for the get_device function, could you please clarify this?
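For what it's worth, a conventional implementation of such a helper (an assumption: this is the usual PyTorch idiom, not the missing code from the repo) would be:

```python
import torch

def get_device() -> torch.device:
    """Prefer CUDA when available, otherwise fall back to CPU."""
    return torch.device("cuda" if torch.cuda.is_available() else "cpu")

device = get_device()
```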
