
asformer's Issues

Long training time

Hello,

I am adapting your code for my own dataset, which usually trains relatively fast when using only ASRF, but with your transformer-based model training takes approximately 10x longer. Do you observe similar behaviour on the 50salads/Breakfast/GTEA datasets?

Thank you :)

Increasing the batch size hurts the results

Hi,
Thank you for your work.
When I try to increase the batch size, the metrics drop a lot. What do you think could be the reasons?

The GPU is an A100 40 GB.

Training with the default settings, split 1 only:

(s1)[83.40807175 81.16591928 72.19730942] 75.934108 83.2241

After only changing the batch size to 8 and the learning rate to 0.001:

(s1)[68.94977169 67.57990868 55.25114155] 63.931922 72.0049
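Not from the repo, but one widely used heuristic when multiplying the batch size by k is to also multiply the learning rate by k (the linear scaling rule); whether that recovers the single-sample numbers here is untested. A sketch with a hypothetical helper:

```python
def scaled_lr(base_lr, base_batch_size, new_batch_size):
    # Linear scaling rule: scale the learning rate by the same factor as the
    # batch size. A heuristic starting point, not the repo's tuned setting.
    return base_lr * new_batch_size / base_batch_size
```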

Feature Extraction

Hi, can you provide more information about the feature extraction? I would like to use this fantastic model on my own dataset, but I don't know how to extract the features to feed to the encoder.

Environment issues

I installed the environment as specified: PyTorch == 1.1.0, torchvision == 0.3.0, Python == 3.6, CUDA == 10.1.

It is certain that the model is loaded because the model size is printed:
Model Size: 1130860

But the problem is:
Traceback (most recent call last):
File "main.py", line 99, in
trainer.predict(model_dir, results_dir, features_path, batch_gen_tst, num_epochs, actions_dict, sample_rate)
File "/home/cpslabrtx3090/zjb/projects/ASFormer/model.py", line 399, in predict
self.model.load_state_dict(torch.load(model_dir + "/epoch-" + str(epoch) + ".model"))
File "/home/cpslabrtx3090/anaconda3/envs/ASFormer/lib/python3.6/site-packages/torch/serialization.py", line 387, in load
return _load(f, map_location, pickle_module, **pickle_load_args)
File "/home/cpslabrtx3090/anaconda3/envs/ASFormer/lib/python3.6/site-packages/torch/serialization.py", line 560, in _load
raise RuntimeError("{} is a zip archive (did you mean to use torch.jit.load()?)".format(f.name))
RuntimeError: ./models/gtea/split_1/epoch-120.model is a zip archive (did you mean to use torch.jit.load()?)

Traceback (most recent call last):
File "/home/cpslabrtx3090/anaconda3/envs/ASFormer/lib/python3.6/tarfile.py", line 189, in nti
n = int(s.strip() or "0", 8)
ValueError: invalid literal for int() with base 8: 'ld_tenso'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/cpslabrtx3090/anaconda3/envs/ASFormer/lib/python3.6/tarfile.py", line 2299, in next
tarinfo = self.tarinfo.fromtarfile(self)
File "/home/cpslabrtx3090/anaconda3/envs/ASFormer/lib/python3.6/tarfile.py", line 1093, in fromtarfile
obj = cls.frombuf(buf, tarfile.encoding, tarfile.errors)
File "/home/cpslabrtx3090/anaconda3/envs/ASFormer/lib/python3.6/tarfile.py", line 1035, in frombuf
chksum = nti(buf[148:156])
File "/home/cpslabrtx3090/anaconda3/envs/ASFormer/lib/python3.6/tarfile.py", line 191, in nti
raise InvalidHeaderError("invalid header")
tarfile.InvalidHeaderError: invalid header

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/cpslabrtx3090/anaconda3/envs/ASFormer/lib/python3.6/site-packages/torch/serialization.py", line 556, in _load
return legacy_load(f)
File "/home/cpslabrtx3090/anaconda3/envs/ASFormer/lib/python3.6/site-packages/torch/serialization.py", line 467, in legacy_load
with closing(tarfile.open(fileobj=f, mode='r:', format=tarfile.PAX_FORMAT)) as tar,
File "/home/cpslabrtx3090/anaconda3/envs/ASFormer/lib/python3.6/tarfile.py", line 1591, in open
return func(name, filemode, fileobj, **kwargs)
File "/home/cpslabrtx3090/anaconda3/envs/ASFormer/lib/python3.6/tarfile.py", line 1621, in taropen
return cls(name, mode, fileobj, **kwargs)
File "/home/cpslabrtx3090/anaconda3/envs/ASFormer/lib/python3.6/tarfile.py", line 1484, in init
self.firstmember = self.next()
File "/home/cpslabrtx3090/anaconda3/envs/ASFormer/lib/python3.6/tarfile.py", line 2311, in next
raise ReadError(str(e))
tarfile.ReadError: invalid header
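For readers hitting the same error: checkpoints saved by PyTorch >= 1.6 use a zip-based container that PyTorch 1.1.0 cannot read. Two common fixes are upgrading PyTorch, or loading the checkpoint once in a >= 1.6 environment and re-saving it with `torch.save(state, path, _use_new_zipfile_serialization=False)`. A standalone sketch of how the two formats can be told apart (no torch required):

```python
import io
import zipfile

def is_new_format_checkpoint(fileobj):
    # PyTorch >= 1.6 checkpoints are zip archives; the legacy format is not,
    # so checking the zip magic bytes distinguishes them.
    return zipfile.is_zipfile(fileobj)

# In-memory stand-ins for the two checkpoint flavours:
new_style = io.BytesIO()
with zipfile.ZipFile(new_style, "w") as zf:
    zf.writestr("archive/data.pkl", b"\x80\x02")
new_style.seek(0)

legacy_style = io.BytesIO(b"not a zip archive")
```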

about the randomness of the code

Hi, thank you for your code.

if (epoch + 1) % 10 == 0 and batch_gen_tst is not None:

When I change the test interval from 10 to 20, 30, etc., I obtain different results (such as the training loss) under the same seed. What do you think is the reason?
Best regards
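One plausible explanation (an assumption about the cause, not confirmed from the repo): if the test pass draws from the same global RNG as training, then changing the test interval shifts every subsequent random draw even under a fixed seed. A minimal standalone illustration:

```python
import random

def train_step(rng):
    # Stand-in for one training epoch's random consumption (e.g. shuffling).
    return rng.random()

def eval_step(rng):
    # Stand-in for a test pass that also draws from the shared RNG.
    rng.random()

def run(seed, num_epochs, test_interval):
    rng = random.Random(seed)
    draws = []
    for epoch in range(num_epochs):
        draws.append(train_step(rng))
        if (epoch + 1) % test_interval == 0:
            eval_step(rng)
    return draws
```

With the same seed but different test intervals, the training draws are identical up to the first interleaved evaluation and diverge afterwards.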

Question about the attention implementation

Hello, is the hierarchical attention you mention actually band attention (as shown in the attached figure), except that the window size grows exponentially as the layers get deeper? If so, shouldn't the body of the for loop in this function in model.py be changed to window_mask[:, i, i:i+self.bl] = 1?

def construct_window_mask(self):
    window_mask = torch.zeros((1, self.bl, self.bl + 2 * (self.bl // 2)))
    for i in range(self.bl):
        window_mask[:, :, i:i+self.bl] = 1
    return window_mask.to(device)
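To make the two readings concrete, here is a standalone list-based sketch of both loop bodies (plain Python, not the repo's tensor code); which one is intended is exactly the question above:

```python
def mask_all_rows(bl):
    # Mirrors the repo's loop body: window_mask[:, :, i:i+bl] = 1.
    # After all iterations, every row ends up attending to columns
    # 0 .. (2*bl - 2), so the result is not a band mask.
    width = bl + 2 * (bl // 2)
    mask = [[0] * width for _ in range(bl)]
    for i in range(bl):
        for row in mask:
            for j in range(i, i + bl):
                row[j] = 1
    return mask

def mask_band(bl):
    # The variant suggested in this issue: window_mask[:, i, i:i+bl] = 1,
    # i.e. row i attends only to its own sliding window (a band mask).
    width = bl + 2 * (bl // 2)
    mask = [[0] * width for _ in range(bl)]
    for i in range(bl):
        for j in range(i, i + bl):
            mask[i][j] = 1
    return mask
```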

[attached figure: band attention illustration]

Issue while trying to run the pretrained models

I tried to run the pretrained models, but I keep getting the following error:

(myenv) E:\ASN\ASFormer>python main.py --action=predict --dataset=50salads --split=1
Model Size: 1134476
Traceback (most recent call last):
File "C:\Users\talks\miniconda3\envs\myenv\lib\tarfile.py", line 189, in nti
n = int(s.strip() or "0", 8)
ValueError: invalid literal for int() with base 8: 'ld_tenso'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "C:\Users\talks\miniconda3\envs\myenv\lib\tarfile.py", line 2297, in next
tarinfo = self.tarinfo.fromtarfile(self)
File "C:\Users\talks\miniconda3\envs\myenv\lib\tarfile.py", line 1093, in fromtarfile
obj = cls.frombuf(buf, tarfile.encoding, tarfile.errors)
File "C:\Users\talks\miniconda3\envs\myenv\lib\tarfile.py", line 1035, in frombuf
chksum = nti(buf[148:156])
File "C:\Users\talks\miniconda3\envs\myenv\lib\tarfile.py", line 191, in nti
raise InvalidHeaderError("invalid header")
tarfile.InvalidHeaderError: invalid header

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "C:\Users\talks\miniconda3\envs\myenv\lib\site-packages\torch\serialization.py", line 556, in _load
return legacy_load(f)
File "C:\Users\talks\miniconda3\envs\myenv\lib\site-packages\torch\serialization.py", line 467, in legacy_load
with closing(tarfile.open(fileobj=f, mode='r:', format=tarfile.PAX_FORMAT)) as tar,
File "C:\Users\talks\miniconda3\envs\myenv\lib\tarfile.py", line 1589, in open
return func(name, filemode, fileobj, **kwargs)
File "C:\Users\talks\miniconda3\envs\myenv\lib\tarfile.py", line 1619, in taropen
return cls(name, mode, fileobj, **kwargs)
File "C:\Users\talks\miniconda3\envs\myenv\lib\tarfile.py", line 1482, in init
self.firstmember = self.next()
File "C:\Users\talks\miniconda3\envs\myenv\lib\tarfile.py", line 2309, in next
raise ReadError(str(e))
tarfile.ReadError: invalid header

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "main.py", line 97, in
trainer.predict(model_dir, results_dir, features_path, batch_gen_tst, num_epochs, actions_dict, sample_rate)
File "E:\ASN\ASFormer\model.py", line 399, in predict
self.model.load_state_dict(torch.load(model_dir + "/epoch-" + str(epoch) + ".model"))
File "C:\Users\talks\miniconda3\envs\myenv\lib\site-packages\torch\serialization.py", line 387, in load
return _load(f, map_location, pickle_module, pickle_load_args)
File "C:\Users\talks\miniconda3\envs\myenv\lib\site-packages\torch\serialization.py", line 560, in _load
raise RuntimeError("{} is a zip archive (did you mean to use torch.jit.load()?)".format(f.name))
RuntimeError: ./models/50salads/split_1/epoch-120.model is a zip archive (did you mean to use torch.jit.load()?)

I tried changing torch.load to torch.jit.load, but then I get another error saying that the PyTorch version is too old to run this. I am using Python 3.6.10, PyTorch 1.1.0, and torchvision 0.3.0, and for now I am just trying to run on CPU, not GPU. I would appreciate your assistance with this matter. Thank you.

The provided models generate lower scores than the paper reported

Thanks for your nice work. May I confirm one thing? Using your features and pre-trained models (epoch = 120), the scores I obtain are lower than those in your BMVC paper for all three datasets. For instance, the edit score and F1@10 on GTEA only reach 84.0 and 88.9, lower than the 84.6 and 90.1 in your paper. The same holds for the other two datasets:
50salads edit = 75.7, F1@10 = 83.4.

Error in evaluation code

Hi,

Thanks for sharing the code. I noticed that bg_class in the evaluation code is not set properly.

The default background class name is "background", which is correct for GTEA but needs to be changed to "SIL" for Breakfast and to "action_start"/"action_end" for 50salads. It seems these were not changed for the results in the paper.

With the correct class names and the released models, I obtained lower results:

F1@10 F1@25 F1@50
Breakfast 70.9 67.5 56.7
50salads 83.7 81.8 73.7
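For anyone reproducing these numbers: segment-level metrics (F1@k and the edit score) first collapse frame-wise labels into segments and drop background segments, so the background label name must match each dataset. A hypothetical sketch of that preprocessing step (not the repo's exact evaluation code):

```python
def get_segments(labels, bg_classes=("background",)):
    # Collapse a frame-wise label sequence into (label, start, end) segments,
    # skipping any segment whose label is a background class. For Breakfast
    # the background label is "SIL"; for 50salads it is "action_start" and
    # "action_end"; for GTEA it is "background".
    segments = []
    start = 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            if labels[start] not in bg_classes:
                segments.append((labels[start], start, i))
            start = i
    return segments
```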

How to interpret the per-stage images in the results

Hello, thank you for sharing your amazing work!

I have a question when analysing the results.

For images generated like the one below:

[attached result image]

What does each row mean? And what do stages 0, 1, 2, 3 mean?

Also, since the method evaluates each frame's action label, you might compare the model with action recognition models too. Is there a specific reason for not comparing against action recognition results?

Thank you in advance!

How to extract attention weights

Hi, I was wondering how to extract the self-attention weights, as you did in Figure 2. I am interested in the hierarchical case.
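One generic approach (not specific to this repo) is to have the attention function return its post-softmax weights alongside the output, then log or plot them per layer; in PyTorch the same effect can be achieved with a forward hook on the attention module. A framework-free sketch:

```python
import math

def attention_with_weights(q, k, v):
    # Scaled dot-product attention that also returns the softmax weights so
    # they can be visualised as an attention map. Generic sketch, not the
    # repo's API; q, k, v are lists of equal-length float vectors.
    d = len(q[0])
    weights = []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        m = max(scores)                      # subtract max for stability
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights.append([e / z for e in exps])
    out = [
        [sum(w_j * v_j[t] for w_j, v_j in zip(w, v)) for t in range(len(v[0]))]
        for w in weights
    ]
    return out, weights
```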

Cannot download the model

Hi!
Firstly, thanks for sharing this repo! I'm struggling to download the model (step 3: download the pre-trained models at https://pan.baidu.com/s/1zf-d-7eYqK-IxroBKTxDfg). The site says you need to create an account to download the file, but I cannot create an account with a French phone number 😅 Is there another way to download the pretrained models?
Many thanks!

pre-extracted feature

Hello, I have recently been working on a similar task; I found your paper very interesting and plan to build on it. Could you clarify whether the pre-extracted features mentioned in the paper are the feature maps extracted from each frame? Could you elaborate? Thanks!

results on 50salads do not match Table 5

Hi, thanks for your work.
I was able to train and test the model and achieve performance similar to the paper when using both the encoder and the decoder. However, when I don't use the decoder, the results are much worse than those in the first row of Table 5.
Do I need to change any settings to reproduce that performance (especially the accuracy)?
I noticed that without the decoder, the accuracy drops below 80.

cross-self attention

Hello, in the decoder code you provide, V comes from the features of the previous decoder (or the encoder), while Q and K use x1, the output of the previous layer. Isn't this different from what the paper describes?

FLOPs / GPU memory code

Hello, and thank you for this series of contributions. Could you share the code you used to compute the params, FLOPs, and GPU memory reported in Table 9 of the paper? I don't know how to measure the FLOPs. Thanks!
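Not the authors' script, but for reference: the parameter count is usually obtained with `sum(p.numel() for p in model.parameters())`, peak GPU memory with `torch.cuda.max_memory_allocated()`, and FLOPs with a third-party profiler. A framework-free sketch of the parameter count:

```python
from functools import reduce

def count_params(shapes):
    # Total parameter count from a list of weight-tensor shapes; with
    # PyTorch the equivalent is sum(p.numel() for p in model.parameters()).
    return sum(reduce(lambda a, b: a * b, shape, 1) for shape in shapes)
```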

Batch size constraint

Hello,

Thank you for your amazing work!

I was wondering if there is any particular reason for imposing a batch size of 1 in model.py:

assert m_batchsize == 1 # currently, we only accept input with batch size 1

In my testing, ASFormer trains fine with larger batch sizes.
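My reading is that the assert exists because videos have different lengths; batching them would require padding to a common length and masking the padded frames out of the attention and the loss. A minimal sketch, assuming a list-of-sequences input (`pad_batch` is a hypothetical helper, not the repo's batch generator):

```python
def pad_batch(seqs, pad_value=0.0):
    # Pad variable-length sequences to a common length and build a frame
    # mask (1 = real frame, 0 = padding) that downstream attention and
    # loss computations would need to respect.
    max_len = max(len(s) for s in seqs)
    padded, mask = [], []
    for s in seqs:
        padded.append(list(s) + [pad_value] * (max_len - len(s)))
        mask.append([1] * len(s) + [0] * (max_len - len(s)))
    return padded, mask
```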
