I want to congratulate you for the great patch transformer paper. I

Long run time? about patchtst HOT 11 CLOSED

yuqinie98 commented on August 23, 2024

Long run time?

from patchtst.

Comments (11)

ikvision commented on August 23, 2024

To be clear, I didn't write this code or the paper. Like you, I am just using it.
In the open source community it is not always easy to understand each other.
I would suggest being kinder in order to get assistance.

ikvision commented on August 23, 2024

For a single step into the future, would this help?
parser.add_argument('--target_points', type=int, default=1, help='forecast horizon')

The current masking in the code is random

PatchTST/PatchTST_self_supervised/src/callback/patch_mask.py

Line 113 in e66adfd

 noise = torch.rand(bs, L, nvars,device=xb.device) # noise in [0, 1], bs x L x nvars 

Do you have a causal mask PyTorch implementation you are considering?
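
To illustrate what "random" masking means here, the following is a minimal sketch in plain Python (not the repository's code). It draws one noise value per patch, in the spirit of the `torch.rand` line quoted above, and masks the patches with the largest noise. The function name `random_patch_mask` and its parameters are hypothetical and chosen only for this example.

```python
import random

def random_patch_mask(num_patches, mask_ratio, seed=None):
    """Return a list of booleans, True where a patch is masked.

    Illustrative only: one noise value per patch; the patches with the
    smallest noise stay visible and the rest are masked.
    """
    rng = random.Random(seed)
    noise = [rng.random() for _ in range(num_patches)]  # noise in [0, 1)
    num_keep = num_patches - int(num_patches * mask_ratio)
    # Patch indices sorted by noise; the first num_keep stay visible.
    order = sorted(range(num_patches), key=lambda i: noise[i])
    keep = set(order[:num_keep])
    return [i not in keep for i in range(num_patches)]

mask = random_patch_mask(num_patches=8, mask_ratio=0.5, seed=0)
print(sum(mask), "of", len(mask), "patches masked")
```

Because the masked positions are drawn at random, a patch can be hidden while later patches remain visible, which is the difference from a causal mask.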

Eliav2479 commented on August 23, 2024

This does not address my question.
I was talking about run time issues

ikvision commented on August 23, 2024

Training time can be reduced in many different ways: multi-GPU training, a larger batch size, a faster data loader...
Why do you think that the causal mask is your main bottleneck?

Eliav2479 commented on August 23, 2024

Please read the question

Eliav2479 commented on August 23, 2024

When you have a window size of H and a causal mask you can predict H tokens in a single pass.
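
A minimal sketch of this idea, in plain Python and under simplifying assumptions: `causal_average` is a hypothetical toy stand-in for a causal layer, not anything from PatchTST. Output i only sees positions 0..i, so a single pass over a window of H tokens yields H outputs, and during training each of them could be scored against the following token.

```python
def causal_mask(h):
    """allowed[i][j] is True when position i may attend to position j (j <= i)."""
    return [[j <= i for j in range(h)] for i in range(h)]

def causal_average(tokens):
    """Toy stand-in for a causal layer: output i is the mean of tokens[0..i]."""
    allowed = causal_mask(len(tokens))
    outputs = []
    for row in allowed:
        visible = [t for t, ok in zip(tokens, row) if ok]
        outputs.append(sum(visible) / len(visible))
    return outputs

window = [1.0, 2.0, 3.0, 4.0]
preds = causal_average(window)  # one pass, one output per position
print(preds)

# Changing a later token leaves the earlier outputs untouched.
print(causal_average([1.0, 2.0, 3.0, 100.0])[:3] == preds[:3])
```

In PyTorch the same mask is usually built as an upper-triangular boolean matrix, for example with `torch.triu`.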

ikvision commented on August 23, 2024

Indeed, the method is patch based, so it might not be the best fit for predicting a single data point.
You might want to use only the patch-based pre-training to create embeddings.
For the second stage (fine-tuning) you can have a very simple regression from the embedding that predicts a single time step (a 1-layer NN without patches).

Eliav2479 commented on August 23, 2024

I would suggest waiting for a response from the authors.
Thank you for replying.

yuqinie98 commented on August 23, 2024

Thanks for asking @Eliav2479, and sorry for the late reply. Unfortunately, we do not fully understand your question, so we would appreciate it if you could explain your concern in more detail. We basically agree with the solution that @ikvision proposed if you want to apply it to multiple-step prediction. Alternatively, you could do multiple-step forecasting directly (DMS rather than IMS, as described in this paper: https://arxiv.org/pdf/2205.13504.pdf). The input is X1,...,Xt and the output is Xt+1,...,Xt+T, which is produced in one pass.
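
The difference between the two strategies can be sketched in plain Python. This is a toy illustration, not PatchTST: `one_step_model` and the `weights` matrix are hypothetical placeholders for learned models. IMS (iterated multi-step) calls a one-step model T times and feeds each prediction back in, while DMS (direct multi-step) maps the input window to all T future values in a single pass.

```python
def one_step_model(history):
    """Hypothetical one-step forecaster: mean of the last two values."""
    return (history[-1] + history[-2]) / 2

def forecast_ims(history, horizon):
    """Iterated multi-step: call the one-step model `horizon` times,
    feeding each prediction back in as input."""
    window = list(history)
    preds = []
    for _ in range(horizon):
        nxt = one_step_model(window)
        preds.append(nxt)
        window.append(nxt)
    return preds

def forecast_dms(history, weights):
    """Direct multi-step: one linear map from the input window to all
    horizon steps; each row of `weights` produces one future value."""
    return [sum(w * x for w, x in zip(row, history)) for row in weights]

history = [1.0, 2.0, 3.0]
# Hypothetical weights for a horizon of 2 (in practice these are learned).
weights = [[0.0, 0.0, 1.0],
           [0.0, 0.5, 0.5]]
print(forecast_ims(history, horizon=2))  # two sequential model calls
print(forecast_dms(history, weights))    # one pass for both steps
```

With IMS, the run time grows with the number of sequential calls and errors can accumulate across steps; DMS avoids both by producing the whole horizon at once.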

DIKSHAAGARWAL2015 commented on August 23, 2024

Is there any estimate of how long it takes to run supervised and self-supervised learning with the default model and params?

yuqinie98 commented on August 23, 2024

It varies with the dataset, the number of epochs, the GPU, and so on, so it is hard to give a single answer. The fastest run may take half an hour, while the largest model takes a day. @DIKSHAAGARWAL2015
