
multivariate? about patchtst HOT 9 CLOSED

kashif avatar kashif commented on July 21, 2024 2
multivariate?

from patchtst.

Comments (9)

kashif avatar kashif commented on July 21, 2024 3

the information of one time series affects the prediction in the sense that you have a single model trained over all the time series... just like an image classification model is trained over a lot of independent images by giving the images in the batch dim:

https://github.com/yuqinie98/PatchTST/blob/main/PatchTST_supervised/layers/PatchTST_backbone.py#L164

At inference time the predictions are also made independently for each time series in PatchTST. That is, the emissions are independent too, but the learned weights used to make the predictions come from looking over all the time series:

https://github.com/yuqinie98/PatchTST/blob/main/PatchTST_supervised/layers/PatchTST_backbone.py#LL169

So again, the model will implicitly learn the relationships between the different time series (i.e. spatio-temporal relationships) since it's a shared (or global) model, but the outputs will be independent, and they are reshaped/concatenated as in diagram 1(a).
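A minimal sketch of the batching described above, written in NumPy rather than the repository's PyTorch code. The shapes, the single shared weight matrix `W` (a stand-in for the backbone), and the function name `channel_independent_forecast` are illustrative assumptions, not the PatchTST implementation. The point is only that folding the variates into the batch dimension makes each output channel a function of its own input channel, while the weights are shared across all of them.

```python
import numpy as np

rng = np.random.default_rng(0)

B, M, L, H = 4, 3, 16, 8      # batch, variates (channels), lookback, horizon
W = rng.normal(size=(L, H))   # one set of weights shared by every variate (assumed stand-in)

def channel_independent_forecast(x, W):
    """x: [B, M, L] -> forecasts: [B, M, H]."""
    b, m, l = x.shape
    flat = x.reshape(b * m, l)     # variates folded into the batch dimension
    out = flat @ W                 # the same weights are applied to every series
    return out.reshape(b, m, -1)   # reshaped back to the multivariate layout

x = rng.normal(size=(B, M, L))
y = channel_independent_forecast(x, W)

# Perturb only channel 0, then compare the forecasts of the other channels.
x_perturbed = x.copy()
x_perturbed[:, 0, :] += 10.0
y_perturbed = channel_independent_forecast(x_perturbed, W)

others_unchanged = np.allclose(y[:, 1:, :], y_perturbed[:, 1:, :])
channel0_changed = not np.allclose(y[:, 0, :], y_perturbed[:, 0, :])
print(others_unchanged, channel0_changed)
```

In this toy setup, changing one input channel leaves the forecasts of the other channels untouched, which is the sense in which the outputs are independent even though the weights were fitted on all series.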

I have also implemented a probabilistic patchTST here: awslabs/gluonts#2748 with a few features not in the original model.


lesego94 avatar lesego94 commented on July 21, 2024 1

Guys.. I am still trying to understand. Simple question: does the information of one time series affect the prediction of another time series when using PatchTST? Basically I'm looking for spatio-temporal capabilities, similar to space-time-former. @kashif?

Basically, what I'm working on is improving forecasts by including other time series that can provide additional information.


namctin avatar namctin commented on July 21, 2024

As mentioned in the paper, we provide another framework for multivariate time series forecasting/regression/classification where channels can go into the Transformer in parallel instead of being mixed. We use a vanilla Transformer to demonstrate, but that does not mean we need to stick with it. You can come up with any other architecture that can exploit the correlation structure in your data. We will soon release additional code for that. The computation and memory are not quadratic in M but linear.
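One rough way to read the linear-versus-quadratic remark is to count attention-score entries. The sketch below assumes N patches per channel and compares attending within each of the M channels separately against a hypothetical alternative in which all M*N patch tokens attend to one another. The alternative, the value of N, and the function names are illustrative assumptions, not something stated in the thread.

```python
def entries_channel_independent(M, N):
    # M separate N x N attention maps, one per channel: grows linearly with M.
    return M * N * N

def entries_joint_tokens(M, N):
    # A single (M*N) x (M*N) attention map over all tokens: grows quadratically with M.
    return (M * N) ** 2

N = 42  # assumed number of patches per channel
for M in (1, 10, 100):
    independent = entries_channel_independent(M, N)
    joint = entries_joint_tokens(M, N)
    print(M, independent, joint, joint // independent)
```

Under these assumptions the joint variant needs M times as many entries as the channel-independent one.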

Thank you for the question!


kashif avatar kashif commented on July 21, 2024

Thank you for your quick response. I believe the issue is exactly that the channels (variates) go in parallel, as you say, in the batch dimension, and are only afterward reshaped back to the multivariate dimension. That is what makes PatchTST a univariate model, and it was the source of my confusion when reading your paper.

https://github.com/yuqinie98/PatchTST/blob/main/PatchTST_supervised/layers/PatchTST_backbone.py#L164

Just because during training you give the model batches with all the variates and during inference you predict all the variates in parallel does not make the model a multivariate model. In fact, at inference time all deep-learning-based univariate models predict the variates in appropriately sized batches.

The recent TSMixer paper from Google, which cites PatchTST, also puts this model in the univariate taxonomy.

Also note that the metrics you report in Table 3 are not quite correct: MAE, for example, is in the unit of the data, and if you check the datasets you consider, Traffic is the only one in the range [0, 1]. I believe you are reporting the NMAE and NMSE.
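To make the unit argument concrete, here is a small sketch on synthetic data (the values, the noise levels, and the helper names `mae` and `mse` are hypothetical). It assumes the series is standardized with a single mean and standard deviation, in practice taken from the training split, and shows that errors computed on the standardized series are rescaled versions of the errors in data units.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic series in "data units" (hypothetical values).
y_true = 500.0 + 50.0 * rng.normal(size=1000)
y_pred = y_true + 10.0 * rng.normal(size=1000)   # forecasts with some error

def mae(a, b):
    return float(np.mean(np.abs(a - b)))

def mse(a, b):
    return float(np.mean((a - b) ** 2))

# Standardize targets and forecasts with the same statistics.
mu, sigma = y_true.mean(), y_true.std()
z_true = (y_true - mu) / sigma
z_pred = (y_pred - mu) / sigma

mae_raw, mae_std = mae(y_true, y_pred), mae(z_true, z_pred)
mse_raw, mse_std = mse(y_true, y_pred), mse(z_true, z_pred)

print("MAE in data units:", mae_raw, "on standardized data:", mae_std)
print("MSE in data units:", mse_raw, "on standardized data:", mse_std)
```

Here the standardized MAE equals the raw MAE divided by `sigma`, and the standardized MSE equals the raw MSE divided by `sigma**2`, so the two sets of numbers are not comparable unless the scaling is stated.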


yuqinie98 avatar yuqinie98 commented on July 21, 2024

Hello,

First, as we mentioned in the paper, we are dealing with the "multivariate time series forecasting task", and the way we want to solve it is through "channel-independence". You can also refer to this paper https://arxiv.org/pdf/2205.13504.pdf and see the usage of "multivariate", as they also use a "channel-independence" structure. The TSMixer paper you mentioned categorizes PatchTST as "multivariate input" and "multivariate output", which is exactly what we have claimed in the paper. So the term is not misused.

Second, for the metrics we are reporting the results with normalized data, to make it consistent with the result tables in all the previous baseline papers, such as https://github.com/cure-lab/LTSF-Linear, https://github.com/zhouhaoyi/Informer2020, https://github.com/thuml/Autoformer, https://github.com/MAZiqing/FEDformer.

Thanks again for your attention and comments on our work!


lesego94 avatar lesego94 commented on July 21, 2024

Hi @kashif, I am working on a master's thesis and was also concerned about channel independence. Thank you both for clarifying this issue. Kashif, could you suggest an alternative model that I could use? The TSMixer paper is great but they haven't made their code available. Thank you!


kashif avatar kashif commented on July 21, 2024

Sorry for the late reply @yuqinie98 and @namctin. I went over your paper again and realized that you are referring to the dataset and task; however, reading the paper gave me the impression that the model was multivariate. The model is univariate, and in fact all univariate models can be used for the "multivariate forecasting" task by just predicting each variate independently (which you call "channel independence").
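As a toy illustration of that last point, the sketch below wraps a deliberately simple univariate forecaster so that it accepts a multivariate history and returns a multivariate forecast. The forecaster (`naive_mean_forecast`), the wrapper (`forecast_each_variate`), and the data are hypothetical and unrelated to PatchTST's code.

```python
import numpy as np

def naive_mean_forecast(series, horizon, window=4):
    """Univariate forecaster: repeat the mean of the last `window` points."""
    level = float(np.mean(series[-window:]))
    return np.full(horizon, level)

def forecast_each_variate(history, horizon):
    """history: [T, M] -> forecasts: [horizon, M], one variate at a time."""
    columns = [
        naive_mean_forecast(history[:, j], horizon)
        for j in range(history.shape[1])
    ]
    return np.stack(columns, axis=1)

history = np.array([
    [1.0, 10.0],
    [2.0, 10.0],
    [3.0, 10.0],
    [4.0, 10.0],
])
forecast = forecast_each_variate(history, horizon=3)
print(forecast.shape)  # (3, 2)
```

The input and output are both multivariate arrays, yet no information flows between the columns.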

The TSMixer table you refer to reflects the fact that the model is able to take a multivariate input and produce a multivariate output (which, as mentioned above, is a property of any univariate model, since it can take the variates in the batch dimension and predict the multivariate vector independently, as done here). I was referring more to

[Screenshot 2023-03-27 at 19 46 42]
and
[Screenshot 2023-03-27 at 19 47 25]
which was the source of my confusion, which I suppose is now cleared up.

Also, I apologize for baiting you with the metric remark (which is valid): I knew you would say that it comes from the works you mentioned, which I suppose was my point. Bad conventions or confusing notation spread to the point where anyone reading your paper, or others with similar comparisons, will not know that the data is actually standardized and that the metrics are computed over the standardized test set (i.e. NMAE and NMSE). Your paper, I believe, does not mention this fact, and neither do a number of others. Anyway, I will close this issue.

@lesego94 go with PatchTST, I had no technical concerns, I was just confused.


kashif avatar kashif commented on July 21, 2024

Ah sorry, you mention it in appendix B1! My bad, I'll stop embarrassing myself for one day! Ah, my bad x2, I had the TSMixer paper open...


lesego94 avatar lesego94 commented on July 21, 2024

Thank you! I'll check out your modification as well.

