Comments (9)
The information of one time series affects the prediction in the sense that you have a single model trained over all the time series, just like an image-classification model is trained over many independent images by placing the images in the batch dimension:
https://github.com/yuqinie98/PatchTST/blob/main/PatchTST_supervised/layers/PatchTST_backbone.py#L164
At inference time the predictions are also made independently for each time series in PatchTST, i.e. the emissions are independent too, but the learned weights used to make the predictions come from looking over all the time series.
So again, the model will implicitly learn the relationships between the different time series (i.e. spatio-temporal relationships), since it is a shared (or global) model, but the outputs will be independent, and they are reshaped/concatenated as in diagram 1(a).
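The reshaping trick described above can be sketched as follows. This is a minimal illustration in numpy, not the repo's actual code: the `backbone` and the shapes are hypothetical stand-ins, and the point is only that folding the variate (channel) dimension into the batch dimension lets one shared model score every series while keeping each series' forward pass independent.

```python
import numpy as np

def channel_independent_forward(backbone, x):
    # x: [batch, n_vars, seq_len] -- a multivariate batch
    bs, n_vars, seq_len = x.shape
    # fold the channel dim into the batch dim -> [batch * n_vars, seq_len],
    # so each variate is treated as its own independent "sample"
    x2 = x.reshape(bs * n_vars, seq_len)
    # one shared set of weights sees every series, but no mixing across channels
    y = backbone(x2)
    # unfold back to the multivariate layout -> [batch, n_vars, pred_len]
    return y.reshape(bs, n_vars, -1)

# toy shared "backbone": a linear map from seq_len=96 to pred_len=24
rng = np.random.default_rng(0)
W = rng.standard_normal((96, 24))
backbone = lambda z: z @ W

out = channel_independent_forward(backbone, rng.standard_normal((8, 7, 96)))
print(out.shape)  # (8, 7, 24)
```

The same reshape works at inference, which is why the predictions for the variates come out "in parallel" without the model ever conditioning one variate on another.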
I have also implemented a probabilistic PatchTST here: awslabs/gluonts#2748, with a few features not in the original model.
from patchtst.
Guys, I am still trying to understand. Simple question: does the information of one time series affect the prediction of another time series when using PatchTST? Basically I'm looking for spatio-temporal capabilities, similar to Spacetimeformer. @kashif ?
Basically, what I'm working on is improving forecasts by including other time series that can provide additional information.
from patchtst.
As mentioned in the paper, we provide another framework for multivariate time series forecasting/regression/classification where channels go into the Transformer in parallel instead of being mixed. We use the vanilla Transformer to demonstrate this, but that does not mean we need to stick with it: you can come up with any other architecture that exploits the correlation structure in your data. We will soon publish code for that. The computation and memory cost is not quadratic in M but linear.
Thank you for the question!
from patchtst.
Thank you for your quick response. I believe the issue is exactly that the channels (variates) go in parallel, as you say, in the batch dimension, and are afterwards reshaped back to the multivariate dimension. That makes PatchTST a univariate model, and it was the source of my confusion when reading your paper.
https://github.com/yuqinie98/PatchTST/blob/main/PatchTST_supervised/layers/PatchTST_backbone.py#L164
Just because during training you give the model batches with all the variates, and during inference you predict all the variates in parallel, does not make the model a multivariate model. In fact, at inference time all deep-learning-based univariate models predict the variates in appropriately sized batches.
The recent TSMixer paper from Google, which cites PatchTST, also puts this model in the univariate taxonomy.
Also note that the metrics you report in Table 3 are not quite correct: MAE, for example, is in the units of the data, and if you check the datasets you consider, Traffic is the only one in the range [0, 1]. I believe you are actually reporting the NMAE and NMSE.
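To make the metric point concrete, here is a small numpy sketch (not the repo's evaluation code; the data is synthetic): when the test set is standardized before scoring, the reported "MAE" is the MAE in units of standard deviations, i.e. a normalized MAE, so the numbers can be small even when the raw data is nowhere near [0, 1].

```python
import numpy as np

rng = np.random.default_rng(42)
y_true = 1000 + 50 * rng.standard_normal(500)   # raw data far outside [0, 1]
y_pred = y_true + 10 * rng.standard_normal(500)  # imperfect forecast

mu, sigma = y_true.mean(), y_true.std()

# MAE in the data's own units
mae_raw = np.abs(y_pred - y_true).mean()

# MAE computed after standardizing both series, as the benchmark tables do;
# algebraically this is just mae_raw / sigma, i.e. a normalized MAE
mae_std = np.abs((y_pred - mu) / sigma - (y_true - mu) / sigma).mean()

print(mae_raw)  # on the scale of the raw data's units
print(mae_std)  # much smaller: identical to mae_raw / sigma
```

The two numbers describe the same forecast; only the units differ, which is why stating whether metrics are computed on standardized data matters for comparability.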
from patchtst.
Hello,
First, as we mentioned in the paper, we are dealing with the "multivariate time series forecasting task", and the way we solve it is via "channel independence". You can also refer to this paper https://arxiv.org/pdf/2205.13504.pdf and see its usage of "multivariate", as it also uses a channel-independent structure. The TSMixer paper you mention categorizes PatchTST as "multivariate input" and "multivariate output", which is exactly what we claim in the paper. So the term is not misused.
Second, for the metrics, we report the results on normalized data, to be consistent with the result tables in all the previous baseline papers, such as https://github.com/cure-lab/LTSF-Linear, https://github.com/zhouhaoyi/Informer2020, https://github.com/thuml/Autoformer, and https://github.com/MAZiqing/FEDformer.
Thanks again for your attention and comments on our work!
from patchtst.
Hi @kashif, I am working on a master's thesis and was also concerned about channel independence. Thank you both for clarifying this issue. @kashif, could you suggest an alternative model that I could use? The TSMixer paper is great, but they haven't made their code available. Thank you!
from patchtst.
Sorry for the late reply @yuqinie98 and @namctin. I went over your paper again and realize that you are referring to the dataset and task; however, reading the paper gave me the impression that the model was multivariate. The model is univariate, and in fact all univariate models can be used for the "multivariate forecasting" task by simply predicting each variate independently (which you call "channel independence").
The TSMixer table you refer to reflects the fact that the model can take a multivariate input and produce a multivariate output (which, as mentioned above, is a property of any univariate model, since it can take the variates in the batch dimension and predict the multivariate vector independently, as done here). I was more referring to
and
which was the source of my confusion, now cleared up, I suppose.
Also, I apologize for baiting you with the metric remark (which is valid): I knew you would say it comes from the works you mentioned, which was precisely my point. Bad conventions and confused notation spread, to the point where anyone reading your paper, or others with similar comparisons, will not know that the data is actually standardized and that the metrics are computed over the standardized test set (i.e. NMAE and NMSE). I believe your paper does not mention this fact, and neither do a number of others. Anyway, I will close this issue.
@lesego94, go with PatchTST; I had no technical concerns, I was just confused.
from patchtst.
Ah sorry, you mention it in Appendix B1! My bad, I'll stop embarrassing myself for one day! Ah, my bad ×2: I had the TSMixer paper open...
from patchtst.
Thank you! I'll check out your modification as well.
from patchtst.
Related Issues (20)
- Obtain the MSE of each variable when I do the "M" prediction
- Is GPU acceleration used? HOT 5
- how to use learner.distributed(), in self supervised pretrain code ?
- How does the visualization of Attention Weights organize the code? HOT 2
- scale
- RevIN and StandardScaler HOT 15
- Multivariate Time Series Classification HOT 6
- .
- Error during installation: "Could not find a version that satisfies the requirement numpy==1.21.3", but the actual version is now 1.26.4
- Performance about self-supervised learning
- Can this model fit unbalanced panel data (more than one individual)?
- How do I output the attention matrix from the paper in the code?
- Multivariate predict univariate HOT 1
- Stock Price Forecasting using PatchTST model HOT 3
- Pretrained Models in Huggingface Repository
- Question about not applying inverse_transform HOT 2
- Question about Table 9.
- Sequence length issue between outputs and batch_y
- question about attention layer shared weight
- RuntimeError: required rank 4 tensor to use channels_last format