wenjiedu / pypots

A Python toolbox/library for reality-centric machine/deep learning and data mining on partially-observed time series with PyTorch, including SOTA neural-network models for scientific tasks of imputation, classification, clustering, forecasting, and anomaly detection on incomplete (irregularly-sampled) multivariate time series with NaN missing values.

Home Page: https://pypots.com

License: BSD 3-Clause "New" or "Revised" License

Language: Python 100.00%
Topics: time-series, missing-data, missing-values, machine-learning, partially-observed-time-series, time-series-analysis, imputation, classification, forecasting, clustering

pypots's Introduction

Welcome to PyPOTS

a Python toolbox for machine learning on Partially-Observed Time Series


⦿ Motivation: Due to all kinds of reasons like sensor failures, communication errors, and unexpected malfunctions, missing values are common in time series collected from real-world environments. This makes partially-observed time series (POTS) a pervasive problem in open-world modeling and hinders advanced data analysis. Although this problem is important, the area of machine learning on POTS still lacks a dedicated toolkit. PyPOTS was created to fill this gap.

⦿ Mission: PyPOTS (pronounced "Pie Pots") is born to be a handy toolbox that makes machine learning on POTS easy rather than tedious, helping engineers and researchers focus on the core problems at hand rather than on how to deal with the missing parts in their data. PyPOTS will keep integrating both classical and the latest state-of-the-art machine learning algorithms for partially-observed multivariate time series. Besides various algorithms, PyPOTS provides unified APIs together with detailed documentation and interactive tutorial examples across algorithms.

🤗 Please star this repo to help others notice PyPOTS if you think it is a useful toolkit. Please properly cite PyPOTS in your publications if it helps with your research. This really means a lot to our open-source research. Thank you!

The rest of this readme file is organized as follows: ❖ PyPOTS Ecosystem, ❖ Installation, ❖ Available Algorithms, ❖ Usage, ❖ Citing PyPOTS, ❖ Contribution, ❖ Community.

❖ PyPOTS Ecosystem

At PyPOTS, everything is related to coffee, something we're all familiar with. Yes, this is a coffee universe! As you can see, there is a coffee pot in the PyPOTS logo. And what else? Please read on ;-)

TSDB logo

👈 Time series datasets are taken as coffee beans at PyPOTS, and POTS datasets are incomplete coffee beans with missing parts that carry their own meaning. To make various public time-series datasets readily available to users, Time Series Data Beans (TSDB) was created to make loading time-series datasets super easy! Visit TSDB right now to learn more about this handy tool 🛠, which now supports a total of 168 open-source datasets!

PyGrinder logo

👉 To simulate real-world data beans with missingness, the ecosystem library PyGrinder, a toolkit that helps grind your coffee beans into incomplete ones, was created. Missing patterns fall into three categories according to Rubin's theory [1]: MCAR (missing completely at random), MAR (missing at random), and MNAR (missing not at random). PyGrinder supports all of them, plus additional missingness-related functionalities. With PyGrinder, you can introduce synthetic missing values into your datasets with a single line of code.
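For example, a minimal sketch (mirroring the mcar call used in the Usage section further below):

import numpy as np
from pygrinder import mcar

X = np.random.randn(128, 48, 37)   # toy data: [n_samples, n_steps, n_features]
X = mcar(X, 0.1)                   # randomly mask 10% of the observed values as NaN
print(np.isnan(X).mean())          # roughly 0.1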

BrewPOTS logo

👈 Now that we have the beans, the grinder, and the pot, how do we brew ourselves a cup of coffee? Tutorials are necessary! Considering the future workload, PyPOTS tutorials are released in a single repo, and you can find them in BrewPOTS. Take a look at it now, and learn how to brew your POTS datasets!


☕️ Welcome to the universe of PyPOTS. Enjoy it and have fun!

❖ Installation

You can refer to the installation instructions in the PyPOTS documentation for a more detailed guide.

PyPOTS is available on both PyPI and Anaconda. You can install PyPOTS as shown below:

# via pip
pip install pypots            # the first time installation
pip install pypots --upgrade  # update pypots to the latest version
# install from the latest source code, which may include features not officially released yet
pip install https://github.com/WenjieDu/PyPOTS/archive/main.zip

# via conda
conda install -c conda-forge pypots  # the first time installation
conda update  -c conda-forge pypots  # update pypots to the latest version
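After installing, you can quickly check that the package is importable and which version you got (pypots exposes __version__, as used in the test snippets later on this page):

import pypots
print(pypots.__version__)  # prints the installed version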

❖ Available Algorithms

PyPOTS supports imputation, classification, clustering, forecasting, and anomaly detection tasks on multivariate partially-observed time series with missing values. The table below shows the availability of each algorithm in PyPOTS for the different tasks. The symbol ✅ indicates the algorithm is available for the corresponding task (note that models may be extended to support additional tasks in the future). The task types are abbreviated as follows: IMPU: Imputation; FORE: Forecasting; CLAS: Classification; CLUS: Clustering; ANOD: Anomaly Detection. The paper references are all listed at the bottom of this readme file.

🌟 Since v0.2, all neural-network models in PyPOTS have hyperparameter-optimization support. This functionality is implemented with the Microsoft NNI framework. You may want to refer to our time-series imputation survey repo Awesome_Imputation to see how to configure and tune the hyperparameters.
🔥 Note that Transformer, Crossformer, PatchTST, DLinear, ETSformer, FEDformer, Informer, and Autoformer were not proposed as imputation methods in their original papers and cannot accept POTS as input. To make them applicable to POTS data, we apply the same embedding strategy and training approach (ORT+MIT) as in the SAITS paper.

Type          | Algo             | IMPU | FORE | CLAS | CLUS | ANOD | Year
Neural Net    | SAITS [2]        |  ✅  |      |      |      |      | 2023
Neural Net    | Crossformer [3]  |  ✅  |      |      |      |      | 2023
Neural Net    | TimesNet [4]     |  ✅  |      |      |      |      | 2023
Neural Net    | PatchTST [5]     |  ✅  |      |      |      |      | 2023
Neural Net    | DLinear [6]      |  ✅  |      |      |      |      | 2023
Neural Net    | ETSformer [7]    |  ✅  |      |      |      |      | 2023
Neural Net    | FEDformer [8]    |  ✅  |      |      |      |      | 2022
Neural Net    | Raindrop [9]     |      |      |  ✅  |      |      | 2022
Neural Net    | Informer [10]    |  ✅  |      |      |      |      | 2021
Neural Net    | Autoformer [11]  |  ✅  |      |      |      |      | 2021
Neural Net    | CSDI [12]        |  ✅  |      |      |      |      | 2021
Neural Net    | US-GAN [13]      |  ✅  |      |      |      |      | 2021
Neural Net    | CRLI [14]        |      |      |      |  ✅  |      | 2021
Probabilistic | BTTF [15]        |      |  ✅  |      |      |      | 2021
Neural Net    | GP-VAE [3]       |  ✅  |      |      |      |      | 2020
Neural Net    | VaDER [16]       |      |      |      |  ✅  |      | 2019
Neural Net    | M-RNN [17]       |  ✅  |      |      |      |      | 2019
Neural Net    | BRITS [18]       |  ✅  |      |  ✅  |      |      | 2018
Neural Net    | GRU-D [19]       |      |      |  ✅  |      |      | 2018
Neural Net    | Transformer [20] |  ✅  |      |      |      |      | 2017
Naive         | LOCF/NOCB        |  ✅  |      |      |      |      |
Naive         | Mean             |  ✅  |      |      |      |      |
Naive         | Median           |  ✅  |      |      |      |      |

❖ Usage

Besides BrewPOTS, you can also find a simple quick-start tutorial notebook on Google Colab. If you have further questions, please refer to the PyPOTS documentation at docs.pypots.com. You can also raise an issue or ask in our community.

Below we present a usage example of imputing missing values in time series with PyPOTS.

Here is an example applying SAITS on PhysioNet2012 for imputation:
# Data preprocessing. Tedious, but PyPOTS can help.
import numpy as np
from sklearn.preprocessing import StandardScaler
from pygrinder import mcar
from pypots.data import load_specific_dataset
data = load_specific_dataset('physionet_2012')  # PyPOTS will automatically download and extract it.
X = data['X']
num_samples = len(X['RecordID'].unique())
X = X.drop(['RecordID', 'Time'], axis=1)
X = StandardScaler().fit_transform(X.to_numpy())
X = X.reshape(num_samples, 48, -1)
X_ori = X  # keep X_ori for validation
X = mcar(X, 0.1)  # randomly hold out 10% observed values as ground truth
dataset = {"X": X}  # X for model input
print(X.shape)  # (11988, 48, 37), 11988 samples and each sample has 48 time steps, 37 features

# Model training. This is PyPOTS showtime.
from pypots.imputation import SAITS
from pypots.utils.metrics import calc_mae
saits = SAITS(n_steps=48, n_features=37, n_layers=2, d_model=256, d_ffn=128, n_heads=4, d_k=64, d_v=64, dropout=0.1, epochs=10)
# Here we use the whole dataset as the training set because the ground truth is not visible to the model; you can also split it into train/val/test sets
saits.fit(dataset)  # train the model on the dataset
imputation = saits.impute(dataset)  # impute the originally-missing values and artificially-missing values
indicating_mask = np.isnan(X) ^ np.isnan(X_ori)  # indicating mask for imputation error calculation
mae = calc_mae(imputation, np.nan_to_num(X_ori), indicating_mask)  # calculate mean absolute error on the ground truth (artificially-missing values)
saits.save("save_it_here/saits_physionet2012.pypots")  # save the model for future use
saits.load("save_it_here/saits_physionet2012.pypots")  # reload the serialized model file for following imputation or training
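If you would rather hold out a validation set instead of training on the whole dataset, a rough sketch could look like the following (the val_set dict layout with "X" and "X_ori" keys is an assumption here; please check docs.pypots.com for the exact API):

# hedged sketch: an 80/20 sample split before fitting
split = int(len(X) * 0.8)
train_set = {"X": X[:split]}
val_set = {"X": X[split:], "X_ori": X_ori[split:]}  # assumed keys for validation with held-out ground truth
saits.fit(train_set, val_set)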

❖ Citing PyPOTS

Tip

[Updates in Feb 2024] 😎 Our survey paper Deep Learning for Multivariate Time Series Imputation: A Survey has been released on arXiv. The code is open-source in the GitHub repo Awesome_Imputation. We comprehensively review the literature on state-of-the-art deep-learning imputation methods for time series, provide a taxonomy for them, and discuss the challenges and future directions in this field.

[Updates in Jun 2023] 🎉 A short version of the PyPOTS paper was accepted by the 9th SIGKDD International Workshop on Mining and Learning from Time Series (MiLeTS'23). Additionally, PyPOTS has been included as a PyTorch Ecosystem project.

The paper introducing PyPOTS is available on arXiv at this URL, and we are working to publish it in prestigious academic venues, e.g. JMLR (Track for Machine Learning Open Source Software). If you use PyPOTS in your work, please cite it as below and 🌟 star this repository to help others notice this library. 🤗

There are scientific research projects using PyPOTS and referencing it in their papers. Here is an incomplete list of them.

@article{du2023pypots,
  title={{PyPOTS: a Python toolbox for data mining on Partially-Observed Time Series}},
  author={Wenjie Du},
  year={2023},
  eprint={2305.18811},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2305.18811},
  doi={10.48550/arXiv.2305.18811},
}

or

Wenjie Du. (2023). PyPOTS: a Python toolbox for data mining on Partially-Observed Time Series. arXiv, abs/2305.18811. https://arxiv.org/abs/2305.18811

❖ Contribution

You're very welcome to contribute to this exciting project!

By committing your code, you'll

  1. make your well-established model available out of the box for PyPOTS users to run, helping your work gain more exposure and impact. Take a look at our inclusion criteria. You can use the template folder in each task package (e.g. pypots/imputation/template) to get started quickly;
  2. become one of PyPOTS contributors and be listed as a volunteer developer on the PyPOTS website;
  3. get mentioned in our release notes;

You can also contribute to PyPOTS simply by starring 🌟 this repo to help more people notice it. Your star is your recognition of PyPOTS, and it matters!

👏 We're so proud to have more and more awesome users, as well as more bright ✨stars!

👀 Check out the full list of our users' affiliations on the PyPOTS website!

❖ Community

We care about feedback from our users, so we're building the PyPOTS community on

  • Slack. General discussion, Q&A, and our development team are here;
  • LinkedIn. Official announcements and news are here;
  • WeChat (微信公众号). We also run a group chat on WeChat, and you can get the QR code from the official account after following it;

If you have any suggestions, want to contribute ideas, or would like to share time-series-related papers, join us and tell us. The PyPOTS community is open, transparent, and surely friendly. Let's work together to build and improve PyPOTS!


Footnotes

  1. Rubin, D. B. (1976). Inference and missing data. Biometrika.

  2. Du, W., Cote, D., & Liu, Y. (2023). SAITS: Self-Attention-based Imputation for Time Series. Expert Systems with Applications.

  3. Zhang, Y., & Yan, J. (2023). Crossformer: Transformer utilizing cross-dimension dependency for multivariate time series forecasting. ICLR 2023.

  4. Wu, H., Hu, T., Liu, Y., Zhou, H., Wang, J., & Long, M. (2023). TimesNet: Temporal 2D-variation modeling for general time series analysis. ICLR 2023.

  5. Nie, Y., Nguyen, N. H., Sinthong, P., & Kalagnanam, J. (2023). A time series is worth 64 words: Long-term forecasting with transformers. ICLR 2023.

  6. Zeng, A., Chen, M., Zhang, L., & Xu, Q. (2023). Are transformers effective for time series forecasting? AAAI 2023.

  7. Woo, G., Liu, C., Sahoo, D., Kumar, A., & Hoi, S. (2023). ETSformer: Exponential Smoothing Transformers for Time-series Forecasting. ICLR 2023.

  8. Zhou, T., Ma, Z., Wen, Q., Wang, X., Sun, L., & Jin, R. (2022). FEDformer: Frequency enhanced decomposed transformer for long-term series forecasting. ICML 2022.

  9. Zhang, X., Zeman, M., Tsiligkaridis, T., & Zitnik, M. (2022). Graph-Guided Network for Irregularly Sampled Multivariate Time Series. ICLR 2022.

  10. Zhou, H., Zhang, S., Peng, J., Zhang, S., Li, J., Xiong, H., & Zhang, W. (2021). Informer: Beyond efficient transformer for long sequence time-series forecasting. AAAI 2021.

  11. Wu, H., Xu, J., Wang, J., & Long, M. (2021). Autoformer: Decomposition transformers with auto-correlation for long-term series forecasting. NeurIPS 2021.

  12. Tashiro, Y., Song, J., Song, Y., & Ermon, S. (2021). CSDI: Conditional Score-based Diffusion Models for Probabilistic Time Series Imputation. NeurIPS 2021.

  13. Miao, X., Wu, Y., Wang, J., Gao, Y., Mao, X., & Yin, J. (2021). Generative Semi-supervised Learning for Multivariate Time Series Imputation. AAAI 2021.

  14. Ma, Q., Chen, C., Li, S., & Cottrell, G. W. (2021). Learning Representations for Incomplete Time Series Clustering. AAAI 2021.

  15. Chen, X., & Sun, L. (2021). Bayesian Temporal Factorization for Multidimensional Time Series Prediction. IEEE Transactions on Pattern Analysis and Machine Intelligence.

  16. Jong, J.D., Emon, M.A., Wu, P., Karki, R., Sood, M., Godard, P., Ahmad, A., Vrooman, H.A., Hofmann-Apitius, M., & Fröhlich, H. (2019). Deep learning for clustering of multivariate clinical patient trajectories with missing values. GigaScience.

  17. Yoon, J., Zame, W. R., & van der Schaar, M. (2019). Estimating Missing Data in Temporal Data Streams Using Multi-Directional Recurrent Neural Networks. IEEE Transactions on Biomedical Engineering.

  18. Cao, W., Wang, D., Li, J., Zhou, H., Li, L., & Li, Y. (2018). BRITS: Bidirectional Recurrent Imputation for Time Series. NeurIPS 2018.

  19. Che, Z., Purushotham, S., Cho, K., Sontag, D.A., & Liu, Y. (2018). Recurrent Neural Networks for Multivariate Time Series with Missing Values. Scientific Reports.

  20. Vaswani, A., Shazeer, N.M., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., & Polosukhin, I. (2017). Attention is All you Need. NeurIPS 2017.

pypots's People

Contributors

augustjw, maciejskrabski, vemuribv, wenjiedu, yhzhu99
pypots's Issues

Enabling users to customize loss function

1. Feature description

Add a feature to enable users to specify their own loss functions, which should be callable Python functions.

2. Motivation

Currently, the loss functions in PyPOTS models are fixed. This is by design, because reproducing algorithms and models faithfully is an important part of the PyPOTS project: algorithms and models should stay as close as possible to their descriptions in the original papers.

However, users sometimes need to specify the loss function for better optimization. For example, in some scenarios users take MAE as the evaluation metric to assess imputation accuracy, while others prefer MSE to evaluate the final imputation results. From the perspective of helping our users get better results, we should add such a feature; a sketch of the idea follows.
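A hedged sketch of what such a callable could look like (the customized_loss parameter name below is hypothetical, not an existing PyPOTS argument):

import torch

def masked_mse(predictions, targets, masks):
    # mean squared error computed only over positions where masks == 1
    return torch.sum((predictions - targets) ** 2 * masks) / (torch.sum(masks) + 1e-12)

# hypothetical usage once the feature lands:
# saits = SAITS(..., customized_loss=masked_mse)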

3. Your contribution

Will create a PR to finish it.

error when trying to train raindrop classification on multiple gpu

1. System Info + Information

system info: torch 2.0.1, pypots 0.1.1; GPU: 8x RTX 4090
problem: when training the Raindrop model as usual, I wanted to make use of all my GPUs. I did everything as in the documentation, but got the following error after changing the device variable to a list.
Thanks a lot!

2. Reproduction

raindrop = Raindrop(
    n_steps=X.shape[1],
    n_features=X.shape[2],
    ...
    num_workers=8,
    ...
    device=['cuda:0', 'cuda:1'],
    ...
)


4. Expected behavior

no error

Adding new dataset

1. Feature description

Hi,

The data is from coastal base stations that monitor vessel traffic. It is irregularly sampled and has missing values. The automatic identification system (AIS) data in this case contains 8 columns and approximately 10,000 samples per vessel. There are approximately 200 vessels per .csv file, and I have a dozen such files for the area in Norway near Ålesund.
ais_20201118.zip

2. Motivation

This could be a fun playground for data imputation and prediction.

3. Your contribution

I believe I could help with coding, but this would be my first pull request. I'm trying to implement the BRITS model for data imputation, so at the moment I'm preparing the data for training.

Separate optimizers and models

1. Feature description

To make the framework more usable, we should separate models and optimizers, for example:

saits = SAITS(
    n_steps=48,
    n_features=37,
    n_layers=3,
    ...
    optimizer=pypots.optim.Adam(),
    ...
)

2. Motivation

Such a design gives users more options and makes the PyPOTS framework more powerful. For example, we can add more functionality into the pypots.optim.Optimizer classes, like an lr scheduler.

3. Your contribution

Will make a PR to implement it.

Add access for explainability (TimeSHAP)

1. Feature description

Changes to the classify() and forward() methods to make the models compatible with TimeSHAP and other XAI methods.

2. Motivation

I would like to be able to understand which timesteps and features were most relevant for a particular prediction.

3. Your contribution

I don't know enough about how the models are structured to feel confident changing the API.

PyPOTS needs a tool like `pypots-cli env`

Feature description

pypots-cli should include an environment tool to help users and developers easily install dependencies.

Motivation

It could be very useful, considering the situation I reported in discussion #58: torch_geometric and related dependencies may be hard for our users to install. Hence, with pypots-cli env, running a simple command like pypots-cli env --install optional should install the optional dependencies listed in setup.cfg, including torch_geometric.

Your contribution

Will create a PR to add this feature and link it with this issue.

Add git hooks for linting before code committing or push

Feature description

We run a code-linting workflow on pushes. Although we already have a lint-checking tool in pypots-cli dev --lint-code, I find that I sometimes forget to run it and commit code with lint errors. This can result in unnecessary trouble like extra code modification and re-pushing. Hence, we should add some git hooks to the PyPOTS project to help solve this problem; tools like pre-commit may be useful.

Motivation

See the description above. We don't want to put all checking and testing on CI in the cloud; we need some more normative rules for code development here.

Your contribution

Will create a PR and link it to this issue.

Adding functions to process time-series datasets with sliding-window and save into h5 files

1. Feature description

We need some utilities to help with data processing (see the sketch after this list):

  1. adding a sliding-window function to process 2-D [n_steps, n_features] datasets into 3-D [n_samples, n_steps, n_features] datasets for model input;
  2. adding a data saving function to save datasets (organized as dictionaries) into hdf5 files for the data lazy-loading strategy;
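A minimal sketch of the two utilities (the function names here are hypothetical; assumes numpy and h5py):

import h5py
import numpy as np

def sliding_window(ts, window_len, stride=1):
    # turn a 2-D [n_steps, n_features] series into 3-D [n_samples, window_len, n_features]
    starts = range(0, ts.shape[0] - window_len + 1, stride)
    return np.stack([ts[s : s + window_len] for s in starts])

def save_dict_into_h5(data, path):
    # save a {"X": array, ...} dict into an HDF5 file for later lazy loading
    with h5py.File(path, "w") as f:
        for key, value in data.items():
            f.create_dataset(key, data=value)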

2. Motivation

For ease of usage.

3. Your contribution

Will create a PR to add them.

Log file output

1. Feature description

Allow training logs to be output to a file

2. Motivation

Adding the functionality to output logs to a file would make training runs easier to inspect afterwards.
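A rough sketch of one possible approach with the standard library (that PyPOTS logs through a logger named "pypots" is an assumption):

import logging

# assumption: attach a FileHandler to the logger PyPOTS writes to, so training logs land in a file
file_handler = logging.FileHandler("pypots_training.log")
file_handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
logging.getLogger("pypots").addHandler(file_handler)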

3. Your contribution

I will try to help

GPU enabled model raises Exception: expected self and mask to be on the same device, but got mask on cpu and self on cuda:0

Hello,
great library, but using a GPU-enabled machine results in errors.

pypots version = 0.0.6 (the one available in PyPI)

code to replicate problem:

import unittest
from pypots.tests.test_imputation import TestBRITS, TestLOCF, TestSAITS, TestTransformer
from pypots import __version__


if __name__ == "__main__":
    print(__version__)
    unittest.main()

results:

0.0.6
Running test cases for BRITS...
Model initialized successfully. Number of the trainable parameters: 580976
ERunning test cases for BRITS...
Model initialized successfully. Number of the trainable parameters: 580976
ERunning test cases for LOCF...
LOCF test_MAE: 0.1712224306027283
.Running test cases for LOCF...
.Running test cases for SAITS...
Model initialized successfully. Number of the trainable parameters: 1332704
Exception: expected self and mask to be on the same device, but got mask on cpu and self on cuda:0
ERunning test cases for SAITS...
Model initialized successfully. Number of the trainable parameters: 1332704
Exception: expected self and mask to be on the same device, but got mask on cpu and self on cuda:0
ERunning test cases for Transformer...
Model initialized successfully. Number of the trainable parameters: 666122
epoch 0: training loss 0.7681, validating loss 0.2941
epoch 1: training loss 0.4731, validating loss 0.2395
epoch 2: training loss 0.4235, validating loss 0.2069
epoch 3: training loss 0.3781, validating loss 0.1914
epoch 4: training loss 0.3530, validating loss 0.1837
ERunning test cases for Transformer...
Model initialized successfully. Number of the trainable parameters: 666122
epoch 0: training loss 0.7826, validating loss 0.2820
epoch 1: training loss 0.4687, validating loss 0.2352
epoch 2: training loss 0.4188, validating loss 0.2132
epoch 3: training loss 0.3857, validating loss 0.1977
epoch 4: training loss 0.3604, validating loss 0.1945
E
======================================================================
ERROR: test_impute (pypots.tests.test_imputation.TestBRITS)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "mydirs(...)/python3.9/site-packages/pypots/tests/test_imputation.py", line 99, in setUp
    self.brits.fit(self.train_X, self.val_X)
  File "mydirs(...)/python3.9/site-packages/pypots/imputation/brits.py", line 494, in fit
    training_set = DatasetForBRITS(train_X)  # time_gaps is necessary for BRITS
  File "mydirs(...)/python3.9/site-packages/pypots/data/dataset_for_brits.py", line 62, in __init__
    forward_delta = parse_delta(forward_missing_mask)
  File "mydirs(...)/python3.9/site-packages/pypots/data/dataset_for_brits.py", line 36, in parse_delta
    delta.append(torch.ones(1, n_features) + (1 - m_mask[step]) * delta[-1])
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

======================================================================
ERROR: test_parameters (pypots.tests.test_imputation.TestBRITS)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "mydirs(...)/python3.9/site-packages/pypots/tests/test_imputation.py", line 99, in setUp
    self.brits.fit(self.train_X, self.val_X)
  File "mydirs(...)/python3.9/site-packages/pypots/imputation/brits.py", line 494, in fit
    training_set = DatasetForBRITS(train_X)  # time_gaps is necessary for BRITS
  File "mydirs(...)/python3.9/site-packages/pypots/data/dataset_for_brits.py", line 62, in __init__
    forward_delta = parse_delta(forward_missing_mask)
  File "mydirs(...)/python3.9/site-packages/pypots/data/dataset_for_brits.py", line 36, in parse_delta
    delta.append(torch.ones(1, n_features) + (1 - m_mask[step]) * delta[-1])
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

======================================================================
ERROR: test_impute (pypots.tests.test_imputation.TestSAITS)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "mydirs(...)/python3.9/site-packages/pypots/imputation/base.py", line 83, in _train_model
    results = self.model.forward(inputs)
  File "mydirs(...)/python3.9/site-packages/pypots/imputation/saits.py", line 95, in forward
    imputed_data, [X_tilde_1, X_tilde_2, X_tilde_3] = self.impute(inputs)
  File "mydirs(...)/python3.9/site-packages/pypots/imputation/saits.py", line 62, in impute
    enc_output, _ = encoder_layer(enc_output)
  File "mydirs(...)/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "mydirs(...)/python3.9/site-packages/pypots/imputation/transformer.py", line 122, in forward
    enc_output, attn_weights = self.slf_attn(enc_input, enc_input, enc_input, attn_mask=mask_time)
  File "mydirs(...)/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "mydirs(...)/python3.9/site-packages/pypots/imputation/transformer.py", line 72, in forward
    v, attn_weights = self.attention(q, k, v, attn_mask)
  File "mydirs(...)/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "mydirs(...)/python3.9/site-packages/pypots/imputation/transformer.py", line 32, in forward
    attn = attn.masked_fill(attn_mask == 1, -1e9)
RuntimeError: expected self and mask to be on the same device, but got mask on cpu and self on cuda:0

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "mydirs(...)/python3.9/site-packages/pypots/tests/test_imputation.py", line 35, in setUp
    self.saits.fit(self.train_X, self.val_X)
  File "mydirs(...)/python3.9/site-packages/pypots/imputation/saits.py", line 171, in fit
    self._train_model(training_loader, val_loader, val_X_intact, val_X_indicating_mask)
  File "mydirs(...)/python3.9/site-packages/pypots/imputation/base.py", line 123, in _train_model
    raise RuntimeError('Training got interrupted. Model was not get trained. Please try fit() again.')
RuntimeError: Training got interrupted. Model was not get trained. Please try fit() again.

======================================================================
ERROR: test_parameters (pypots.tests.test_imputation.TestSAITS)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "mydirs(...)/python3.9/site-packages/pypots/imputation/base.py", line 83, in _train_model
    results = self.model.forward(inputs)
  File "mydirs(...)/python3.9/site-packages/pypots/imputation/saits.py", line 95, in forward
    imputed_data, [X_tilde_1, X_tilde_2, X_tilde_3] = self.impute(inputs)
  File "mydirs(...)/python3.9/site-packages/pypots/imputation/saits.py", line 62, in impute
    enc_output, _ = encoder_layer(enc_output)
  File "mydirs(...)/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "mydirs(...)/python3.9/site-packages/pypots/imputation/transformer.py", line 122, in forward
    enc_output, attn_weights = self.slf_attn(enc_input, enc_input, enc_input, attn_mask=mask_time)
  File "mydirs(...)/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "mydirs(...)/python3.9/site-packages/pypots/imputation/transformer.py", line 72, in forward
    v, attn_weights = self.attention(q, k, v, attn_mask)
  File "mydirs(...)/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "mydirs(...)/python3.9/site-packages/pypots/imputation/transformer.py", line 32, in forward
    attn = attn.masked_fill(attn_mask == 1, -1e9)
RuntimeError: expected self and mask to be on the same device, but got mask on cpu and self on cuda:0

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "mydirs(...)/python3.9/site-packages/pypots/tests/test_imputation.py", line 35, in setUp
    self.saits.fit(self.train_X, self.val_X)
  File "mydirs(...)/python3.9/site-packages/pypots/imputation/saits.py", line 171, in fit
    self._train_model(training_loader, val_loader, val_X_intact, val_X_indicating_mask)
  File "mydirs(...)/python3.9/site-packages/pypots/imputation/base.py", line 123, in _train_model
    raise RuntimeError('Training got interrupted. Model was not get trained. Please try fit() again.')
RuntimeError: Training got interrupted. Model was not get trained. Please try fit() again.

======================================================================
ERROR: test_impute (pypots.tests.test_imputation.TestTransformer)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "mydirs(...)/python3.9/site-packages/pypots/tests/test_imputation.py", line 68, in setUp
    self.transformer.fit(self.train_X, self.val_X)
  File "mydirs(...)/python3.9/site-packages/pypots/imputation/transformer.py", line 257, in fit
    self._train_model(training_loader, val_loader, val_X_intact, val_X_indicating_mask)
  File "mydirs(...)/python3.9/site-packages/pypots/imputation/base.py", line 129, in _train_model
    if np.equal(self.best_loss, float('inf')):
  File "mydirs(...)/python3.9/site-packages/torch/_tensor.py", line 732, in __array__
    return self.numpy()
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

======================================================================
ERROR: test_parameters (pypots.tests.test_imputation.TestTransformer)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "mydirs(...)/python3.9/site-packages/pypots/tests/test_imputation.py", line 68, in setUp
    self.transformer.fit(self.train_X, self.val_X)
  File "mydirs(...)/python3.9/site-packages/pypots/imputation/transformer.py", line 257, in fit
    self._train_model(training_loader, val_loader, val_X_intact, val_X_indicating_mask)
  File "mydirs(...)/python3.9/site-packages/pypots/imputation/base.py", line 129, in _train_model
    if np.equal(self.best_loss, float('inf')):
  File "mydirs(...)/python3.9/site-packages/torch/_tensor.py", line 732, in __array__
    return self.numpy()
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

----------------------------------------------------------------------
Ran 8 tests in 20.239s

FAILED (errors=6)

I suspect that you call .to(device) too early on the data. You might also need to override the device parameter when creating new tensors (i.e. in torch.ones in parse_delta).
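For illustration, the fix inside parse_delta would presumably be along these lines:

# before: torch.ones defaults to CPU while m_mask may live on CUDA
delta.append(torch.ones(1, n_features) + (1 - m_mask[step]) * delta[-1])
# after: create the new tensor on the same device as the mask
delta.append(torch.ones(1, n_features, device=m_mask.device) + (1 - m_mask[step]) * delta[-1])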

Best regards!

Parallel training on multi GPUs

1. Feature description

Enable training PyPOTS NN models on multiple CUDA devices in parallel.

Parallel training on multiple GPUs for acceleration is useful, and this feature is on our list, but without priority. Mainly because:

  1. If your dataset is very large, PyPOTS provides a data lazy-loading strategy to load only the necessary data samples during training. Simply using multiple GPU devices cannot ease the memory load, because your data still has to be loaded into RAM first before being distributed to the GPUs;
  2. Different from LLMs, neural networks for time-series modeling are usually not large models. A single GPU can accelerate the training to a good speed. So far, you can even run all models in PyPOTS on a laptop CPU at an acceptable training and inference speed, especially since laptops nowadays generally have at least 4 cores. I'm not saying training on multiple GPUs is useless; in some extreme scenarios, it can be very helpful.

Recently, this feature was requested by a member of our community who is using PyPOTS to train a GRU-D model for a POTS classification task. The training takes too much time (even after increasing the value of num_workers), and they have 4 GPUs on the machine but cannot use them for parallel training to speed things up. Therefore, I'm considering adding parallel training in the following release. I implemented it with DataParallel, though PyTorch suggests using DistributedDataParallel (https://pytorch.org/tutorials/intermediate/ddp_tutorial.html#comparison-between-dataparallel-and-distributeddataparallel). As mentioned above, I don't think this is a must-have feature, so I'm postponing the redesign.

2. Motivation

Speed up the training process.

3. Your contribution

Will make a PR to add this feature.

[Feature request] Is it possible to "warm-up" the transformer?

Thank you for creating this wonderful resource! This is an amazing and useful tool!

Regarding SAITS, is it possible to pass a learning rate scheduler, rather than a fixed learning rate, for the transformer to pre-train?

I ask this because I compared the outputs of training for 100 epochs vs 1000 epochs. The loss continues to decrease, but the error on held-out timepoints does not change between 100 and 1000 epochs. Strangely, the prediction (after both 100 and 1000 epochs) is less accurate than linear interpolation...! I wondered if it is because the transformer has too many parameters and needs some help learning initially.
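As a plain-PyTorch illustration of the idea (not a PyPOTS API), a linear warm-up followed by decay can be written with LambdaLR:

import torch

model = torch.nn.Linear(37, 37)  # stand-in module just for illustration
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
warmup_steps = 100

def warmup_then_decay(step):
    # ramp the lr factor up linearly for warmup_steps, then decay with inverse square root
    if step < warmup_steps:
        return (step + 1) / warmup_steps
    return (warmup_steps / (step + 1)) ** 0.5

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=warmup_then_decay)
# call scheduler.step() once per optimizer step during training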

Numpy is not available error

Discussed in #31

Originally posted by lauredecaudin January 25, 2023
Hello !

I tried to use the package with the code found in the README,

import numpy as np
from sklearn.preprocessing import StandardScaler
from pypots.data import load_specific_dataset, mcar, masked_fill
from pypots.imputation import SAITS
from pypots.utils.metrics import cal_mae
# Data preprocessing. Tedious, but PyPOTS can help. 🤓
data = load_specific_dataset('physionet_2012')  # PyPOTS will automatically download and extract it.
X = data['X']
num_samples = len(X['RecordID'].unique())
X = X.drop('RecordID', axis = 1)
X = StandardScaler().fit_transform(X.to_numpy())
X = X.reshape(num_samples, 48, -1)
X_intact, X, missing_mask, indicating_mask = mcar(X, 0.1) # hold out 10% observed values as ground truth
X = masked_fill(X, 1 - missing_mask, np.nan)
# Model training. This is PyPOTS showtime. 💪
saits = SAITS(n_steps=48, n_features=37, n_layers=2, d_model=256, d_inner=128, n_head=4, d_k=64, d_v=64, dropout=0.1, epochs=10)
saits.fit(X)  # train the model. Here I use the whole dataset as the training set, because ground truth is not visible to the model.
imputation = saits.impute(X)  # impute the originally-missing values and artificially-missing values
mae = cal_mae(imputation, X_intact, indicating_mask)

And I get this error :

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<command-282728034361361> in <module>
     15 # Model training. This is PyPOTS showtime. 💪
     16 saits = SAITS(n_steps=48, n_features=37, n_layers=2, d_model=256, d_inner=128, n_head=4, d_k=64, d_v=64, dropout=0.1, epochs=10)
---> 17 saits.fit(X)  # train the model. Here I use the whole dataset as the training set, because ground truth is not visible to the model.
     18 imputation = saits.impute(X)  # impute the originally-missing values and artificially-missing values
     19 mae = cal_mae(imputation, X_intact, indicating_mask)

/databricks/python/lib/python3.8/site-packages/pypots/imputation/saits.py in fit(self, train_X, val_X)
    216 
    217     def fit(self, train_X, val_X=None):
--> 218         train_X = self.check_input(self.n_steps, self.n_features, train_X)
    219         if val_X is not None:
    220             val_X = self.check_input(self.n_steps, self.n_features, val_X)

/databricks/python/lib/python3.8/site-packages/pypots/base.py in check_input(self, expected_n_steps, expected_n_features, X, y, out_dtype)
     76                 X = torch.tensor(X).to(self.device)
     77             elif is_array:
---> 78                 X = torch.from_numpy(X).to(self.device)
     79             else:  # is tensor
     80                 X = X.to(self.device)

RuntimeError: Numpy is not available

Does anyone know what's happening?

I use Databricks with a cluster 10.4 LTS https://docs.databricks.com/release-notes/runtime/10.4.html
I installed PyPots with pip install pypots

Thanks

Autocorrelation

Issue description

Which parameters in SAITS help to improve autocorrelation modelling? Thanks :)

Should add Pull-Request template

1. Feature description

PyPOTS needs a PR template.

2. Motivation

To make PRs to PyPOTS standardized. Also, a template can guide our contributors.

3. Your contribution

Will make a PR and link to this issue.

Question about imputation

Hi!

In the example you provided, the following code is used to impute the originally-missing values and artificially-missing values

imputation = saits.impute(X)

After I converted the imputation result to a DataFrame, I compared it with the original data and found that there was a big difference. Did I make a mistake? What should I do?

I used the following code to compare the original data and the imputation:

data['X'].head(5)

pd.DataFrame(imputation.reshape(-1, 37)).head(5)

CI-testing failed because of `protobuf`

Issue description

I notice that we're having some tasks fail in our CI-testing workflow, for example: https://github.com/WenjieDu/PyPOTS/actions/runs/5064524248/jobs/9092539446.

/usr/share/miniconda/envs/pypots-test/lib/python3.8/site-packages/google/protobuf/descriptor.py:51: in <module>
    from google.protobuf.pyext import _message
E   ImportError: /usr/share/miniconda/envs/pypots-test/lib/python3.8/site-packages/google/protobuf/pyext/_message.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZN6google8protobuf2io17SafeDoubleToFloatEd

I have investigated this error, and it has nothing to do with the code in PyPOTS. The root cause is that these failed jobs have conda installing a newer version of protobuf that is not compatible with other dependencies. It can be solved directly by pinning the version of protobuf to 4.21.12, which I have tested.
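For instance, pinning it in the test environment (a one-line fix, assuming the conda-based CI setup from the log above):

conda install -n pypots-test protobuf=4.21.12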

BRITS imputation test fails on cuda device mismatch

Hi,
the following happens when trying to run the imputation tests with commit 6dcc894 on the dev branch.

py3.9_cuda11.3_cudnn8.2.0_0

$ python -m pytest tests/test_imputation.py

./tests/test_imputation.py::TestBRITS::test_parameters Failed with Error: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
  File ".../unittest/case.py", line 59, in testPartExecutor
    yield
  File ".../unittest/case.py", line 588, in run
    self._callSetUp()
  File ".../unittest/case.py", line 547, in _callSetUp
    self.setUp()
  File ".../PyPOTS/pypots/tests/test_imputation.py", line 98, in setUp
    self.brits.fit(self.train_X, self.val_X)
  File "/PyPOTS/pypots/imputation/brits.py", line 504, in fit
    self._train_model(training_loader, val_loader, val_X_intact, val_X_indicating_mask)
  File "/PyPOTS/pypots/imputation/base.py", line 154, in _train_model
    if np.equal(self.best_loss, float("inf")):
  File .../lib/python3.9/site-packages/torch/_tensor.py", line 732, in __array__
    return self.numpy()
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

Separate each model into a single package

1. Feature description

PyPOTS should perhaps separate each model into its own package. Taking SAITS as an example, its package structure would look like the following:

├── pypots
│   ├── imputation
│   │   ├── saits
│   │   │   ├── __init__.py
│   │   │   ├── dataset.py
│   │   │   ├── model.py
│   │   │   └── module.py
  • model.py includes the main model/algorithm and the wrapper exposed to users;
  • module.py includes layers and modules for the main model/algorithm if necessary;
  • dataset.py includes the specifically-designed Dataset class for this model's data processing;

2. Motivation

To make the library more standardized, and for easier management.

3. Your contribution

Will create a PR to finish this.

Some problems in the demo

Question description

I tried the demo you have shown in the README, but it seems to have some problems when running the model. How can I fix it? Thank you!

Add templates for contributors to add new models

1. Feature description

All models have been separated into individual packages in PR #86 for better standardization and easier management. I think we need templates to guide our contributors in adding their models.

2. Motivation

This can make it less complicated for contributors to integrate their models into PyPOTS.

3. Your contribution

Will make a PR.

model giving same output prediction

Issue description

hello,
I have the problem that my trained model gives me the same output no matter which input I feed it.
Is it a problem with the data, or maybe with the model / my training of the model?
The dataset contains several labels (see the label-count figure "nlabels").

Now my problem is that when I use the classify feature to classify my test dataset, the only label which Raindrop predicts is 0 (figures "plot_raindrop_predictions" and "raindrop_prediction2").

I have only trained the model for a few epochs to check it, because training is pricey :), but could that be causing this issue, or should I change the data preprocessing?

learning-rate and pretrained model of SAITS

Hello, Wenjie,

I tried PyPOTS, and it's awesome! But I have the following questions:
(1) During training with the SAITS model, I found the learning rate recommended in the 'PhysioNet2012_SAITS_best.ini' file is lr = 0.00068277455043675505. I am wondering if there are good methods to arrive at such a learning rate? (I only know to set 0.001, 0.0001, or similarly round numbers.)
(2) Would it be possible to release the pretrained state_dict .pth files of SAITS (base) and SAITS? During training on my custom dataset, I ran into an early-stop problem within 100 epochs, so I want to see whether the same problem occurs on PhysioNet2012 with epochs = 10000.
Or the training log files of SAITS (base) and SAITS would be helpful!

Thank you very much for your reply!

Early stop

Wenjie,

I tried PyPOTS with the Beijing Air Quality dataset. For the dataset preparation, I followed gene_UCI_BeijingAirQuality_dataset. The following is the PyPOTS setup.

saits_base = SAITS(seq_len=seq_len, n_features=132, 
                   n_layers=2,  # num of group-inner layers
                   d_model=256, # model hidden dim
                   d_inner=128, # hidden size of feed forward layer
                   n_head=4, # head num of self-attention
                   d_k=64, d_v=64, # key dim, value dim
                   dropout=0, 
                   epochs=200,
                   patience=30,
                   batch_size=32,
                   weight_decay=1e-5,
                   ORT_weight=1,
                   MIT_weight=1,
                  )

saits_base.fit(train_set_X)

PyPOTS stops earlier than the number of epochs specified (around epoch 80), without triggering either print('Exceeded the training patience. Terminating the training procedure...') or print('Finished all training epochs.').

epoch 0: training loss 0.9637 
epoch 1: training loss 0.6161 
epoch 2: training loss 0.5177 
epoch 3: training loss 0.4783 
epoch 4: training loss 0.4489 
...
epoch 73: training loss 0.2462 
epoch 74: training loss 0.2460 
epoch 75: training loss 0.2480 
epoch 76: training loss 0.2452 
epoch 77: training loss 0.2452 
epoch 78: training loss 0.2458 
epoch 79: training loss 0.2449 
epoch 80: training loss 0.2423 
epoch 81: training loss 0.2425 
epoch 82: training loss 0.2443 
epoch 83: training loss 0.2403 
epoch 84: training loss 0.2406

Then I evaluated the model performance on the test set (not knowing why the model stopped early) as:

test_set_mae = cal_mae(test_set_imputation, test_set_X_intact, test_set_indicating_mask)
0.21866121846582318

I have a few questions:

  1. What could be the cause for the early stop?
  2. In addition, is there any object in saits_base that stores the loss history?
  3. Does the function cal_mae calculate the same MAE as in your paper (see the sketch below)? For this Beijing air quality case, should I be able to tune the hyperparameters to get test_set_mae down to around 0.146?

Thank you,
Haochen
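For reference on question 3: cal_mae with an indicating mask presumably computes the error only over the held-out positions, i.e. something like the following sketch:

import numpy as np

def masked_mae(predictions, targets, masks):
    # mean absolute error over positions where masks == 1 (the artificially-held-out values)
    return np.sum(np.abs(predictions - targets) * masks) / (np.sum(masks) + 1e-12)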

name 'MessagePassing' is not defined pypots0.0.9

File "/home/ubuntu/miniconda3/envs/pypots/lib/python3.8/site-packages/pypots/classification/init.py", line 10, in
from pypots.classification.raindrop import Raindrop
File "/home/ubuntu/miniconda3/envs/pypots/lib/python3.8/site-packages/pypots/classification/raindrop.py", line 83, in
class ObservationPropagation(MessagePassing):
NameError: name 'MessagePassing' is not defined

can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

PS C:\Users\Lyc\Downloads\PyPOTS-main\PyPOTS-main> & C:/Users/Lyc/AppData/Local/Programs/Python/Python39/python.exe c:/Users/Lyc/Downloads/PyPOTS-main/PyPOTS-main/pypots/tests/test_imputation.py
Running test cases for BRITS...
Model initialized successfully. Number of the trainable parameters: 580976
epoch 0: training loss 1.2366, validating loss 0.4201
epoch 1: training loss 0.8974, validating loss 0.3540
epoch 2: training loss 0.7426, validating loss 0.2919
epoch 3: training loss 0.6147, validating loss 0.2414
epoch 4: training loss 0.5411, validating loss 0.2157
ERunning test cases for BRITS...
Model initialized successfully. Number of the trainable parameters: 580976
epoch 0: training loss 1.2054, validating loss 0.4022
epoch 1: training loss 0.8631, validating loss 0.3399
epoch 2: training loss 0.7204, validating loss 0.2863
epoch 3: training loss 0.5995, validating loss 0.2399
epoch 4: training loss 0.5325, validating loss 0.2123
ERunning test cases for LOCF...
LOCF test_MAE: 0.17510570872656786
.Running test cases for LOCF...
.Running test cases for SAITS...
Model initialized successfully. Number of the trainable parameters: 1332704
epoch 0: training loss 0.9181, validating loss 0.2936
epoch 1: training loss 0.6287, validating loss 0.2303
epoch 2: training loss 0.5345, validating loss 0.2086
epoch 3: training loss 0.4735, validating loss 0.1895
epoch 4: training loss 0.4224, validating loss 0.1744
ERunning test cases for SAITS...
Model initialized successfully. Number of the trainable parameters: 1332704
epoch 0: training loss 0.7823, validating loss 0.2779
epoch 1: training loss 0.5015, validating loss 0.2250
epoch 2: training loss 0.4418, validating loss 0.2097
epoch 3: training loss 0.4119, validating loss 0.1994
epoch 4: training loss 0.3866, validating loss 0.1815
ERunning test cases for Transformer...
Model initialized successfully. Number of the trainable parameters: 666122
epoch 0: training loss 0.7715, validating loss 0.2843
epoch 1: training loss 0.4861, validating loss 0.2271
epoch 2: training loss 0.4176, validating loss 0.2077
epoch 3: training loss 0.3822, validating loss 0.2005
epoch 4: training loss 0.3592, validating loss 0.1961
ERunning test cases for Transformer...
Model initialized successfully. Number of the trainable parameters: 666122
epoch 0: training loss 0.8033, validating loss 0.2910
epoch 1: training loss 0.4856, validating loss 0.2345
epoch 2: training loss 0.4282, validating loss 0.2157
epoch 3: training loss 0.3882, validating loss 0.2051
epoch 4: training loss 0.3599, validating loss 0.1942
E

ERROR: test_impute (__main__.TestBRITS)

Traceback (most recent call last):
File "c:\Users\Lyc\Downloads\PyPOTS-main\PyPOTS-main\pypots\tests\test_imputation.py", line 125, in setUp
self.brits.fit(self.train_X, self.val_X)
File "C:\Users\Lyc\AppData\Local\Programs\Python\Python39\lib\site-packages\pypots\imputation\brits.py", line 504, in fit
self._train_model(training_loader, val_loader, val_X_intact, val_X_indicating_mask)
File "C:\Users\Lyc\AppData\Local\Programs\Python\Python39\lib\site-packages\pypots\imputation\base.py", line 142, in _train_model
if np.equal(self.best_loss, float('inf')):
File "C:\Users\Lyc\AppData\Local\Programs\Python\Python39\lib\site-packages\torch_tensor.py", line 732, in array
return self.numpy()
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

======================================================================
ERROR: test_parameters (__main__.TestBRITS)

Traceback (most recent call last):
File "c:\Users\Lyc\Downloads\PyPOTS-main\PyPOTS-main\pypots\tests\test_imputation.py", line 125, in setUp
self.brits.fit(self.train_X, self.val_X)
File "C:\Users\Lyc\AppData\Local\Programs\Python\Python39\lib\site-packages\pypots\imputation\brits.py", line 504, in fit
self._train_model(training_loader, val_loader, val_X_intact, val_X_indicating_mask)
File "C:\Users\Lyc\AppData\Local\Programs\Python\Python39\lib\site-packages\pypots\imputation\base.py", line 142, in _train_model
if np.equal(self.best_loss, float('inf')):
File "C:\Users\Lyc\AppData\Local\Programs\Python\Python39\lib\site-packages\torch_tensor.py", line 732, in array
return self.numpy()
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

======================================================================
ERROR: test_impute (__main__.TestSAITS)

Traceback (most recent call last):
File "c:\Users\Lyc\Downloads\PyPOTS-main\PyPOTS-main\pypots\tests\test_imputation.py", line 45, in setUp
self.saits.fit(self.train_X, self.val_X)
File "C:\Users\Lyc\AppData\Local\Programs\Python\Python39\lib\site-packages\pypots\imputation\saits.py", line 170, in fit
self._train_model(training_loader, val_loader, val_X_intact, val_X_indicating_mask)
File "C:\Users\Lyc\AppData\Local\Programs\Python\Python39\lib\site-packages\pypots\imputation\base.py", line 142, in _train_model
if np.equal(self.best_loss, float('inf')):
File "C:\Users\Lyc\AppData\Local\Programs\Python\Python39\lib\site-packages\torch_tensor.py", line 732, in array
return self.numpy()
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

======================================================================
ERROR: test_parameters (__main__.TestSAITS)

Traceback (most recent call last):
File "c:\Users\Lyc\Downloads\PyPOTS-main\PyPOTS-main\pypots\tests\test_imputation.py", line 45, in setUp
self.saits.fit(self.train_X, self.val_X)
File "C:\Users\Lyc\AppData\Local\Programs\Python\Python39\lib\site-packages\pypots\imputation\saits.py", line 170, in fit
self._train_model(training_loader, val_loader, val_X_intact, val_X_indicating_mask)
File "C:\Users\Lyc\AppData\Local\Programs\Python\Python39\lib\site-packages\pypots\imputation\base.py", line 142, in _train_model
if np.equal(self.best_loss, float('inf')):
File "C:\Users\Lyc\AppData\Local\Programs\Python\Python39\lib\site-packages\torch_tensor.py", line 732, in array
return self.numpy()
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

======================================================================
ERROR: test_impute (__main__.TestTransformer)

Traceback (most recent call last):
File "c:\Users\Lyc\Downloads\PyPOTS-main\PyPOTS-main\pypots\tests\test_imputation.py", line 89, in setUp
self.transformer.fit(self.train_X, self.val_X)
File "C:\Users\Lyc\AppData\Local\Programs\Python\Python39\lib\site-packages\pypots\imputation\transformer.py", line 256, in fit
self._train_model(training_loader, val_loader, val_X_intact, val_X_indicating_mask)
File "C:\Users\Lyc\AppData\Local\Programs\Python\Python39\lib\site-packages\pypots\imputation\base.py", line 142, in _train_model
if np.equal(self.best_loss, float('inf')):
File "C:\Users\Lyc\AppData\Local\Programs\Python\Python39\lib\site-packages\torch_tensor.py", line 732, in array
return self.numpy()
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

======================================================================
ERROR: test_parameters (__main__.TestTransformer)

Traceback (most recent call last):
File "c:\Users\Lyc\Downloads\PyPOTS-main\PyPOTS-main\pypots\tests\test_imputation.py", line 89, in setUp
self.transformer.fit(self.train_X, self.val_X)
File "C:\Users\Lyc\AppData\Local\Programs\Python\Python39\lib\site-packages\pypots\imputation\transformer.py", line 256, in fit
self._train_model(training_loader, val_loader, val_X_intact, val_X_indicating_mask)
File "C:\Users\Lyc\AppData\Local\Programs\Python\Python39\lib\site-packages\pypots\imputation\base.py", line 142, in _train_model
if np.equal(self.best_loss, float('inf')):
File "C:\Users\Lyc\AppData\Local\Programs\Python\Python39\lib\site-packages\torch_tensor.py", line 732, in array
return self.numpy()
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.


Ran 8 tests in 176.311s

FAILED (errors=6)

Variable sequence length

1. Feature description

To enable variable sequence lengths in the input data.

2. Motivation

Some of the input training data are composed of multiple concatenated time series of different lengths.

3. Your contribution

I will try to help

The starter tutorial cannot run well. AttributeError: the `load_load` function does not exist.

1. System Info

Latest pypots (v0.1.1)
python 3.10
Debian 11

2. Information

  • The official example scripts
  • My own created scripts

3. Reproduction

Simply run the Quick-start Examples

4. Expected behavior

I met this error:

Traceback (most recent call last):
  File "/home/zhuyh/projects/ImputeEHR/pypots.py", line 46, in <module>
    saits.load_load("examples/saits/manually_saved_saits_model")
AttributeError: 'SAITS' object has no attribute 'load_load'

Too low classification metrics when using classification models

1. System Info

According to feedback from our community, the reported classification metrics (e.g. PR-AUC, ROC-AUC) are too low. This shouldn't be the case and must be a bug.

2. Information

  • The official example scripts
  • My own created scripts

3. Reproduction

A figure from one of our PyPOTS users:


4. Expected behavior

PyPOTS reports very low ROC-AUC, around 0.5.
