
helme / ecg_ptbxl_benchmarking

Public repository associated with "Deep Learning for ECG Analysis: Benchmarks and Insights from PTB-XL"

License: GNU General Public License v3.0

Python 88.02% Shell 0.61% Jupyter Notebook 11.37%

ecg_ptbxl_benchmarking's People

Contributors

helme, nstrodt, rsaite


ecg_ptbxl_benchmarking's Issues

Readme: How to use pretrained model

Hey there!
Thanks for the paper and the repo; both are great resources. I was wondering if you could write a small how-to on fine-tuning one of your pre-trained models. That'd be very helpful.

Best,
Christian

About Fig3 in paper

Hi! Thank you for your very helpful work!
I would like to know the exact values in Fig. 3: effect of transfer learning from PTB-XL to ICBEB2018 upon varying the size of the ICBEB2018 training set.
Thanks!

Error in reproducing the results

I have followed the instructions on this page to reproduce the results. The script runs smoothly at first but then throws the error below:

(normal output omitted)

model: fastai_inception1d
aggregating predictions...                                                                                                           
model: fastai_inception1d
aggregating predictions...                                                                                                           
model: fastai_inception1d
aggregating predictions...                                                                                                           
2022-01-25 13:27:06.109375: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  SSE4.1 SSE4.2 AVX AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-01-25 13:27:06.137879: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 2899885000 Hz
2022-01-25 13:27:06.138434: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55bafb1f5e60 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2022-01-25 13:27:06.138454: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
Epoch 1/30
126/137 [==========================>...] - ETA: 0s - loss: 0.2361
Epoch 00001: val_loss improved from inf to 0.11070, saving model to ../output/exp0/models/Wavelet+NN/best_loss_model.h5
137/137 [==============================] - 0s 2ms/step - loss: 0.2278 - val_loss: 0.1107
Epoch 2/30
125/137 [==========================>...] - ETA: 0s - loss: 0.1121
Epoch 00002: val_loss improved from 0.11070 to 0.09673, saving model to ../output/exp0/models/Wavelet+NN/best_loss_model.h5
137/137 [==============================] - 0s 1ms/step - loss: 0.1113 - val_loss: 0.0967
Epoch 3/30
129/137 [===========================>..] - ETA: 0s - loss: 0.0995
Epoch 00003: val_loss improved from 0.09673 to 0.09149, saving model to ../output/exp0/models/Wavelet+NN/best_loss_model.h5
137/137 [==============================] - 0s 1ms/step - loss: 0.0992 - val_loss: 0.0915
Epoch 4/30
129/137 [===========================>..] - ETA: 0s - loss: 0.0925
Epoch 00004: val_loss improved from 0.09149 to 0.08855, saving model to ../output/exp0/models/Wavelet+NN/best_loss_model.h5
137/137 [==============================] - 0s 1ms/step - loss: 0.0923 - val_loss: 0.0886
Epoch 5/30
129/137 [===========================>..] - ETA: 0s - loss: 0.0881
Epoch 00005: val_loss improved from 0.08855 to 0.08636, saving model to ../output/exp0/models/Wavelet+NN/best_loss_model.h5
137/137 [==============================] - 0s 1ms/step - loss: 0.0881 - val_loss: 0.0864
Epoch 6/30
129/137 [===========================>..] - ETA: 0s - loss: 0.0849
Epoch 00006: val_loss improved from 0.08636 to 0.08473, saving model to ../output/exp0/models/Wavelet+NN/best_loss_model.h5
137/137 [==============================] - 0s 1ms/step - loss: 0.0847 - val_loss: 0.0847
Epoch 7/30
125/137 [==========================>...] - ETA: 0s - loss: 0.0822
Epoch 00007: val_loss improved from 0.08473 to 0.08354, saving model to ../output/exp0/models/Wavelet+NN/best_loss_model.h5
137/137 [==============================] - 0s 1ms/step - loss: 0.0820 - val_loss: 0.0835
Epoch 8/30
129/137 [===========================>..] - ETA: 0s - loss: 0.0803
Epoch 00008: val_loss improved from 0.08354 to 0.08275, saving model to ../output/exp0/models/Wavelet+NN/best_loss_model.h5
137/137 [==============================] - 0s 1ms/step - loss: 0.0803 - val_loss: 0.0828
Epoch 9/30
130/137 [===========================>..] - ETA: 0s - loss: 0.0782
Epoch 00009: val_loss improved from 0.08275 to 0.08182, saving model to ../output/exp0/models/Wavelet+NN/best_loss_model.h5
137/137 [==============================] - 0s 1ms/step - loss: 0.0782 - val_loss: 0.0818
Epoch 10/30
130/137 [===========================>..] - ETA: 0s - loss: 0.0767
Epoch 00010: val_loss improved from 0.08182 to 0.08148, saving model to ../output/exp0/models/Wavelet+NN/best_loss_model.h5
137/137 [==============================] - 0s 1ms/step - loss: 0.0768 - val_loss: 0.0815
Epoch 11/30
131/137 [===========================>..] - ETA: 0s - loss: 0.0754
Epoch 00011: val_loss improved from 0.08148 to 0.08059, saving model to ../output/exp0/models/Wavelet+NN/best_loss_model.h5
137/137 [==============================] - 0s 1ms/step - loss: 0.0754 - val_loss: 0.0806
Epoch 12/30
128/137 [===========================>..] - ETA: 0s - loss: 0.0742
Epoch 00012: val_loss improved from 0.08059 to 0.08009, saving model to ../output/exp0/models/Wavelet+NN/best_loss_model.h5
137/137 [==============================] - 0s 1ms/step - loss: 0.0743 - val_loss: 0.0801
Epoch 13/30
129/137 [===========================>..] - ETA: 0s - loss: 0.0734
Epoch 00013: val_loss improved from 0.08009 to 0.07966, saving model to ../output/exp0/models/Wavelet+NN/best_loss_model.h5
137/137 [==============================] - 0s 1ms/step - loss: 0.0733 - val_loss: 0.0797
Epoch 14/30
127/137 [==========================>...] - ETA: 0s - loss: 0.0723
Epoch 00014: val_loss improved from 0.07966 to 0.07932, saving model to ../output/exp0/models/Wavelet+NN/best_loss_model.h5
137/137 [==============================] - 0s 1ms/step - loss: 0.0724 - val_loss: 0.0793
Epoch 15/30
129/137 [===========================>..] - ETA: 0s - loss: 0.0717
Epoch 00015: val_loss improved from 0.07932 to 0.07899, saving model to ../output/exp0/models/Wavelet+NN/best_loss_model.h5
137/137 [==============================] - 0s 1ms/step - loss: 0.0716 - val_loss: 0.0790
Epoch 16/30
130/137 [===========================>..] - ETA: 0s - loss: 0.0706
Epoch 00016: val_loss improved from 0.07899 to 0.07862, saving model to ../output/exp0/models/Wavelet+NN/best_loss_model.h5
137/137 [==============================] - 0s 1ms/step - loss: 0.0706 - val_loss: 0.0786
Epoch 17/30
124/137 [==========================>...] - ETA: 0s - loss: 0.0701
Epoch 00017: val_loss improved from 0.07862 to 0.07860, saving model to ../output/exp0/models/Wavelet+NN/best_loss_model.h5
137/137 [==============================] - 0s 1ms/step - loss: 0.0699 - val_loss: 0.0786
Epoch 18/30
128/137 [===========================>..] - ETA: 0s - loss: 0.0693
Epoch 00018: val_loss improved from 0.07860 to 0.07816, saving model to ../output/exp0/models/Wavelet+NN/best_loss_model.h5
137/137 [==============================] - 0s 1ms/step - loss: 0.0694 - val_loss: 0.0782
Epoch 19/30
131/137 [===========================>..] - ETA: 0s - loss: 0.0687
Epoch 00019: val_loss did not improve from 0.07816
137/137 [==============================] - 0s 1ms/step - loss: 0.0688 - val_loss: 0.0782
Epoch 20/30
131/137 [===========================>..] - ETA: 0s - loss: 0.0683
Epoch 00020: val_loss improved from 0.07816 to 0.07803, saving model to ../output/exp0/models/Wavelet+NN/best_loss_model.h5
137/137 [==============================] - 0s 1ms/step - loss: 0.0683 - val_loss: 0.0780
Epoch 21/30
128/137 [===========================>..] - ETA: 0s - loss: 0.0676
Epoch 00021: val_loss improved from 0.07803 to 0.07780, saving model to ../output/exp0/models/Wavelet+NN/best_loss_model.h5
137/137 [==============================] - 0s 1ms/step - loss: 0.0676 - val_loss: 0.0778
Epoch 22/30
128/137 [===========================>..] - ETA: 0s - loss: 0.0670
Epoch 00022: val_loss did not improve from 0.07780
137/137 [==============================] - 0s 1ms/step - loss: 0.0672 - val_loss: 0.0779
Epoch 23/30
129/137 [===========================>..] - ETA: 0s - loss: 0.0669
Epoch 00023: val_loss improved from 0.07780 to 0.07774, saving model to ../output/exp0/models/Wavelet+NN/best_loss_model.h5
137/137 [==============================] - 0s 1ms/step - loss: 0.0668 - val_loss: 0.0777
Epoch 24/30
128/137 [===========================>..] - ETA: 0s - loss: 0.0663
Epoch 00024: val_loss improved from 0.07774 to 0.07758, saving model to ../output/exp0/models/Wavelet+NN/best_loss_model.h5
137/137 [==============================] - 0s 1ms/step - loss: 0.0664 - val_loss: 0.0776
Epoch 25/30
131/137 [===========================>..] - ETA: 0s - loss: 0.0659
Epoch 00025: val_loss improved from 0.07758 to 0.07726, saving model to ../output/exp0/models/Wavelet+NN/best_loss_model.h5
137/137 [==============================] - 0s 1ms/step - loss: 0.0660 - val_loss: 0.0773
Epoch 26/30
127/137 [==========================>...] - ETA: 0s - loss: 0.0656
Epoch 00026: val_loss did not improve from 0.07726
137/137 [==============================] - 0s 1ms/step - loss: 0.0654 - val_loss: 0.0773
Epoch 27/30
128/137 [===========================>..] - ETA: 0s - loss: 0.0652
Epoch 00027: val_loss did not improve from 0.07726
137/137 [==============================] - 0s 1ms/step - loss: 0.0652 - val_loss: 0.0773
Epoch 28/30
131/137 [===========================>..] - ETA: 0s - loss: 0.0648
Epoch 00028: val_loss improved from 0.07726 to 0.07723, saving model to ../output/exp0/models/Wavelet+NN/best_loss_model.h5
137/137 [==============================] - 0s 1ms/step - loss: 0.0647 - val_loss: 0.0772
Epoch 29/30
125/137 [==========================>...] - ETA: 0s - loss: 0.0645
Epoch 00029: val_loss improved from 0.07723 to 0.07714, saving model to ../output/exp0/models/Wavelet+NN/best_loss_model.h5
137/137 [==============================] - 0s 1ms/step - loss: 0.0645 - val_loss: 0.0771
Epoch 30/30
129/137 [===========================>..] - ETA: 0s - loss: 0.0641
Epoch 00030: val_loss improved from 0.07714 to 0.07690, saving model to ../output/exp0/models/Wavelet+NN/best_loss_model.h5
137/137 [==============================] - 0s 1ms/step - loss: 0.0640 - val_loss: 0.0769
Traceback (most recent call last):
  File "reproduce_results.py", line 59, in <module>
    main()
  File "reproduce_results.py", line 40, in main
    e.perform()
  File "/home/office-401-1/Downloads/ecg_ptbxl_benchmarking/code/experiments/scp_experiment.py", line 114, in perform
    model.predict(self.X_train).dump(mpath+'y_train_pred.npy')
  File "/home/office-401-1/Downloads/ecg_ptbxl_benchmarking/code/models/wavelet.py", line 158, in predict
    model = load_model(self.outputfolder+'best_loss_model.h5')#'best_score_model.h5', custom_objects={'keras_macro_auroc': keras_macro_auroc})
  File "/home/office-401-1/Documents/Anaconda3/envs/ecg_env/lib/python3.8/site-packages/tensorflow/python/keras/saving/save.py", line 182, in load_model
    return hdf5_format.load_model_from_hdf5(filepath, custom_objects, compile)
  File "/home/office-401-1/Documents/Anaconda3/envs/ecg_env/lib/python3.8/site-packages/tensorflow/python/keras/saving/hdf5_format.py", line 176, in load_model_from_hdf5
    model_config = json.loads(model_config.decode('utf-8'))
AttributeError: 'str' object has no attribute 'decode'

Can someone help me?
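For what it's worth, the .decode('utf-8') failure in hdf5_format.py is the well-known incompatibility between h5py >= 3.0, which returns HDF5 string attributes as str, and the older tf.keras loader here, which still calls .decode() on them. A minimal diagnostic, with the usual remedy noted in the comments and under the assumption that nothing else in the environment differs from the repo's requirements:

import h5py
import tensorflow as tf

# If h5py reports a 3.x version together with this TensorFlow release,
# pinning h5py below 3.0 (e.g. h5py==2.10.0) is the common fix for the
# "'str' object has no attribute 'decode'" error when loading .h5 models.
print("h5py:", h5py.__version__, "tensorflow:", tf.__version__)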

Preprocessing problem

Hello, thank you for the great work, it is very useful!

I am trying to figure out how the ECG data were preprocessed. Looking at the "raw" PTB-XL dataset, I see that the mean values of the ECGs are near 0.0 and the standard deviations are around 0.1 - 0.2, which differs from, e.g., an Apple Watch ECG (whose amplitude is much larger than in PTB-XL), so I suspect the PTB-XL ECGs were normalized somehow. Could you please clarify how the PTB-XL data were preprocessed?
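As a side note, the physical units of the raw signals can be inspected directly with wfdb; a minimal sketch, assuming the 100 Hz records sit locally under data/ptbxl/:

import wfdb

# Read one 100 Hz record and inspect the header metadata wfdb returns.
signal, fields = wfdb.rdsamp('data/ptbxl/records100/00000/00001_lr')
print(fields['units'])        # per-lead physical units (mV in PTB-XL)
print(signal.mean(axis=0))    # per-lead mean
print(signal.std(axis=0))     # per-lead standard deviation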

Everything in this repository isn't working at all

I am trying to reproduce your research, but in the end, I only get great disappointment and numerous errors.

Initially, I attempted to import the env from the .yml file you provided, yet I got the following ResolvePackageNotFound error.

  • dbus=1.13.6
  • readline=8.0
  • libgomp=9.3.0
  • torchvision=0.5.0
  • openh264=2.1.1
  • nettle=3.6
  • _openmp_mutex=4.5
  • brunsli=0.1
  • gnutls=3.6.13
  • libnghttp2=1.41.0
  • nss=3.59
  • ld_impl_linux-64=2.35.1
  • libuuid=2.32.1
  • libglu=9.0.0
  • gmp=6.2.1
  • x264=1!152.20180806
  • libxkbcommon=0.10.0
  • libgfortran5=9.3.0
  • libedit=3.1.20191231
  • ncurses=6.2
  • nspr=4.29
  • libgcc-ng=9.3.0
  • gst-plugins-base=1.14.5
  • libstdcxx-ng=9.3.0
  • libev=4.33
  • libgfortran-ng=9.3.0
  • gstreamer=1.14.5

So, I removed the above packages from the .yml file to bypass this issue.
Unfortunately, more errors popped up. Your environment has endless package conflicts that are impossible to resolve.

The following link details the conflict errors while importing your env; there is too much content to post it here.
Conflict Info.

I have never seen so many conflicts in any environment before. To be honest, I even question whether any script can be executed properly in such an environment.

Anyway, I still have not given up. I cloned my own environment and pip installed the fastai library.

Then I got stuck due to the _pickle.UnpicklingError that I mentioned in #24.
Moreover, not only does reproduce_results.py fail to run correctly, Finetuning-Example.ipynb is not working either.

I got the following error while fine-tuning.
Screenshot 2023-05-26 192620
And here is the full text of the Jupyter kernel output:

OSError                                   Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_856\2871713832.py in <cell line: 1>()
----> 1 model.fit(X_train, y_train, X_val, y_val)

D:\訓練模型暫存區\CVDs_with_xResnet\Original_Article_Source_Code\code\models\fastai_model.py in fit(self, X_train, y_train, X_val, y_val)
    280             learn.unfreeze()
    281             lr_find_plot(learn, self.outputfolder,"lr_find"+str(len(layer_groups)))
--> 282             learn.fit_one_cycle(self.epochs_finetuning,slice(lr/1000,lr/10))
    283             losses_plot(learn, self.outputfolder,"losses"+str(len(layer_groups)))
    284 

~\anaconda3\envs\env_ptbxl_benchmark\lib\site-packages\fastai\train.py in fit_one_cycle(learn, cyc_len, max_lr, moms, div_factor, pct_start, final_div, wd, callbacks, tot_epochs, start_epoch)
     21     callbacks.append(OneCycleScheduler(learn, max_lr, moms=moms, div_factor=div_factor, pct_start=pct_start,
     22                                        final_div=final_div, tot_epochs=tot_epochs, start_epoch=start_epoch))
---> 23     learn.fit(cyc_len, max_lr, wd=wd, callbacks=callbacks)
     24 
     25 def fit_fc(learn:Learner, tot_epochs:int=1, lr:float=defaults.lr,  moms:Tuple[float,float]=(0.95,0.85), start_pct:float=0.72,

~\anaconda3\envs\env_ptbxl_benchmark\lib\site-packages\fastai\basic_train.py in fit(self, epochs, lr, wd, callbacks)
    198         else: self.opt.lr,self.opt.wd = lr,wd
    199         callbacks = [cb(self) for cb in self.callback_fns + listify(defaults.extra_callback_fns)] + listify(callbacks)
--> 200         fit(epochs, self, metrics=self.metrics, callbacks=self.callbacks+callbacks)
    201 
    202     def create_opt(self, lr:Floats, wd:Floats=0.)->None:

~\anaconda3\envs\env_ptbxl_benchmark\lib\site-packages\fastai\basic_train.py in fit(epochs, learn, callbacks, metrics)
     97             cb_handler.set_dl(learn.data.train_dl)
     98             cb_handler.on_epoch_begin()
---> 99             for xb,yb in progress_bar(learn.data.train_dl, parent=pbar):
    100                 xb, yb = cb_handler.on_batch_begin(xb, yb)
    101                 loss = loss_batch(learn.model, xb, yb, learn.loss_func, learn.opt, cb_handler)

~\anaconda3\envs\env_ptbxl_benchmark\lib\site-packages\fastprogress\fastprogress.py in __iter__(self)
     48         except Exception as e:
     49             self.on_interrupt()
---> 50             raise e
     51 
     52     def update(self, val):

~\anaconda3\envs\env_ptbxl_benchmark\lib\site-packages\fastprogress\fastprogress.py in __iter__(self)
     39         if self.total != 0: self.update(0)
     40         try:
---> 41             for i,o in enumerate(self.gen):
     42                 if self.total and i >= self.total: break
     43                 yield o

~\anaconda3\envs\env_ptbxl_benchmark\lib\site-packages\fastai\basic_data.py in __iter__(self)
     73     def __iter__(self):
     74         "Process and returns items from `DataLoader`."
---> 75         for b in self.dl: yield self.proc_batch(b)
     76 
     77     @classmethod

~\anaconda3\envs\env_ptbxl_benchmark\lib\site-packages\torch\utils\data\dataloader.py in __iter__(self)
    439             return self._iterator
    440         else:
--> 441             return self._get_iterator()
    442 
    443     @property

~\anaconda3\envs\env_ptbxl_benchmark\lib\site-packages\torch\utils\data\dataloader.py in _get_iterator(self)
    386         else:
    387             self.check_worker_number_rationality()
--> 388             return _MultiProcessingDataLoaderIter(self)
    389 
    390     @property

~\anaconda3\envs\env_ptbxl_benchmark\lib\site-packages\torch\utils\data\dataloader.py in __init__(self, loader)
   1040             #     before it starts, and __del__ tries to join but will get:
   1041             #     AssertionError: can only join a started process.
-> 1042             w.start()
   1043             self._index_queues.append(index_queue)
   1044             self._workers.append(w)

~\anaconda3\envs\env_ptbxl_benchmark\lib\multiprocessing\process.py in start(self)
    119                'daemonic processes are not allowed to have children'
    120         _cleanup()
--> 121         self._popen = self._Popen(self)
    122         self._sentinel = self._popen.sentinel
    123         # Avoid a refcycle if the target function holds an indirect

~\anaconda3\envs\env_ptbxl_benchmark\lib\multiprocessing\context.py in _Popen(process_obj)
    222     @staticmethod
    223     def _Popen(process_obj):
--> 224         return _default_context.get_context().Process._Popen(process_obj)
    225 
    226 class DefaultContext(BaseContext):

~\anaconda3\envs\env_ptbxl_benchmark\lib\multiprocessing\context.py in _Popen(process_obj)
    325         def _Popen(process_obj):
    326             from .popen_spawn_win32 import Popen
--> 327             return Popen(process_obj)
    328 
    329     class SpawnContext(BaseContext):

~\anaconda3\envs\env_ptbxl_benchmark\lib\multiprocessing\popen_spawn_win32.py in __init__(self, process_obj)
     91             try:
     92                 reduction.dump(prep_data, to_child)
---> 93                 reduction.dump(process_obj, to_child)
     94             finally:
     95                 set_spawning_popen(None)

~\anaconda3\envs\env_ptbxl_benchmark\lib\multiprocessing\reduction.py in dump(obj, file, protocol)
     58 def dump(obj, file, protocol=None):
     59     '''Replacement for pickle.dump() using ForkingPickler.'''
---> 60     ForkingPickler(file, protocol).dump(obj)
     61 
     62 #

OSError: [Errno 22] Invalid argument

Besides that, I also built an xresnet model from scratch. I get a validation loss close to your results; nevertheless, the confusion matrix for each super-diagnosis shows a poor F1 score, and I have no clue how you get such a high AUC score.

Number of samples and class memberships

Hi,
I am a little bit confused about the number of samples in the notebook Finetuning-Example.ipynb.
I see 21430 samples in total (train and validation together).
However, the PhysioNet paper states that there are 21837 records, and that is also what I see in the data/ptbxl/records100 folder.
Why are 407 records missing?

Another question: how can I interpret the label sets y_train and y_val?
Which representation (10000, 01000, 00100, 00010, 00001) corresponds to which class (NORM, MI, STTC, CD, HYP)?
I cannot map them according to the number of samples per class because they don't match.

Can you help me and clear that up? Thank you very much!
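On the label-mapping part of the question: assuming the experiment's data folder contains the fitted MultiLabelBinarizer that the preprocessing step pickles (the path below is illustrative, not guaranteed), the column-to-class mapping can be read off directly:

import pickle

# Illustrative path: the preprocessing step pickles the fitted
# MultiLabelBinarizer next to the label arrays of each experiment.
mlb = pickle.load(open('../output/exp1.1.1/data/mlb.pkl', 'rb'))

# Column k of y_train / y_val corresponds to mlb.classes_[k],
# so each one-hot position maps back to a class name.
print(mlb.classes_)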

Filtering sparsely populated classes

From the paper it seemed that all the classes have been included: "We deliberately decided not to exclude sparsely populated classes from the evaluation". However, the README here mentions: "In all cases we restrict to classes with more than 50 entries in the whole dataset".

Could you clarify whether the metrics reported here consider all the classes or remove classes with fewer than 50 samples?

Best,
Priya

LRP on PVC and PACE signals

Hello,
I was reading your article about LRP with the epsilon-rule (ε = 0.1) on ECG signals, but I cannot find it in this repository. Could you share it with me?
Thanks

Standardizer error

Hi! Thanks for the great repo and paper! Just wondering if I'm missing something in running the Finetuning notebook. All is good until preprocessing the data with the standardizer in utils.

import pickle

standard_scaler = pickle.load(open('../output/'+experiment+'/data/standard_scaler.pkl', "rb"))

X_train = utils.apply_standardizer(X_train, standard_scaler)
X_val = utils.apply_standardizer(X_val, standard_scaler)

generates this error

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-4-5fe1be60124f> in <module>
      3 standard_scaler = pickle.load(open('../output/'+experiment+'/data/standard_scaler.pkl', "rb"))
      4 
----> 5 X_train = utils.apply_standardizer(X_train, standard_scaler)
      6 X_val = utils.apply_standardizer(X_val, standard_scaler)

~/Documents/machine-learning/190/ecg/ecg_ptbxl_benchmarking/code/utils/utils.py in apply_standardizer(X, ss)
    329     for x in X:
    330         x_shape = x.shape
--> 331         X_tmp.append(ss.transform(x.flatten()[:,np.newaxis]).reshape(x_shape))
    332     X_tmp = np.array(X_tmp)
    333     return X_tmp

~/.conda/envs/ecg_env/lib/python3.8/site-packages/sklearn/preprocessing/_data.py in transform(self, X, copy)
    789 
    790         copy = copy if copy is not None else self.copy
--> 791         X = self._validate_data(X, reset=False,
    792                                 accept_sparse='csr', copy=copy,
    793                                 estimator=self, dtype=FLOAT_DTYPES,

~/.conda/envs/ecg_env/lib/python3.8/site-packages/sklearn/base.py in _validate_data(self, X, y, reset, validate_separately, **check_params)
    434 
    435         if check_params.get('ensure_2d', True):
--> 436             self._check_n_features(X, reset=reset)
    437 
    438         return out

~/.conda/envs/ecg_env/lib/python3.8/site-packages/sklearn/base.py in _check_n_features(self, X, reset)
    370         else:
    371             if not hasattr(self, 'n_features_in_'):
--> 372                 raise RuntimeError(
    373                     "The reset parameter is False but there is no "
    374                     "n_features_in_ attribute. Is this estimator fitted?"

RuntimeError: The reset parameter is False but there is no n_features_in_ attribute. Is this estimator fitted?

Best,
Francis
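This RuntimeError typically appears when a StandardScaler pickled under an older scikit-learn release is loaded into a newer one that expects the n_features_in_ attribute. If matching the original scikit-learn version is not an option, a hedged alternative is to re-fit a scaler on the notebook's own training data and pass it to utils.apply_standardizer; note this fits on the fine-tuning split rather than the original PTB-XL training split, so results may shift slightly:

from sklearn.preprocessing import StandardScaler
import numpy as np

# Re-fit a scaler on the flattened training samples, mirroring how
# utils.apply_standardizer flattens each record before calling transform().
standard_scaler = StandardScaler()
standard_scaler.fit(np.vstack([x.flatten()[:, np.newaxis] for x in X_train]))

X_train = utils.apply_standardizer(X_train, standard_scaler)
X_val = utils.apply_standardizer(X_val, standard_scaler)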

Use of multiprocessing

I've noticed that whenever multiprocessing.Pool() is called, the number of processes is hard-coded in your code. However, considering that people have different rigs, it would be better to leave it empty or set it to os.cpu_count(), with either:

pool = multiprocessing.Pool()

# or this one when using "n_jobs" such as in scp_experiment.py
n_jobs = os.cpu_count()
pool = multiprocessing.Pool(n_jobs)

Additionally, in wavelet.py the pool is never closed in the get_ecg_features() function, which causes some problems.

Best,
Eric
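Regarding the unclosed pool mentioned above, a minimal sketch of the pattern (not the repository's exact code, and feature_fn is a placeholder) that both picks up the machine's core count and releases the workers:

import multiprocessing

def get_features_parallel(data, feature_fn):
    # Pool() without arguments defaults to os.cpu_count() workers, and the
    # context manager closes and joins the pool when the block exits.
    with multiprocessing.Pool() as pool:
        return pool.map(feature_fn, data)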

SEModule & SimpleSelfAttention problem

The functions SEModule() and SimpleSelfAttention() used in xresnet1d.py, lines 143-144, cannot be found. Both functions can only be found in fastai version 2.7.9.

But

from fastai.basic_data import *
from fastai.basic_train import *
from fastai.train import *
from fastai.metrics import *
from fastai.torch_core import *
from fastai.callbacks.tracker import SaveModelCallback
from fastai.callback import Callback

cannot be imported successfully under fastai 2.7.9.

And in fastai_model.py line 78,

        if (self.one_hot_encode_target is True):
            # y_true_flat = one_hot_np(y_true_flat, last_output.size()[-1])
            y_true_flat = one_hot_np(y_true_flat, last_output.size()[-1])

the function one_hot_np() cannot be found either.

Do you have this problem? Were you using Windows or Linux?

Thank you.
Qian Li
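For reference, judging from the call site (integer labels in y_true_flat and the output width last_output.size()[-1]), one_hot_np appears to one-hot encode class indices; a hedged NumPy stand-in, not the authors' original helper, could look like:

import numpy as np

def one_hot_np(y, num_classes):
    # Map integer class indices to one-hot rows, e.g. 2 -> [0., 0., 1., ...].
    y = np.asarray(y, dtype=int)
    return np.eye(num_classes, dtype=np.float32)[y]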

Likelihood not considered in multi-label classification

Hi,
In your code I saw that even if scp_codes looks like {'Norm': 100.0, 'LVOLT': 0.0, 'SR': 0.0}, you take all of these labels as correct for the 71-way multi-label classification. Is that right? As I understand it, the dictionary maps label to likelihood, so shouldn't 'SR' and 'LVOLT' be removed from the multi-label target since they have likelihoods of zero? Correct me if I am wrong. Thank you.
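To make the question concrete, the filtering being asked about would look roughly like this (a sketch, not the repository's label-aggregation code):

import ast

# scp_codes is stored in ptbxl_database.csv as the string form of a dict
# mapping SCP statement -> likelihood; this keeps only non-zero entries.
codes = ast.literal_eval("{'NORM': 100.0, 'LVOLT': 0.0, 'SR': 0.0}")
labels = [code for code, likelihood in codes.items() if likelihood > 0]
print(labels)  # ['NORM']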

Bug in xresnet1d.py

This is a really interesting piece of work, and I think it will have a positive impact on the field.

I found that the implementations of some functions such as weight_norm, spectral_norm, InstanceNorm, SEModule, and SimpleSelfAttention are missing from the xresnet1d.py file. Can you provide a complete file?

Thank you sincerely.
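For orientation, a squeeze-and-excitation block for 1d inputs is small enough to sketch; the following is a generic stand-in (channel attention via global pooling and a bottleneck gate), not the missing fastai implementation:

import torch
import torch.nn as nn

class SEModule1d(nn.Module):
    """Generic squeeze-and-excitation block for (batch, channels, time) input."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        # Squeeze: global average pool over time; excite: channel-wise gate.
        w = self.gate(x.mean(dim=-1))
        return x * w.unsqueeze(-1)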

ICBEB preprocess throws error

The line data = np.array([signal for signal, meta in data]) in the load_raw_data_icbeb function throws: ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (6877,) + inhomogeneous part.

I checked the shapes in the data list and they differ for each signal. Can I pad with zeros to make all the shapes equal so that the NumPy array to be created is consistent?
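A minimal padding sketch for building a rectangular array out of variable-length ICBEB records (an assumption about the intended workaround, not the repository's loader):

import numpy as np

def pad_signals(signals, target_len=None):
    # Zero-pad each (length, leads) record along the time axis so that all
    # records share the same length and can be stacked into one array.
    target_len = target_len or max(s.shape[0] for s in signals)
    return np.array([
        np.pad(s, ((0, target_len - s.shape[0]), (0, 0)), mode='constant')
        for s in signals
    ])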

Extraction of representations from hidden layers

Dear Team,

thanks for this fantastic work, I think it will be of great value for the scientific community.

I kindly ask you to add the functionality to extract representations (i.e. the activations
of the hidden units) from a specified hidden layer of one or some of the models.

To be more specific, I think a single representation of 2.5-second-long 12-lead ECGs would be the best option (as pointed out in our private communication).

Typically the most interesting layers are those close to the output e.g. the last or second last hidden layer, but it could be also interesting to extract activations from a generic layer, in order to make sense of how networks "solve" this problem.

Thank you again and

Best Regards!

Alessio Ansuini
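While waiting for built-in support, hidden activations can usually be pulled out of the PyTorch module underlying a trained model with a forward hook; a generic sketch follows, where 'model' and the indexed submodule are placeholders rather than the repository's API:

import torch

activations = {}

def save_activation(name):
    # Record the output of the hooked layer on every forward pass.
    def hook(module, inputs, output):
        activations[name] = output.detach().cpu()
    return hook

# Placeholder model/layer; a 2.5 s, 100 Hz, 12-lead input is a (1, 12, 250) tensor.
handle = model[-2].register_forward_hook(save_activation('penultimate'))
with torch.no_grad():
    model(torch.randn(1, 12, 250))
handle.remove()
print(activations['penultimate'].shape)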

About Fig5 in paper

Hi! Thanks for your elaborate work on the ECG benchmark and dataset.

I wonder what the clustering results in Section IV-B are (I mean, what exactly are the high-error and low-error clusters?) and how Fig. 5 was obtained. Looking forward to your reply!

Pytorch Version

Hi, thanks for your solid and comprehensive work!
I'm a little confused about fastai; could you please release a PyTorch version of these models, especially the top-performing xresnet1d101?
Thanks a lot!

Unable to reproduce Fmax metric for PTB-XL

I managed to train the xresnet1d101 model on the PTB-XL dataset using the reproduce_results.py script with different experiments;
however, when I evaluate the model predictions, I get different results from the ones reported in Table II.

To evaluate the predictions I used the following script, where I adapted the code from the utils module to compute the optimal thresholds for the F1 score:

from pathlib import Path

import numpy as np
from sklearn.metrics import roc_auc_score
from tqdm import tqdm
from utils import utils


def find_optimal_cutoff_threshold_for_fbeta(
    target, predicted, beta, n_thresholds=100
):
    thresholds = np.linspace(0.00, 1, n_thresholds)
    scores = [
        utils.challenge_metrics(
            target, predicted > t, beta1=beta, beta2=beta, single=True
        )["F_beta_macro"]
        for t in thresholds
    ]
    optimal_idx = np.argmax(scores)
    return thresholds[optimal_idx]


def find_optimal_cutoff_thresholds_for_fbeta(y_true, y_pred, beta):
    print(f"optimize thresholds with respect to F{beta}")
    return [
        find_optimal_cutoff_threshold_for_fbeta(
            y_true[:, k][:, np.newaxis], y_pred[:, k][:, np.newaxis], beta
        )
        for k in tqdm(range(y_true.shape[1]))
    ]


beta = 1
path = Path("../output")
for exp in ["exp0", "exp1", "exp1.1", "exp1.1.1"]:
    print(f"experiment: {exp}")
    y_test = np.load(path / exp / "data" / "y_test.npy", allow_pickle=True)
    y_test_pred = np.load(
        path / exp / "models" / "fastai_xresnet1d101" / "y_test_pred.npy",
        allow_pickle=True,
    )

    thresholds = find_optimal_cutoff_thresholds_for_fbeta(
        y_test, y_test_pred, beta
    )
    y_pred_binary = utils.apply_thresholds(y_test_pred, thresholds)
    metrics = utils.challenge_metrics(
        y_test, y_pred_binary, beta1=beta, beta2=beta
    )
    metrics["macro_auc"] = roc_auc_score(y_test, y_test_pred, average="macro")
    print(metrics)

For the exp0 experiment I also evaluated the provided y_test_pred.npy file from this repository.

Using this setup I get the following results:

Experiment   Level         F1 max   Fmax (paper)
exp0         all           0.396    0.764
exp1         diag.         0.392    0.736
exp1.1       sub-diag.     0.523    0.760
exp1.1.1     super-diag.   0.722    0.815

Error loading/dumping data with sampling frequency 500: File too large

When initializing the SCPExperiment with sampling_frequency=500, I observe the following error:

$ time python reproduce_results.py

Traceback (most recent call last):
  File "reproduce_results.py", line 64, in <module>
    main()
  File "reproduce_results.py", line 43, in main
    e.prepare()
  File "/home/rsaite/medalcare/ecg_ptbxl_benchmarking/code/experiments/scp_experiment.py", line 57, in prepare
    self.datafolder, self.sampling_frequency
  File "/home/rsaite/medalcare/ecg_ptbxl_benchmarking/code/utils/utils.py", line 151, in load_dataset
    X = load_raw_data_ptbxl(Y, sampling_rate, path)
  File "/home/rsaite/medalcare/ecg_ptbxl_benchmarking/code/utils/utils.py", line 196, in load_raw_data_ptbxl
    data.dump(path + "raw500.npy")
  File "/home/rsaite/anaconda3/envs/ecg_ptbxl/lib/python3.7/site-packages/numpy/core/_methods.py", line 241, in _dump
    pickle.dump(self, f, protocol=protocol)
OverflowError: cannot serialize a string larger than 4GiB
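The 4 GiB ceiling comes from ndarray.dump, which pickles the array with a protocol that cannot serialize byte strings larger than 4 GiB. A hedged workaround, which would change the caching lines of load_raw_data_ptbxl shown in the traceback, is to store the array with np.save, which writes the data without that pickle limit:

import numpy as np

# Instead of data.dump(path + 'raw500.npy'):
np.save(path + 'raw500.npy', data)
# ...and load it back with:
data = np.load(path + 'raw500.npy', allow_pickle=True)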


wavelet ANN not working

Keras and TensorFlow are needed for the wavelet models but are not included in the environment. I tried installing the latest versions, but the program just hung at this point.

Training process always crashed due to "_pickle.UnpicklingError"

Hi, I am trying to reproduce your results. However, the training process always crashes with a "_pickle.UnpicklingError" after several epochs.

Here is the full output shown when the training process stopped:

runfile('F:/Revlis/reference/ecg_ptbxl_benchmarking-master/code/reproduce_results.py', wdir='F:/Revlis/reference/ecg_ptbxl_benchmarking-master/code')
Training from scratch...
model: fastai_xresnet1d101
epoch  train_loss  valid_loss  time
0      7.200954    #na#        01:03
LR Finder is complete, type {learner_name}.recorder.plot() to see the graph.
epoch  train_loss  valid_loss  time
0      0.705411    0.575564    01:34
1      0.539839    0.474647    01:30
2      0.423681    0.374968    01:30
3      0.371818    0.363357    01:29
Epoch 5/50: Traceback (most recent call last):

File "C:\Users\revlis_user\anaconda3\envs\env_ecg_benchmark\lib\site-packages\spyder_kernels\py3compat.py", line 356, in compat_exec
exec(code, globals, locals)

File "f:\revlis\reference\ecg_ptbxl_benchmarking-master\code\reproduce_results.py", line 49, in
main()

File "f:\revlis\reference\ecg_ptbxl_benchmarking-master\code\reproduce_results.py", line 29, in main
e.perform()

File "F:\Revlis\reference\ecg_ptbxl_benchmarking-master\code\experiments\scp_experiment.py", line 112, in perform
model.fit(self.X_train, self.y_train, self.X_val, self.y_val)

File "F:\Revlis\reference\ecg_ptbxl_benchmarking-master\code\models\fastai_model.py", line 236, in fit
learn.fit_one_cycle(self.epochs,self.lr)#slice(self.lr) if self.discriminative_lrs else self.lr)

File "C:\Users\revlis_user\anaconda3\envs\env_ecg_benchmark\lib\site-packages\fastai\train.py", line 23, in fit_one_cycle
learn.fit(cyc_len, max_lr, wd=wd, callbacks=callbacks)

File "C:\Users\revlis_user\anaconda3\envs\env_ecg_benchmark\lib\site-packages\fastai\basic_train.py", line 200, in fit
fit(epochs, self, metrics=self.metrics, callbacks=self.callbacks+callbacks)

File "C:\Users\revlis_user\anaconda3\envs\env_ecg_benchmark\lib\site-packages\fastai\basic_train.py", line 99, in fit
for xb,yb in progress_bar(learn.data.train_dl, parent=pbar):

File "C:\Users\revlis_user\anaconda3\envs\env_ecg_benchmark\lib\site-packages\fastprogress\fastprogress.py", line 50, in iter
raise e

File "C:\Users\revlis_user\anaconda3\envs\env_ecg_benchmark\lib\site-packages\fastprogress\fastprogress.py", line 41, in iter
for i,o in enumerate(self.gen):

File "C:\Users\revlis_user\anaconda3\envs\env_ecg_benchmark\lib\site-packages\fastai\basic_data.py", line 75, in iter
for b in self.dl: yield self.proc_batch(b)

File "C:\Users\revlis_user\anaconda3\envs\env_ecg_benchmark\lib\site-packages\torch\utils\data\dataloader.py", line 441, in iter
return self._get_iterator()

File "C:\Users\revlis_user\anaconda3\envs\env_ecg_benchmark\lib\site-packages\torch\utils\data\dataloader.py", line 388, in _get_iterator
return _MultiProcessingDataLoaderIter(self)

File "C:\Users\revlis_user\anaconda3\envs\env_ecg_benchmark\lib\site-packages\torch\utils\data\dataloader.py", line 1042, in init
w.start()

File "C:\Users\revlis_user\anaconda3\envs\env_ecg_benchmark\lib\multiprocessing\process.py", line 121, in start
self._popen = self._Popen(self)

File "C:\Users\revlis_user\anaconda3\envs\env_ecg_benchmark\lib\multiprocessing\context.py", line 224, in _Popen
return _default_context.get_context().Process._Popen(process_obj)

File "C:\Users\revlis_user\anaconda3\envs\env_ecg_benchmark\lib\multiprocessing\context.py", line 327, in _Popen
return Popen(process_obj)

File "C:\Users\revlis_user\anaconda3\envs\env_ecg_benchmark\lib\multiprocessing\popen_spawn_win32.py", line 93, in init
reduction.dump(process_obj, to_child)

File "C:\Users\revlis_user\anaconda3\envs\env_ecg_benchmark\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)

OSError: [Errno 22] Invalid argument

Traceback (most recent call last):
File "", line 1, in
File "C:\Users\revlis_user\anaconda3\envs\env_ecg_benchmark\lib\multiprocessing\spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "C:\Users\revlis_user\anaconda3\envs\env_ecg_benchmark\lib\multiprocessing\spawn.py", line 126, in _main
self = reduction.pickle.load(from_parent)
_pickle.UnpicklingError: pickle data was truncated

I believe this error results from multiprocessing, yet I have no clue about the solution.
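On Windows, DataLoader worker processes are started with the spawn method, so everything they need is pickled over to the child process, and oversized or unpicklable payloads surface exactly as the Errno 22 / truncated-pickle pair above. A common, hedged workaround is to disable worker processes where the fastai v1 DataBunch is created; train_ds and valid_ds below are placeholders for the datasets built elsewhere in fastai_model.py, and where exactly num_workers is set in that file is an assumption:

from fastai.basic_data import DataBunch

# num_workers=0 keeps data loading in the main process, so nothing has to be
# pickled to spawned worker processes on Windows.
data_bunch = DataBunch.create(train_ds, valid_ds, bs=128, num_workers=0)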
