
error when training with multiple GPUs: AttributeError: Can't pickle local object 'ExtractiveSummarizer.prepare_data.<locals>.longformer_modifier' (transformersum, 7 comments, closed)

hhousen commented on May 26, 2024
error when training with multiple GPUs: AttributeError: Can't pickle local object 'ExtractiveSummarizer.prepare_data.<locals>.longformer_modifier'

from transformersum.
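
For context, this failure is reproducible outside the library: multi-GPU training spawns worker processes, which pickle the objects they need, and Python's pickle cannot serialize a function defined inside another function. A minimal sketch (not the library's actual code):

```python
import pickle

def prepare_data():
    # A nested ("local") function, like the longformer_modifier closure in
    # the traceback. pickle serializes functions by reference to a
    # module-level name, which a local object does not have.
    def longformer_modifier(batch):
        return batch

    return longformer_modifier

fn = prepare_data()
try:
    pickle.dumps(fn)
except (pickle.PicklingError, AttributeError) as exc:
    # Message is roughly: Can't pickle local object
    # 'prepare_data.<locals>.longformer_modifier' (exact wording varies
    # slightly by Python version).
    print(exc)
```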

Comments (7)

HHousen commented on May 26, 2024

It's good to hear that you've started training! I've seen this type of error before during abstractive summarization and I should be able to fix it relatively quickly. I'll have a fix in the next few days.
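
The usual fix for this class of error (whether the library's eventual fix takes exactly this shape is an assumption) is to hoist the nested function to module level and bind any extra state with functools.partial instead of capturing it in a closure, since module-level functions, and partials of them, pickle cleanly:

```python
import pickle
from functools import partial

def longformer_modifier(batch, attention_window=512):
    # Module-level, so pickle can find it by name. attention_window stands
    # in for whatever state the closure used to capture (hypothetical here).
    batch["attention_window"] = attention_window
    return batch

# partial(...) pickles as long as the wrapped function and bound args do.
modifier = partial(longformer_modifier, attention_window=1024)
restored = pickle.loads(pickle.dumps(modifier))  # round-trips without error
print(restored({"ids": [1, 2]}))  # {'ids': [1, 2], 'attention_window': 1024}
```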


moyid commented on May 26, 2024

@HHousen - it's training!


moyid commented on May 26, 2024

sounds good, I'll try that.


HHousen commented on May 26, 2024

@moyid I believe that commit 597bc9d should fix the issue. Let me know if it works now.


moyid commented on May 26, 2024

@HHousen - sorry, one more issue: I think my performance is slow because of num_workers. I tried setting it via --dataloader_num_workers based on the documentation, but got the error main.py: error: unrecognized arguments: --dataloader_num_workers 1


HHousen commented on May 26, 2024

@moyid The --dataloader_num_workers argument works only for abstractive summarization. The reason you cannot change this option for extractive summarization is that the DataLoaders are created from torch.utils.data.IterableDatasets, and IterableDatasets replicate the same dataset object on each worker process. Thus, the replicas must be configured differently to avoid duplicated data. See the PyTorch documentation's description of iterable-style datasets and the IterableDataset docstring for more information.
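
The splitting pattern from the IterableDataset docstring can be sketched as follows (the `worker_slice` helper name is mine, but the arithmetic matches the docstring's example): each replica asks `get_worker_info()` which worker it is and yields only a disjoint slice of the data.

```python
import math

import torch
from torch.utils.data import IterableDataset, get_worker_info

def worker_slice(start, end, worker_id, num_workers):
    # Carve [start, end) into contiguous, disjoint per-worker ranges.
    per_worker = int(math.ceil((end - start) / num_workers))
    lo = start + worker_id * per_worker
    return range(lo, min(lo + per_worker, end))

class RangeDataset(IterableDataset):
    def __init__(self, start, end):
        super().__init__()
        self.start, self.end = start, end

    def __iter__(self):
        info = get_worker_info()
        if info is None:  # single-process data loading: yield everything
            return iter(range(self.start, self.end))
        # In a worker process: yield only this worker's slice, so the
        # replicated dataset objects do not emit duplicated data.
        return iter(worker_slice(self.start, self.end, info.id, info.num_workers))

# The two workers' slices tile the full range with no overlap:
parts = [list(worker_slice(0, 10, w, 2)) for w in range(2)]
print(parts)  # [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]
```

Without the `get_worker_info()` branch, each worker would iterate the full range and a DataLoader with num_workers=2 would yield every element twice.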

The docstring gives two examples of how to split an IterableDataset workload across all workers. However, I have not implemented this in the library. Ideally, I would simply use a normal Dataset, but I am not certain how to do that properly, since the entire dataset cannot be loaded into memory at once. I could potentially use Apache Arrow.

I was looking at how the huggingface/transformers seq2seq example deals with this problem. It uses the Dataset class instead of IterableDataset by means of the built-in Python linecache module, which I had not heard of before. Implementing this will require a significant refactoring of the library's extractive data-loading code, so I have opened a new issue for it (#27). The linecache module looks promising.
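
That approach can be sketched roughly like this (`LineByLineDataset` is a hypothetical name, and in practice it would subclass `torch.utils.data.Dataset`): `linecache` fetches single lines on demand and caches them, so random access by index works without loading the whole file up front.

```python
import linecache
import os
import tempfile

class LineByLineDataset:
    """One example per line of a text file; reads lazily via linecache."""

    def __init__(self, path):
        self.path = path
        # One streaming pass to learn the length; no lines are kept here.
        with open(path) as f:
            self.num_lines = sum(1 for _ in f)

    def __len__(self):
        return self.num_lines

    def __getitem__(self, idx):
        # linecache is 1-indexed and caches the file's lines on first access.
        return linecache.getline(self.path, idx + 1).rstrip("\n")

# Demo on a small temporary file:
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write("sent one\nsent two\nsent three\n")
ds = LineByLineDataset(f.name)
print(len(ds), ds[1])  # 3 sent two
os.remove(f.name)
```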


HHousen commented on May 26, 2024

Also @moyid, to determine whether the number of workers is actually the problem, you can train with the --profiler argument, which outputs how long certain functions took to run once training is complete. To speed up training, you can also train on only a fraction of the dataset using the --overfit_pct argument (for example, --overfit_pct 0.001 for 0.1% of the data). You can find more info in the pytorch-lightning profiler documentation.

