
Comments (1)

La1c commented on July 22, 2024

I played around with it a bit more, and the funny thing is that the following code works just fine.
The main difference is the order of operations: encoding the candidates first and the query afterwards.

import torch
from datasets import Dataset
from transformers import DataCollatorWithPadding

# `texts`, `query`, `tokenizer` and `model_ds` are defined earlier in the issue.
candidates_dataset = Dataset.from_dict({'text': texts})
candidates_dataset = candidates_dataset.map(
    lambda x: tokenizer(x['text'],
                        padding=False,
                        max_length=512,
                        truncation=True),
    batched=True, remove_columns=['text']
)
candidates_dataset.set_format(type='torch', columns=['input_ids',
                                                     'attention_mask',
                                                     'token_type_ids'])
tokenized_query = tokenizer(query,
                            padding=False,
                            max_length=512,
                            truncation=True,
                            return_tensors='pt')

data_collator = DataCollatorWithPadding(tokenizer, return_tensors='pt', padding=True)

dataloader = torch.utils.data.DataLoader(candidates_dataset,
                                         batch_size=8,
                                         collate_fn=data_collator,
                                         pin_memory=True)

# Here is the main difference from the snippet above: process the inputs from
# the data loader first and the query after that.
with torch.no_grad():
    for batch_texts in dataloader:
        batch_input = {
            'input_ids': batch_texts['input_ids'].to('cuda'),
            'token_type_ids': batch_texts['token_type_ids'].to('cuda'),
            'attention_mask': batch_texts['attention_mask'].to('cuda')
        }
        encoded = model_ds(**batch_input)  # note: overwritten on every batch

    query_input = {
        'input_ids': tokenized_query['input_ids'].to('cuda'),
        'token_type_ids': tokenized_query['token_type_ids'].to('cuda'),
        'attention_mask': tokenized_query['attention_mask'].to('cuda')
    }
    query_output = model_ds(**query_input)

I have no idea why it works this way, so any comment on the issue would be really appreciated.
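For context on the `padding=False` plus `DataCollatorWithPadding` combination in the snippet: the collator pads each batch only to the length of its longest sequence, not to the global `max_length`. A minimal sketch of that per-batch dynamic padding in plain Python (no tokenizer or GPU assumed; `pad_id=0` is an arbitrary stand-in for the real pad token id):

```python
def pad_batch(batch, pad_id=0):
    """Pad every sequence in `batch` to the length of its longest member,
    mimicking what DataCollatorWithPadding does per batch."""
    max_len = max(len(seq) for seq in batch)
    input_ids = [seq + [pad_id] * (max_len - len(seq)) for seq in batch]
    # attention_mask marks real tokens with 1 and padding with 0
    attention_mask = [[1] * len(seq) + [0] * (max_len - len(seq))
                      for seq in batch]
    return {"input_ids": input_ids, "attention_mask": attention_mask}

# Two token sequences of different lengths (ids are illustrative only)
batch = [[101, 7592, 102], [101, 2088, 999, 2003, 102]]
padded = pad_batch(batch)
# Both sequences are padded only to this batch's max length (5), not to 512.
```

Because each batch is padded independently, short batches stay short, which is why tokenizing with `padding=False` and deferring padding to the collator is usually cheaper than `padding='max_length'`.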

from deepspeed.
