Comments (4)
I tried to change it, now I'm getting this error:
trainer = transformers.Trainer(
    model=model,
    train_dataset=tokenized_train_dataset,
    eval_dataset=tokenized_val_dataset if "validation" in tokenized_data else None,  # Use if validation exists
    args=training_args,
    data_collator=data_collator
)

TypeError Traceback (most recent call last)
File /opt/conda/lib/python3.10/site-packages/accelerate/utils/operations.py:155, in send_to_device(tensor, device, non_blocking, skip_keys)
154 try:
--> 155 return tensor.to(device, non_blocking=non_blocking)
156 except TypeError: # .to() doesn't accept non_blocking as kwarg
TypeError: BatchEncoding.to() got an unexpected keyword argument 'non_blocking'
During handling of the above exception, another exception occurred:
AttributeError Traceback (most recent call last)
Cell In[29], line 25
23 # train model
24 model.config.use_cache = False # silence the warnings. Please re-enable for inference!
---> 25 trainer.train()
File /opt/conda/lib/python3.10/site-packages/transformers/trainer.py:1780, in Trainer.train(self, resume_from_checkpoint, trial, ignore_keys_for_eval, **kwargs)
1778 hf_hub_utils.enable_progress_bars()
1779 else:
-> 1780 return inner_training_loop(
1781 args=args,
1782 resume_from_checkpoint=resume_from_checkpoint,
1783 trial=trial,
1784 ignore_keys_for_eval=ignore_keys_for_eval,
1785 )
File /opt/conda/lib/python3.10/site-packages/transformers/trainer.py:2085, in Trainer._inner_training_loop(self, batch_size, args, resume_from_checkpoint, trial, ignore_keys_for_eval)
2082 rng_to_sync = True
2084 step = -1
-> 2085 for step, inputs in enumerate(epoch_iterator):
2086 total_batched_samples += 1
2088 if self.args.include_num_input_tokens_seen:
File /opt/conda/lib/python3.10/site-packages/accelerate/data_loader.py:461, in DataLoaderShard.__iter__(self)
458 try:
459 # But we still move it to the device so it is done before `StopIteration` is reached
460 if self.device is not None:
--> 461 current_batch = send_to_device(current_batch, self.device)
462 next_batch = next(dataloader_iter)
463 if batch_index >= self.skip_batches:
File /opt/conda/lib/python3.10/site-packages/accelerate/utils/operations.py:157, in send_to_device(tensor, device, non_blocking, skip_keys)
155 return tensor.to(device, non_blocking=non_blocking)
156 except TypeError: # .to() doesn't accept non_blocking as kwarg
--> 157 return tensor.to(device)
158 except AssertionError as error:
159 # `torch.Tensor.to(<int num>)` is not supported by `torch_npu` (see this [issue](https://github.com/Ascend/pytorch/issues/16)).
160 # This call is inside the try-block since is_npu_available is not supported by torch.compile.
161 if is_npu_available():
File /opt/conda/lib/python3.10/site-packages/transformers/tokenization_utils_base.py:800, in BatchEncoding.to(self, device)
796 # This check catches things like APEX blindly calling "to" on all inputs to a module
797 # Otherwise it passes the casts down and casts the LongTensor containing the token idxs
798 # into a HalfTensor
799 if isinstance(device, str) or is_torch_device(device) or isinstance(device, int):
--> 800 self.data = {k: v.to(device=device) for k, v in self.data.items()}
801 else:
802 logger.warning(f"Attempting to cast a BatchEncoding to type {str(device)}. This is not supported.")
File /opt/conda/lib/python3.10/site-packages/transformers/tokenization_utils_base.py:800, in <dictcomp>(.0)
796 # This check catches things like APEX blindly calling "to" on all inputs to a module
797 # Otherwise it passes the casts down and casts the LongTensor containing the token idxs
798 # into a HalfTensor
799 if isinstance(device, str) or is_torch_device(device) or isinstance(device, int):
--> 800 self.data = {k: v.to(device=device) for k, v in self.data.items()}
801 else:
802 logger.warning(f"Attempting to cast a BatchEncoding to type {str(device)}. This is not supported.")
AttributeError: 'numpy.ndarray' object has no attribute 'to'
from autogptq.
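For context on the traceback above: the final AttributeError ('numpy.ndarray' object has no attribute 'to') usually means the batches reaching the Trainer contain NumPy arrays rather than PyTorch tensors, so accelerate cannot move them to the GPU. A minimal sketch of one common fix, assuming the tokenized splits are datasets.Dataset objects and assuming the column names below (they are not taken from the post):
# Make the tokenized datasets return PyTorch tensors instead of NumPy arrays
tokenized_train_dataset.set_format(type="torch", columns=["input_ids", "attention_mask"])
tokenized_val_dataset.set_format(type="torch", columns=["input_ids", "attention_mask"])

# Alternatively, let a stock collator assemble the torch tensors for causal-LM training
from transformers import DataCollatorForLanguageModeling
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)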
Try using transformers==4.38.2 if not already. There are a crazy amount of fixes to tokenization between releases.
from autogptq.
I'm using the newest transformers every time I run the GPU, since I'm running it on the cloud.
I'm using 4.39.2.
from autogptq.
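If the exact version on the cloud image is in doubt, it can be checked in the notebook before deciding whether to pin to 4.38.2 as suggested above (a quick sanity check, nothing more):
import transformers
print(transformers.__version__)  # e.g. "4.39.2" on a recent image
# To pin the suggested version:  pip install "transformers==4.38.2"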
tokenizer.pad_token = tokenizer.eos_token
# Data collator compatible with SciBERT's output
tokenizer.add_special_tokens({'pad_token': '[PAD]'})
tokenized_train_dataset = tokenized_data["train"]  # Access the "train" split
# ... (rest of your code for defining model, training arguments, etc.)

# Ensure data is converted to PyTorch tensors (if necessary)
def my_data_collator(features):
    """
    Converts features (assumed to be tokenized sequences) to a dictionary expected by SciBERT model.
    Ensures PyTorch tensors are used (if necessary).
    Handles potential empty batches by raising an exception.
    """
    batch = []
    max_len = max(len(f["input_ids"]) for f in features)  # Find maximum sequence length
    for feature in features:
        # Pad shorter sequences with your preferred padding token (e.g., '[PAD]')
        padded_sequence = feature["input_ids"] + ([tokenizer.pad_token] * (max_len - len(feature["input_ids"])))
        batch.append({
            "input_ids": padded_sequence,
            "attention_mask": [1] * len(padded_sequence)  # Create attention mask
        })
    if not batch:  # Check for empty batch
        raise ValueError("Encountered empty batch during training. Check data preprocessing.")
    # Convert to PyTorch tensors (if needed)
    # ... (same as before)
    return batch

data_collator = my_data_collator  # Use your custom data collator

# ... (rest of your code)

# Create the Trainer (without training_config):
trainer = transformers.Trainer(
    model=model,
    train_dataset=tokenized_train_dataset,
    eval_dataset=tokenized_val_dataset,  # Use validation dataset if available
    args=training_args,
    data_collator=data_collator  # Use your custom data collator
)

# Train model
model.config.use_cache = False  # Silence warnings
trainer.train()

TypeError Traceback (most recent call last)
Cell In[40], line 56
54 # Train model
55 model.config.use_cache = False # Silence warnings
---> 56 trainer.train()
File /opt/conda/lib/python3.10/site-packages/transformers/trainer.py:1780, in Trainer.train(self, resume_from_checkpoint, trial, ignore_keys_for_eval, **kwargs)
1778 hf_hub_utils.enable_progress_bars()
1779 else:
-> 1780 return inner_training_loop(
1781 args=args,
1782 resume_from_checkpoint=resume_from_checkpoint,
1783 trial=trial,
1784 ignore_keys_for_eval=ignore_keys_for_eval,
1785 )
File /opt/conda/lib/python3.10/site-packages/transformers/trainer.py:2118, in Trainer._inner_training_loop(self, batch_size, args, resume_from_checkpoint, trial, ignore_keys_for_eval)
2115 self.control = self.callback_handler.on_step_begin(args, self.state, self.control)
2117 with self.accelerator.accumulate(model):
-> 2118 tr_loss_step = self.training_step(model, inputs)
2120 if (
2121 args.logging_nan_inf_filter
2122 and not is_torch_xla_available()
2123 and (torch.isnan(tr_loss_step) or torch.isinf(tr_loss_step))
...
)
(lm_head): Linear(in_features=4096, out_features=32000, bias=False)
)
)
) argument after ** must be a mapping, not list
from autogptq.
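A note on the last TypeError (argument after ** must be a mapping, not list): Trainer ultimately calls model(**inputs), so the data collator has to return a single dict of batched tensors, whereas my_data_collator above returns a list of per-example dicts and pads with the pad-token string rather than its id. A minimal sketch of a collator shaped the way Trainer expects, assuming a causal-LM objective (the labels key and the padding details are assumptions, not from the post):
import torch

def my_data_collator(features):
    # Trainer expects one mapping of batched tensors, not a list of dicts
    if not features:
        raise ValueError("Encountered empty batch during training. Check data preprocessing.")
    max_len = max(len(f["input_ids"]) for f in features)
    pad_id = tokenizer.pad_token_id  # pad with the token id, not the '[PAD]' string
    input_ids, attention_mask = [], []
    for f in features:
        ids = list(f["input_ids"])
        pad_len = max_len - len(ids)
        input_ids.append(ids + [pad_id] * pad_len)
        attention_mask.append([1] * len(ids) + [0] * pad_len)
    batch = {
        "input_ids": torch.tensor(input_ids, dtype=torch.long),
        "attention_mask": torch.tensor(attention_mask, dtype=torch.long),
    }
    batch["labels"] = batch["input_ids"].clone()  # assumption: causal-LM loss on the inputs
    return batch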
Related Issues (20)
- Llama-3 8B Instruct quantized to 8 Bit spits out gibberish in transformers `model.generate()` but works fine in vLLM? HOT 5
- [BUG]safetensors_rust.SafetensorError: Error while deserializing header: MetadataIncompleteBuffer
- [Question] Differences in quantization logic compared to AWQ
- [FEATURE] ADD SUPPORT DeepSeek-V2 HOT 1
- [BUG] ARM installation error
- [BUG] ROCm installation and building broken
- Target modules [] not found in the base model. Please check the target modules and try again.
- [BUG] Cannot install from source
- [BUG] Following the quant_with_alpaca.py example but keep getting "You shouldn't move a model that is dispatched using accelerate hooks." and the model is never saved. HOT 2
- [FEATURE] Models that support MOE do GPTQ
- [FEATURE] Add marlin24 support
- How to select between different kernels?
- Question about data shape difference between quantization and forward
- [FEATURE] Added code support to 5,6,7 bits quantization can you please add me as contributor I will create a new pull request HOT 4
- [BUG] Quantitative model Yi-1.5-9b-16K does not produce text output.
- How to install auto-gptq in GCC 8.5.0 environment?
- How to get a dequantized model?
- [BUG] Not able to install on Ubuntu 22.04 (subprocess-exited-with-error )
- [BUG]
- [FEATURE] ChatGLM Support Added