Comments (30)
Please paste the params used and the model.
Hi @abhishekkrthakur, these are the details:
Task = LLM SFT
Model = mistralai/Mixtral-8x7B-Instruct-v0.1
{
"block_size": 1024,
"model_max_length": 2048,
"padding": "right",
"use_flash_attention_2": false,
"disable_gradient_checkpointing": false,
"logging_steps": -1,
"evaluation_strategy": "epoch",
"save_total_limit": 1,
"save_strategy": "epoch",
"auto_find_batch_size": false,
"mixed_precision": "fp16",
"lr": 0.00003,
"epochs": 3,
"batch_size": 2,
"warmup_ratio": 0.1,
"gradient_accumulation": 1,
"optimizer": "adamw_torch",
"scheduler": "linear",
"weight_decay": 0,
"max_grad_norm": 1,
"seed": 42,
"chat_template": "none",
"quantization": "int4",
"target_modules": "all-linear",
"merge_adapter": false,
"peft": true,
"lora_r": 16,
"lora_alpha": 32,
"lora_dropout": 0.05
}
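For reference, here is a minimal sketch of roughly what these parameters map to in plain transformers + peft. This is an illustration only, not AutoTrain's internal code, and the int4 loading assumes a CUDA GPU with bitsandbytes and accelerate installed:

```python
# Rough, illustrative equivalent of the AutoTrain params above using transformers + peft.
# Not AutoTrain's internal code; int4 loading via bitsandbytes requires a CUDA GPU.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"

# "quantization": "int4" -> load the base model in 4-bit
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,  # matches "mixed_precision": "fp16"
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config)

# "peft": true with the LoRA hyperparameters from the params above
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules="all-linear",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```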
Are you running it on Windows? Could you please tell me how you installed autotrain?
I'm running it on the AutoTrain UI in Hugging Face Spaces, @abhishekkrthakur
(I chose AutoTrain's Docker template when building the HF Space.)
Same error. It's running on the AutoTrain UI. I removed "mixed_precision": "fp16" since the Space is running on CPU, using the google/gemma model.
Parameters:
{
"block_size": 1024,
"model_max_length": 2048,
"padding": "right",
"use_flash_attention_2": false,
"disable_gradient_checkpointing": false,
"logging_steps": -1,
"evaluation_strategy": "epoch",
"save_total_limit": 1,
"save_strategy": "epoch",
"auto_find_batch_size": false,
"lr": 0.00003,
"epochs": 3,
"batch_size": 2,
"warmup_ratio": 0.1,
"gradient_accumulation": 1,
"optimizer": "adamw_torch",
"scheduler": "linear",
"weight_decay": 0,
"max_grad_norm": 1,
"seed": 42,
"chat_template": "none",
"quantization": "int4",
"target_modules": "all-linear",
"merge_adapter": false,
"peft": true,
"lora_r": 16,
"lora_alpha": 32,
"lora_dropout": 0.05
}
You should not remove any params. If you don't want mixed precision, set it to none:
"mixed_precision": "none"
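For example, a sketch of the relevant slice of the gemma params with the key kept rather than removed (the surrounding keys stay as in the block above):

```json
{
  "auto_find_batch_size": false,
  "mixed_precision": "none",
  "lr": 0.00003
}
```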
Still same error
taking a look!
hello?
Have you tried after that? Some packages were updated this week. Please factory rebuild your AutoTrain Space before trying it.
Still getting the error as of now.
Still same.
I'm using google/gemma-7b, will you try it?
Training data (data.csv):
text
"human: hello \n bot: id-chat hi nice to meet you"
"human: how are you \n bot: id-chat I am fine"
"human: generate an image of a cat \n bot: id-image a cute furry cat"
Column mapping:
{"text": "text"}
I get this same dependency issue, please provide a fix:
❌ ERROR | 2024-03-04 11:17:08 | autotrain.trainers.common:wrapper:91 - train has failed due to an exception: Traceback (most recent call last):
  File "/app/env/lib/python3.10/site-packages/autotrain/trainers/common.py", line 88, in wrapper
    return func(*args, **kwargs)
  File "/app/env/lib/python3.10/site-packages/autotrain/trainers/clm/main.py", line 230, in train
    model = AutoModelForCausalLM.from_pretrained(
  File "/app/env/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 561, in from_pretrained
    return model_class.from_pretrained(
  File "/app/env/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3024, in from_pretrained
    hf_quantizer.validate_environment(
  File "/app/env/lib/python3.10/site-packages/transformers/quantizers/quantizer_bnb_4bit.py", line 62, in validate_environment
    raise ImportError(
ImportError: Using `bitsandbytes` 8-bit quantization requires Accelerate: `pip install accelerate` and the latest version of bitsandbytes: `pip install -i https://pypi.org/simple/ bitsandbytes`
❌ ERROR | 2024-03-04 11:17:08 | autotrain.trainers.common:wrapper:92 - Using `bitsandbytes` 8-bit quantization requires Accelerate: `pip install accelerate` and the latest version of bitsandbytes: `pip install -i https://pypi.org/simple/ bitsandbytes`
🚀 INFO | 2024-03-04 11:17:08 | autotrain.trainers.common:pause_space:49 - Pausing space...
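For anyone hitting this, a quick way to check whether the two packages named in the traceback are even importable in the training environment (diagnostic sketch only; on a CPU-only Space, int4/int8 bitsandbytes quantization will still not work even if both are installed):

```python
# Check whether accelerate and bitsandbytes are importable in this environment.
import importlib.util

for pkg in ("accelerate", "bitsandbytes"):
    found = importlib.util.find_spec(pkg) is not None
    print(f"{pkg}: {'installed' if found else 'MISSING'}")
```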
@abhishekkrthakur I am receiving the same error when running it on Google Colab:
@SyntaxPratyush Same here. @abhishekkrthakur, can you please look into it?
I also encountered the same issue
Someone says we need to downgrade the transformers library to version 4.30 in order to fix this error.
However, GemmaTokenizer needs transformers upgraded to version 4.38...!!
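A quick way to see which side of that constraint your environment is on (Gemma support was added in transformers 4.38):

```python
# Print the installed transformers version; Gemma models need >= 4.38.
import transformers
print(transformers.__version__)
```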
taking a look again.
I spun up a new AutoTrain Space, added an A10G GPU, and I am able to train mistralai/Mistral-7B-v0.1 successfully.
Do you have this issue with a specific GPU or a specific model?
@abhishekkrthakur Could you please show me a detailed tutorial on how to do it with autotrain-advanced, as there are no proper explanations of how to do it? I am having specific issues finding the proper format for train.csv and the column mapping; right now I am getting "Error 500: Check Logs for more Info", and the logs are empty.
@SyntaxPratyush here is a train.csv for the LLM task that you can try: https://github.com/huggingface/autotrain-example-datasets/blob/main/alpaca1k.csv
@abhishekkrthakur column mapping pls
You don't need to change anything in the column mapping if you use that file. Also, let's not hijack this thread, as it's a completely different issue. You can post your queries in the Hugging Face forums and I can help there.
ok thanks
Which GPU did you use?
I have a local Radeon Pro 575 and chose the free CPU at the beginning.
You cannot use PEFT and quantization on CPU. Please select an appropriate GPU, e.g. A10G.
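A minimal sketch of that constraint, assuming the bitsandbytes-backed int4 path from the params above:

```python
# bitsandbytes int4/int8 quantization needs a CUDA GPU; on a CPU-only machine
# the quantization option effectively has to be "none".
import torch

quantization = "int4" if torch.cuda.is_available() else "none"
print(f"quantization: {quantization}")
```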
I'm closing this issue, as it's deviating a lot from the title and the originally reported issue doesn't exist. The error appears because users are trying to train GPU models on a CPU machine.