Comments (16)
So it seems Seq2Seq with Google models such as T5, mT5, etc. is limited to 20 output tokens because of this? I.e., the required params are not passed through?
from autotrain-advanced.
No, that's just validation. It doesn't matter for inference.
> No, that's just validation. It doesn't matter for inference.

So any idea why the inference output is always 20 tokens, while when I train using BART I get 256+?
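For reference, a 20-token cap is what `generate()` in transformers falls back to when no length limit is given (the default `max_length` is 20). A minimal sketch of overriding it at inference time follows. It assumes the fine-tuned checkpoint loads with `AutoModelForSeq2SeqLM`; the helper names `generate_text` and `load_and_generate` and the value 256 are illustrative, not part of autotrain.

```python
def generate_text(model, tokenizer, text, max_new_tokens=256):
    """Generate from a seq2seq model with an explicit output-length budget."""
    # Tokenize the input; the result is a mapping that can be unpacked into generate().
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    # Passing max_new_tokens overrides the library default max_length of 20.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


def load_and_generate(model_id, text, max_new_tokens=256):
    """Load a checkpoint by id or local path (hypothetical helper) and generate once."""
    # Imported lazily so generate_text stays usable without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    return generate_text(model, tokenizer, text, max_new_tokens=max_new_tokens)
```

Calling `load_and_generate("<your-model-repo>", "some input text", max_new_tokens=256)` should then allow up to 256 generated tokens, with the repo id standing in for wherever the trained model was saved.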
It seems the Seq2Seq args aren't being passed through (especially for Google models).
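If the goal is for evaluation during training to generate more than 20 tokens as well, the transformers `Seq2SeqTrainingArguments` class exposes `predict_with_generate` and `generation_max_length` for that. The sketch below shows one way a target length could be forwarded to them. Whether autotrain wires `max_target_length` this way is not confirmed here; `eval_generation_kwargs` and `build_training_args` are hypothetical helper names.

```python
def eval_generation_kwargs(max_target_length):
    """Build the keyword arguments that control generation length during evaluation."""
    if max_target_length < 1:
        raise ValueError("max_target_length must be a positive integer")
    return {
        # Run generate() during evaluation so metrics such as ROUGE see real outputs.
        "predict_with_generate": True,
        # Cap on the generated sequence length used for evaluation.
        "generation_max_length": max_target_length,
    }


def build_training_args(output_dir, max_target_length):
    """Create Seq2SeqTrainingArguments with an explicit evaluation generation length."""
    # Imported lazily so the helper above can be used without transformers installed.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,
        **eval_generation_kwargs(max_target_length),
    )
```

This only affects the generation done for evaluation metrics; generation at inference time is still governed by whatever is passed to `generate()` or stored in the model's generation config.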
taking a look!
Full trace attached. The model still generates at most 20 tokens:
```
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data] Package punkt is already up-to-date!
WARNING Parameters not supplied by user and set to default: push_to_hub, model_ref, auto_find_batch_size, add_eos_token, data_path, lr, project_name, disable_gradient_checkpointing, logging_steps, optimizer, token, seed, lora_dropout, lora_r, rejected_text_column, batch_size, prompt_text_column, model_max_length, weight_decay, max_grad_norm, merge_adapter, gradient_accumulation, use_flash_attention_2, scheduler, valid_split, trainer, text_column, username, repo_id, lora_alpha, model, save_strategy, warmup_ratio, evaluation_strategy, save_total_limit, train_split, dpo_beta
WARNING Parameters not supplied by user and set to default: batch_size, epochs, log, weight_decay, max_grad_norm, auto_find_batch_size, max_seq_length, gradient_accumulation, scheduler, data_path, lr, valid_split, text_column, username, project_name, target_column, repo_id, logging_steps, optimizer, token, model, save_strategy, seed, warmup_ratio, save_total_limit, evaluation_strategy, push_to_hub, train_split
WARNING Parameters not supplied by user and set to default: batch_size, epochs, log, weight_decay, max_grad_norm, auto_find_batch_size, gradient_accumulation, scheduler, data_path, lr, username, valid_split, image_column, project_name, repo_id, target_column, logging_steps, optimizer, token, model, save_strategy, seed, warmup_ratio, save_total_limit, evaluation_strategy, push_to_hub, train_split
WARNING Parameters supplied but not used: target_modules
WARNING Parameters not supplied by user and set to default: epochs, auto_find_batch_size, data_path, lr, project_name, logging_steps, token, optimizer, seed, lora_dropout, lora_r, target_modules, max_target_length, batch_size, weight_decay, max_grad_norm, max_seq_length, gradient_accumulation, scheduler, username, valid_split, text_column, target_column, repo_id, lora_alpha, model, save_strategy, warmup_ratio, evaluation_strategy, save_total_limit, train_split, peft, push_to_hub, quantization
WARNING Parameters not supplied by user and set to default: id_column, categorical_columns, num_trials, numerical_columns, data_path, username, valid_split, repo_id, project_name, task, token, target_columns, model, seed, train_split, push_to_hub, time_limit
WARNING Parameters not supplied by user and set to default: resume_from_checkpoint, epochs, lr_power, tokenizer_max_length, validation_images, adam_beta1, num_cycles, num_class_images, project_name, token, pre_compute_text_embeddings, sample_batch_size, allow_tf32, xl, num_validation_images, seed, scale_lr, validation_epochs, checkpoints_total_limit, class_prompt, revision, rank, adam_weight_decay, prior_preservation, class_image_path, prior_loss_weight, max_grad_norm, adam_epsilon, scheduler, username, tokenizer, text_encoder_use_attention_mask, image_path, dataloader_num_workers, repo_id, class_labels_conditioning, prior_generation_precision, model, adam_beta2, validation_prompt, local_rank, checkpointing_steps, center_crop, push_to_hub, logging, warmup_steps
WARNING Parameters not supplied by user and set to default: tags_column, batch_size, epochs, log, tokens_column, weight_decay, max_grad_norm, auto_find_batch_size, max_seq_length, gradient_accumulation, scheduler, data_path, lr, valid_split, username, repo_id, project_name, logging_steps, optimizer, token, model, save_strategy, seed, warmup_ratio, save_total_limit, evaluation_strategy, push_to_hub, train_split
INFO AutoTrain Public URL: NgrokTunnel: "https://b320-34-72-237-89.ngrok-free.app/" -> "http://localhost:7860/"
INFO Please wait for the app to load...
INFO ***
INFO: Started server process [7599]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://127.0.0.1:7860/ (Press CTRL+C to quit)
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET / HTTP/1.1" 200 OK
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /logo.png HTTP/1.1" 200 OK
INFO Task: llm:sft
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /params/llm%3Asft HTTP/1.1" 200 OK
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /model_choices/llm%3Asft HTTP/1.1" 200 OK
INFO Running jobs: []
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /favicon.ico HTTP/1.1" 404 Not Found
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK
INFO Running jobs: []
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO Task: seq2seq
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /params/seq2seq HTTP/1.1" 200 OK
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /model_choices/seq2seq HTTP/1.1" 200 OK
INFO Running jobs: []
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK
INFO Running jobs: []
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK
INFO Running jobs: []
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO Running jobs: []
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK
INFO Running jobs: []
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO hardware: Local
INFO Running jobs: []
INFO Task: seq2seq
INFO Column mapping: {'text': 'text', 'label': 'target'}
INFO Dataset: autotrain-gsxqu-k795g (seq2seq)
Train data: [<tempfile.SpooledTemporaryFile object at 0x7e16bada5b40>]
Valid data: []
Column mapping: {'text': 'text', 'label': 'target'}
Saving the dataset (1/1 shards): 100% 800/800 [00:00<00:00, 238160.49 examples/s]
Saving the dataset (1/1 shards): 100% 200/200 [00:00<00:00, 86072.32 examples/s]
WARNING Parameters not supplied by user and set to default: train_split
WARNING Parameters supplied but not used: model_max_length, max_length, max_new_tokens
INFO Starting local training...
INFO {"data_path":"autotrain-gsxqu-k795g/autotrain-data","model":"google-t5/t5-base","username":"tombenj","seed":42,"train_split":"train","valid_split":"validation","project_name":"autotrain-gsxqu-k795g","token":"*****","push_to_hub":true,"text_column":"autotrain_text","target_column":"autotrain_label","repo_id":"tombenj/autotrain-gsxqu-k795g","lr":0.00005,"epochs":1,"max_seq_length":1024,"max_target_length":1024,"batch_size":8,"warmup_ratio":0.1,"gradient_accumulation":1,"optimizer":"adamw_torch","scheduler":"linear","weight_decay":0.0,"max_grad_norm":1.0,"logging_steps":-1,"evaluation_strategy":"epoch","auto_find_batch_size":false,"mixed_precision":"fp16","save_total_limit":1,"save_strategy":"epoch","peft":false,"quantization":null,"lora_r":16,"lora_alpha":32,"lora_dropout":0.05,"target_modules":["all-linear"]}
INFO ['accelerate', 'launch', '--num_machines', '1', '--num_processes', '1', '--mixed_precision', 'fp16', '-m', 'autotrain.trainers.seq2seq', '--training_config', 'autotrain-gsxqu-k795g/training_params.json']
INFO Training PID: 7899
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "POST /create_project HTTP/1.1" 200 OK
The following values were not passed to `accelerate launch` and had defaults used instead:
    `--dynamo_backend` was set to a value of `'no'`
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
🚀 INFO | 2024-03-13 09:15:12 | __main__:train:45 - Starting training...
🚀 INFO | 2024-03-13 09:15:12 | __main__:train:46 - Training config: {'data_path': 'autotrain-gsxqu-k795g/autotrain-data', 'model': 'google-t5/t5-base', 'username': 'tombenj', 'seed': 42, 'train_split': 'train', 'valid_split': 'validation', 'project_name': 'autotrain-gsxqu-k795g', 'token': '*****', 'push_to_hub': True, 'text_column': 'autotrain_text', 'target_column': 'autotrain_label', 'repo_id': 'tombenj/autotrain-gsxqu-k795g', 'lr': 5e-05, 'epochs': 1, 'max_seq_length': 1024, 'max_target_length': 1024, 'batch_size': 8, 'warmup_ratio': 0.1, 'gradient_accumulation': 1, 'optimizer': 'adamw_torch', 'scheduler': 'linear', 'weight_decay': 0.0, 'max_grad_norm': 1.0, 'logging_steps': -1, 'evaluation_strategy': 'epoch', 'auto_find_batch_size': False, 'mixed_precision': 'fp16', 'save_total_limit': 1, 'save_strategy': 'epoch', 'peft': False, 'quantization': None, 'lora_r': 16, 'lora_alpha': 32, 'lora_dropout': 0.05, 'target_modules': ['all-linear']}
🚀 INFO | 2024-03-13 09:15:12 | __main__:train:53 - loading dataset from disk
🚀 INFO | 2024-03-13 09:15:12 | __main__:train:64 - loading dataset from disk
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
/usr/local/lib/python3.10/dist-packages/transformers/models/t5/tokenization_t5_fast.py:171: FutureWarning: This tokenizer was incorrectly instantiated with a model max length of 512 which will be corrected in Transformers v5.
For now, this behavior is kept to avoid breaking backwards compatibility when padding/encoding with `truncation is True`.
- Be aware that you SHOULD NOT rely on google-t5/t5-base automatically truncating your input to 512 when padding/encoding.
- If you want to encode/pad to sequences longer than 512 you can either instantiate this tokenizer with `model_max_length` or pass `max_length` when encoding/padding.
- To avoid this warning, please instantiate this tokenizer with `model_max_length` set to your preferred value.
  warnings.warn(
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK
{'loss': 10.2023, 'grad_norm': 29.899457931518555, 'learning_rate': 1.5e-05, 'epoch': 0.05}
8% 8/100 [00:04<00:39, 2.33it/s]
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
{'loss': 6.8607, 'grad_norm': 57.67634963989258, 'learning_rate': 3.5e-05, 'epoch': 0.1}
{'loss': 2.1809, 'grad_norm': 4.584061622619629, 'learning_rate': 4.888888888888889e-05, 'epoch': 0.15}
{'loss': 1.2057, 'grad_norm': 3.255554437637329, 'learning_rate': 4.6111111111111115e-05, 'epoch': 0.2}
20% 20/100 [00:09<00:34, 2.35it/s]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
{'loss': 0.8056, 'grad_norm': 2.723172426223755, 'learning_rate': 4.3333333333333334e-05, 'epoch': 0.25}
{'loss': 0.5849, 'grad_norm': 1.7794570922851562, 'learning_rate': 4.055555555555556e-05, 'epoch': 0.3}
32% 32/100 [00:14<00:28, 2.36it/s]
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
{'loss': 0.418, 'grad_norm': 1.8489391803741455, 'learning_rate': 3.777777777777778e-05, 'epoch': 0.35}
{'loss': 0.3179, 'grad_norm': 1.1671098470687866, 'learning_rate': 3.5e-05, 'epoch': 0.4}
44% 44/100 [00:19<00:23, 2.36it/s]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
{'loss': 0.2652, 'grad_norm': 1.281832218170166, 'learning_rate': 3.222222222222223e-05, 'epoch': 0.45}
{'loss': 0.2405, 'grad_norm': 0.9086970686912537, 'learning_rate': 2.9444444444444448e-05, 'epoch': 0.5}
{'loss': 0.2329, 'grad_norm': 1.1303473711013794, 'learning_rate': 2.6666666666666667e-05, 'epoch': 0.55}
55% 55/100 [00:24<00:19, 2.32it/s]
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
{'loss': 0.2199, 'grad_norm': 1.0673601627349854, 'learning_rate': 2.3888888888888892e-05, 'epoch': 0.6}
{'loss': 0.2137, 'grad_norm': 0.8874663710594177, 'learning_rate': 2.111111111111111e-05, 'epoch': 0.65}
67% 67/100 [00:29<00:14, 2.35it/s]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
{'loss': 0.1876, 'grad_norm': 0.7264275550842285, 'learning_rate': 1.8333333333333333e-05, 'epoch': 0.7}
{'loss': 0.1834, 'grad_norm': 0.8036168217658997, 'learning_rate': 1.5555555555555555e-05, 'epoch': 0.75}
78% 78/100 [00:34<00:09, 2.30it/s]
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
{'loss': 0.1952, 'grad_norm': 0.6779347062110901, 'learning_rate': 1.2777777777777777e-05, 'epoch': 0.8}
{'loss': 0.1827, 'grad_norm': 0.7915838360786438, 'learning_rate': 1e-05, 'epoch': 0.85}
{'loss': 0.187, 'grad_norm': 0.8215489387512207, 'learning_rate': 7.222222222222222e-06, 'epoch': 0.9}
{'loss': 0.1734, 'grad_norm': 0.7938928604125977, 'learning_rate': 4.444444444444445e-06, 'epoch': 0.95}
{'loss': 0.172, 'grad_norm': 0.6198846697807312, 'learning_rate': 1.6666666666666667e-06, 'epoch': 1.0}
100% 100/100 [00:43<00:00, 2.36it/s]
/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py:1178: UserWarning: Using the model-agnostic default `max_length` (=20) to control the generation length. We recommend setting `max_new_tokens` to control the maximum length of the generation.
  warnings.warn(
0% 0/13 [00:00<?, ?it/s]
15% 2/13 [00:00<00:05, 2.07it/s]
23% 3/13 [00:02<00:07, 1.36it/s]
31% 4/13 [00:02<00:07, 1.25it/s]
38% 5/13 [00:03<00:06, 1.23it/s]
46% 6/13 [00:04<00:05, 1.20it/s]
54% 7/13 [00:05<00:05, 1.19it/s]
62% 8/13 [00:06<00:04, 1.18it/s]
69% 9/13 [00:07<00:03, 1.17it/s]
77% 10/13 [00:08<00:02, 1.16it/s]
85% 11/13 [00:09<00:01, 1.15it/s]
92% 12/13 [00:09<00:00, 1.15it/s]
{'eval_loss': 0.1557321548461914, 'eval_rouge1': 14.9506, 'eval_rouge2': 12.1047, 'eval_rougeL': 14.938, 'eval_rougeLsum': 14.9251, 'eval_gen_len': 19.0, 'eval_runtime': 12.0496, 'eval_samples_per_second': 16.598, 'eval_steps_per_second': 1.079, 'epoch': 1.0}
100% 100/100 [00:55<00:00, 2.36it/s]
100% 13/13 [00:11<00:00, 1.25it/s]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
There were missing keys in the checkpoint model loaded: ['encoder.embed_tokens.weight', 'decoder.embed_tokens.weight', 'lm_head.weight'].
{'train_runtime': 78.2901, 'train_samples_per_second': 10.218, 'train_steps_per_second': 1.277, 'train_loss': 1.2514769697189332, 'epoch': 1.0}
100% 100/100 [01:18<00:00, 1.28it/s]
🚀 INFO | 2024-03-13 09:16:38 | __main__:train:204 - Finished training, saving model...
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK
/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py:1178: UserWarning: Using the model-agnostic default `max_length` (=20) to control the generation length. We recommend setting `max_new_tokens` to control the maximum length of the generation.
  warnings.warn(
15% 2/13 [00:01<00:05, 1.92it/s]
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
54% 7/13 [00:05<00:05, 1.19it/s]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
100% 13/13 [00:11<00:00, 1.18it/s]
🚀 INFO | 2024-03-13 09:16:54 | __main__:train:218 - Pushing model to hub...
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
model.safetensors: 0% 0.00/892M [00:00<?, ?B/s]
rng_state.pth: 0% 0.00/14.2k [00:00<?, ?B/s]
optimizer.pt: 0% 0.00/1.78G [00:00<?, ?B/s]
spiece.model: 0% 0.00/792k [00:00<?, ?B/s]
Upload 11 LFS files: 0% 0/11 [00:00<?, ?it/s]
scheduler.pt: 0% 0.00/1.06k [00:00<?, ?B/s]
optimizer.pt: 0% 16.4k/1.78G [00:00<10:59:15, 45.1kB/s]
spiece.model: 2% 16.4k/792k [00:00<00:17, 45.2kB/s]
model.safetensors: 0% 16.4k/892M [00:00<5:48:26, 42.6kB/s]
scheduler.pt: 100% 1.06k/1.06k [00:00<00:00, 2.36kB/s]
rng_state.pth: 100% 14.2k/14.2k [00:00<00:00, 28.2kB/s]
spiece.model: 100% 792k/792k [00:00<00:00, 1.16MB/s]
optimizer.pt: 1% 16.0M/1.78G [00:00<01:03, 28.0MB/s]
optimizer.pt: 1% 25.4M/1.78G [00:00<00:41, 42.6MB/s]
model.safetensors: 2% 16.0M/892M [00:00<00:39, 22.0MB/s]
training_args.bin: 100% 5.05k/5.05k [00:00<00:00, 86.1kB/s]
model.safetensors: 3% 22.9M/892M [00:01<00:27, 31.1MB/s]
events.out.tfevents.1710321319.a7547a852de5.7919.0: 100% 10.6k/10.6k [00:00<00:00, 91.3kB/s]
events.out.tfevents.1710321414.a7547a852de5.7919.1: 0% 0.00/603 [00:00<?, ?B/s]
model.safetensors: 1% 8.21M/892M [00:00<00:25, 35.0MB/s]
events.out.tfevents.1710321414.a7547a852de5.7919.1: 100% 603/603 [00:00<00:00, 5.20kB/s]
model.safetensors: 2% 14.6M/892M [00:00<00:19, 45.0MB/s]
spiece.model: 0% 0.00/792k [00:00<?, ?B/s]
model.safetensors: 4% 32.0M/892M [00:01<00:34, 25.0MB/s]
model.safetensors: 2% 19.2M/892M [00:00<00:29, 29.3MB/s]
spiece.model: 100% 792k/792k [00:00<00:00, 2.65MB/s]
model.safetensors: 5% 40.8M/892M [00:01<00:25, 33.7MB/s]
training_args.bin: 100% 5.05k/5.05k [00:00<00:00, 65.7kB/s]
model.safetensors: 3% 24.2M/892M [00:00<00:27, 31.7MB/s]
model.safetensors: 5% 46.4M/892M [00:01<00:23, 35.8MB/s]
model.safetensors: 3% 28.2M/892M [00:00<00:25, 33.6MB/s]
optimizer.pt: 3% 61.0M/1.78G [00:01<00:41, 41.5MB/s]
model.safetensors: 6% 51.0M/892M [00:01<00:27, 30.7MB/s]
model.safetensors: 7% 62.1M/892M [00:02<00:18, 45.0MB/s]
optimizer.pt: 4% 73.6M/1.78G [00:02<00:41, 41.7MB/s]
model.safetensors: 8% 68.1M/892M [00:02<00:26, 31.5MB/s]
model.safetensors: 4% 39.4M/892M [00:01<00:44, 19.1MB/s]
model.safetensors: 9% 79.0M/892M [00:02<00:19, 42.3MB/s]
optimizer.pt: 5% 87.1M/1.78G [00:02<00:47, 35.9MB/s]
model.safetensors: 5% 44.3M/892M [00:01<00:36, 23.2MB/s]
model.safetensors: 10% 84.9M/892M [00:02<00:24, 32.9MB/s]
model.safetensors: 11% 94.4M/892M [00:02<00:19, 41.0MB/s]
optimizer.pt: 6% 101M/1.78G [00:02<00:46, 36.2MB/s]
model.safetensors: 6% 54.1M/892M [00:02<00:32, 25.7MB/s]
optimizer.pt: 6% 107M/1.78G [00:03<00:42, 39.8MB/s]
model.safetensors: 11% 100M/892M [00:03<00:24, 33.0MB/s]
model.safetensors: 13% 112M/892M [00:03<00:17, 45.7MB/s]
model.safetensors: 13% 118M/892M [00:03<00:21, 35.5MB/s]
model.safetensors: 14% 127M/892M [00:03<00:17, 43.4MB/s]
optimizer.pt: 7% 128M/1.78G [00:03<00:50, 32.6MB/s]
model.safetensors: 8% 72.9M/892M [00:02<00:33, 24.5MB/s]
optimizer.pt: 8% 135M/1.78G [00:03<00:42, 38.6MB/s]
model.safetensors: 15% 133M/892M [00:04<00:23, 32.3MB/s]
model.safetensors: 9% 80.0M/892M [00:03<00:34, 23.6MB/s]
optimizer.pt: 8% 147M/1.78G [00:04<00:45, 35.8MB/s]
model.safetensors: 10% 88.4M/892M [00:03<00:25, 31.3MB/s]
model.safetensors: 16% 144M/892M [00:04<00:20, 36.0MB/s]
model.safetensors: 11% 95.0M/892M [00:03<00:21, 36.8MB/s]
model.safetensors: 18% 160M/892M [00:04<00:15, 45.9MB/s]
model.safetensors: 11% 100M/892M [00:03<00:26, 30.3MB/s]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK
optimizer.pt: 9% 162M/1.78G [00:04<00:52, 30.8MB/s]
model.safetensors: 12% 111M/892M [00:03<00:17, 44.5MB/s]
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
model.safetensors: 19% 165M/892M [00:04<00:20, 36.3MB/s]
model.safetensors: 19% 172M/892M [00:05<00:18, 39.3MB/s]
model.safetensors: 14% 124M/892M [00:04<00:19, 40.3MB/s]
model.safetensors: 20% 177M/892M [00:05<00:21, 33.7MB/s]
model.safetensors: 21% 187M/892M [00:05<00:15, 46.4MB/s]
model.safetensors: 14% 129M/892M [00:04<00:22, 34.2MB/s]
optimizer.pt: 11% 190M/1.78G [00:05<00:41, 38.1MB/s]
model.safetensors: 15% 135M/892M [00:04<00:20, 36.4MB/s]
model.safetensors: 22% 193M/892M [00:05<00:19, 35.6MB/s]
model.safetensors: 23% 204M/892M [00:05<00:14, 47.1MB/s]
optimizer.pt: 11% 201M/1.78G [00:05<00:42, 37.3MB/s]
model.safetensors: 17% 149M/892M [00:04<00:21, 35.1MB/s]
model.safetensors: 24% 210M/892M [00:05<00:16, 41.0MB/s]
model.safetensors: 17% 155M/892M [00:05<00:19, 38.1MB/s]
model.safetensors: 24% 215M/892M [00:06<00:15, 42.6MB/s]
model.safetensors: 25% 221M/892M [00:06<00:15, 43.3MB/s]
model.safetensors: 18% 160M/892M [00:05<00:22, 33.0MB/s]
model.safetensors: 18% 164M/892M [00:05<00:21, 34.2MB/s]
optimizer.pt: 12% 222M/1.78G [00:06<00:39, 40.0MB/s]
model.safetensors: 19% 172M/892M [00:05<00:16, 44.2MB/s]
model.safetensors: 25% 226M/892M [00:06<00:24, 27.1MB/s]
model.safetensors: 20% 177M/892M [00:05<00:20, 35.6MB/s]
model.safetensors: 26% 232M/892M [00:06<00:21, 30.7MB/s]
model.safetensors: 27% 237M/892M [00:06<00:19, 32.8MB/s]
model.safetensors: 21% 190M/892M [00:05<00:16, 42.8MB/s]
model.safetensors: 27% 241M/892M [00:07<00:22, 29.1MB/s]
model.safetensors: 22% 195M/892M [00:06<00:18, 37.1MB/s]
model.safetensors: 28% 247M/892M [00:07<00:19, 33.4MB/s]
model.safetensors: 28% 252M/892M [00:07<00:17, 36.9MB/s]
model.safetensors: 23% 207M/892M [00:06<00:16, 41.6MB/s]
optimizer.pt: 15% 260M/1.78G [00:07<00:42, 35.8MB/s]
model.safetensors: 29% 256M/892M [00:07<00:21, 29.4MB/s]
model.safetensors: 30% 264M/892M [00:07<00:16, 37.3MB/s]
model.safetensors: 30% 271M/892M [00:07<00:14, 43.1MB/s]
model.safetensors: 25% 223M/892M [00:06<00:15, 41.9MB/s]
optimizer.pt: 15% 276M/1.78G [00:07<00:42, 35.1MB/s]
optimizer.pt: 16% 287M/1.78G [00:07<00:32, 46.5MB/s]
model.safetensors: 31% 276M/892M [00:08<00:20, 30.1MB/s]
model.safetensors: 32% 282M/892M [00:08<00:17, 34.5MB/s]
optimizer.pt: 16% 294M/1.78G [00:08<00:39, 37.4MB/s]
optimizer.pt: 17% 301M/1.78G [00:08<00:34, 43.0MB/s]
model.safetensors: 27% 244M/892M [00:07<00:18, 34.7MB/s]
model.safetensors: 32% 288M/892M [00:08<00:22, 27.3MB/s]
model.safetensors: 34% 300M/892M [00:08<00:13, 42.9MB/s]
optimizer.pt: 17% 312M/1.78G [00:08<00:39, 36.8MB/s]
model.safetensors: 29% 262M/892M [00:07<00:16, 38.4MB/s]
model.safetensors: 34% 306M/892M [00:08<00:16, 35.2MB/s]
model.safetensors: 35% 313M/892M [00:08<00:14, 39.8MB/s]
optimizer.pt: 18% 325M/1.78G [00:09<00:42, 34.5MB/s]
model.safetensors: 31% 275M/892M [00:08<00:15, 39.0MB/s]
optimizer.pt: 19% 331M/1.78G [00:09<00:36, 39.6MB/s]
model.safetensors: 36% 320M/892M [00:09<00:17, 32.5MB/s]
model.safetensors: 37% 329M/892M [00:09<00:13, 41.9MB/s]
optimizer.pt: 19% 336M/1.78G [00:09<00:44, 32.5MB/s]
model.safetensors: 33% 290M/892M [00:08<00:16, 36.7MB/s]
optimizer.pt: 19% 344M/1.78G [00:09<00:35, 40.7MB/s]
model.safetensors: 33% 296M/892M [00:08<00:15, 37.5MB/s]
optimizer.pt: 20% 350M/1.78G [00:09<00:33, 43.0MB/s]
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
model.safetensors: 38% 336M/892M [00:09<00:16, 34.6MB/s]
model.safetensors: 39% 344M/892M [00:09<00:13, 40.6MB/s]
model.safetensors: 39% 350M/892M [00:09<00:12, 44.7MB/s]
optimizer.pt: 20% 363M/1.78G [00:09<00:32, 43.2MB/s]
model.safetensors: 35% 308M/892M [00:09<00:18, 31.1MB/s]
model.safetensors: 40% 356M/892M [00:10<00:14, 36.4MB/s]
model.safetensors: 41% 363M/892M [00:10<00:12, 43.2MB/s]
model.safetensors: 41% 368M/892M [00:10<00:15, 33.7MB/s]
optimizer.pt: 22% 384M/1.78G [00:10<00:37, 36.9MB/s]
model.safetensors: 43% 382M/892M [00:10<00:09, 51.9MB/s]
model.safetensors: 37% 327M/892M [00:09<00:20, 27.3MB/s]
optimizer.pt: 22% 391M/1.78G [00:10<00:34, 40.6MB/s]
model.safetensors: 37% 332M/892M [00:09<00:18, 30.6MB/s]
model.safetensors: 44% 389M/892M [00:10<00:13, 37.6MB/s]
model.safetensors: 38% 336M/892M [00:10<00:20, 27.1MB/s]
optimizer.pt: 23% 403M/1.78G [00:10<00:37, 36.6MB/s]
model.safetensors: 38% 343M/892M [00:10<00:15, 34.8MB/s]
optimizer.pt: 23% 407M/1.78G [00:11<00:35, 38.5MB/s]
model.safetensors: 39% 350M/892M [00:10<00:13, 40.0MB/s]
model.safetensors: 46% 411M/892M [00:11<00:10, 45.1MB/s]
optimizer.pt: 23% 419M/1.78G [00:11<00:38, 35.8MB/s]
optimizer.pt: 24% 428M/1.78G [00:11<00:29, 46.6MB/s]
model.safetensors: 47% 417M/892M [00:11<00:11, 40.7MB/s]
model.safetensors: 47% 423M/892M [00:11<00:10, 43.9MB/s]
optimizer.pt: 24% 433M/1.78G [00:11<00:36, 37.5MB/s]
model.safetensors: 48% 431M/892M [00:11<00:09, 47.9MB/s]
optimizer.pt: 25% 439M/1.78G [00:11<00:33, 39.6MB/s]
model.safetensors: 49% 436M/892M [00:12<00:12, 37.6MB/s]
model.safetensors: 50% 445M/892M [00:12<00:10, 44.3MB/s]
optimizer.pt: 25% 448M/1.78G [00:12<00:38, 34.7MB/s]
model.safetensors: 43% 382M/892M [00:11<00:12, 39.7MB/s]
model.safetensors: 50% 450M/892M [00:12<00:11, 38.9MB/s]
model.safetensors: 51% 457M/892M [00:12<00:09, 44.5MB/s]
model.safetensors: 52% 464M/892M [00:12<00:08, 47.6MB/s]
optimizer.pt: 26% 464M/1.78G [00:12<00:36, 36.4MB/s]
model.safetensors: 45% 397M/892M [00:11<00:12, 38.9MB/s]
model.safetensors: 54% 478M/892M [00:12<00:08, 48.0MB/s]
optimizer.pt: 27% 480M/1.78G [00:12<00:33, 38.8MB/s]
model.safetensors: 45% 401M/892M [00:12<00:20, 23.4MB/s]
optimizer.pt: 28% 494M/1.78G [00:13<00:22, 56.3MB/s]
model.safetensors: 54% 484M/892M [00:13<00:11, 35.4MB/s]
model.safetensors: 55% 489M/892M [00:13<00:10, 36.8MB/s]
model.safetensors: 55% 494M/892M [00:13<00:09, 40.2MB/s]
model.safetensors: 47% 416M/892M [00:12<00:17, 27.4MB/s]
model.safetensors: 56% 499M/892M [00:13<00:11, 35.0MB/s]
model.safetensors: 56% 503M/892M [00:13<00:10, 35.7MB/s]
optimizer.pt: 29% 519M/1.78G [00:13<00:30, 41.4MB/s]
model.safetensors: 57% 510M/892M [00:13<00:09, 41.7MB/s]
optimizer.pt: 30% 526M/1.78G [00:13<00:29, 43.2MB/s]
model.safetensors: 58% 515M/892M [00:13<00:09, 38.8MB/s]
model.safetensors: 50% 446M/892M [00:13<00:10, 42.4MB/s]
model.safetensors: 58% 519M/892M [00:14<00:09, 38.2MB/s]
model.safetensors: 59% 524M/892M [00:14<00:09, 38.0MB/s]
model.safetensors: 51% 451M/892M [00:13<00:12, 36.7MB/s]
model.safetensors: 51% 456M/892M [00:13<00:11, 36.6MB/s]
model.safetensors: 59% 528M/892M [00:14<00:12, 28.4MB/s]
model.safetensors: 60% 536M/892M [00:14<00:09, 37.8MB/s]
model.safetensors: 61% 542M/892M [00:14<00:08, 42.2MB/s]
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
model.safetensors: 53% 472M/892M [00:13<00:10, 39.9MB/s]
optimizer.pt: 31% 561M/1.78G [00:14<00:33, 37.0MB/s]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK
model.safetensors: 62% 553M/892M [00:15<00:09, 36.8MB/s]
model.safetensors: 54% 480M/892M [00:14<00:12, 33.8MB/s]
model.safetensors: 55% 489M/892M [00:14<00:09, 42.9MB/s]
optimizer.pt: 32% 576M/1.78G [00:15<00:32, 36.8MB/s]
model.safetensors: 63% 560M/892M [00:15<00:11, 27.9MB/s]
model.safetensors: 56% 496M/892M [00:14<00:10, 37.6MB/s]
model.safetensors: 57% 505M/892M [00:14<00:08, 45.8MB/s]
model.safetensors: 64% 571M/892M [00:15<00:08, 38.3MB/s]
model.safetensors: 57% 511M/892M [00:14<00:08, 45.6MB/s]
model.safetensors: 65% 576M/892M [00:15<00:10, 30.9MB/s]
model.safetensors: 66% 586M/892M [00:15<00:07, 41.9MB/s]
model.safetensors: 59% 523M/892M [00:15<00:08, 41.8MB/s]
model.safetensors: 66% 592M/892M [00:16<00:08, 35.7MB/s]
optimizer.pt: 35% 621M/1.78G [00:16<00:24, 46.8MB/s]
model.safetensors: 67% 599M/892M [00:16<00:07, 41.6MB/s]
model.safetensors: 68% 606M/892M [00:16<00:06, 45.2MB/s]
optimizer.pt: 35% 628M/1.78G [00:16<00:30, 37.7MB/s]
model.safetensors: 69% 612M/892M [00:16<00:06, 40.9MB/s]
model.safetensors: 69% 616M/892M [00:16<00:06, 42.2MB/s]
model.safetensors: 70% 624M/892M [00:16<00:06, 39.4MB/s]
model.safetensors: 72% 639M/892M [00:17<00:04, 52.9MB/s]
optimizer.pt: 37% 657M/1.78G [00:17<00:29, 38.7MB/s]
model.safetensors: 72% 645M/892M [00:17<00:05, 43.7MB/s]
optimizer.pt: 38% 669M/1.78G [00:17<00:22, 49.2MB/s]
model.safetensors: 73% 650M/892M [00:17<00:05, 40.9MB/s]
optimizer.pt: 38% 675M/1.78G [00:17<00:25, 42.8MB/s]
model.safetensors: 74% 656M/892M [00:17<00:06, 36.6MB/s]
model.safetensors: 74% 663M/892M [00:17<00:05, 44.0MB/s]
model.safetensors: 63% 560M/892M [00:16<00:15, 21.6MB/s]
model.safetensors: 75% 669M/892M [00:17<00:05, 44.5MB/s]
model.safetensors: 64% 567M/892M [00:16<00:12, 26.8MB/s]
model.safetensors: 65% 576M/892M [00:17<00:08, 37.4MB/s]
model.safetensors: 76% 681M/892M [00:18<00:04, 46.4MB/s]
model.safetensors: 77% 686M/892M [00:18<00:04, 47.3MB/s]
model.safetensors: 78% 691M/892M [00:18<00:04, 41.6MB/s]
model.safetensors: 65% 581M/892M [00:17<00:13, 22.3MB/s]
model.safetensors: 79% 700M/892M [00:18<00:04, 43.0MB/s]
optimizer.pt: 40% 717M/1.78G [00:18<00:26, 39.9MB/s]
model.safetensors: 66% 585M/892M [00:17<00:13, 23.5MB/s]
model.safetensors: 66% 591M/892M [00:17<00:10, 28.2MB/s]
model.safetensors: 79% 705M/892M [00:18<00:06, 29.3MB/s]
model.safetensors: 67% 595M/892M [00:18<00:11, 26.0MB/s]
model.safetensors: 80% 711M/892M [00:19<00:05, 35.2MB/s]
model.safetensors: 80% 715M/892M [00:19<00:04, 36.6MB/s]
optimizer.pt: 41% 734M/1.78G [00:19<00:26, 39.4MB/s]
model.safetensors: 68% 605M/892M [00:18<00:09, 31.6MB/s]
model.safetensors: 81% 720M/892M [00:19<00:06, 26.5MB/s]
model.safetensors: 68% 609M/892M [00:18<00:10, 26.6MB/s]
optimizer.pt: 42% 749M/1.78G [00:19<00:24, 41.8MB/s]
model.safetensors: 82% 727M/892M [00:19<00:05, 32.5MB/s]
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
model.safetensors: 82% 733M/892M [00:19<00:04, 35.3MB/s]
optimizer.pt: 42% 754M/1.78G [00:19<00:35, 29.3MB/s]
model.safetensors: 83% 737M/892M [00:19<00:05, 29.9MB/s]
model.safetensors: 83% 742M/892M [00:20<00:04, 34.3MB/s]
model.safetensors: 84% 748M/892M [00:20<00:03, 38.1MB/s]
model.safetensors: 71% 637M/892M [00:19<00:06, 38.2MB/s]
optimizer.pt: 43% 770M/1.78G [00:20<00:30, 33.0MB/s]
model.safetensors: 84% 752M/892M [00:20<00:04, 28.8MB/s]
model.safetensors: 85% 760M/892M [00:20<00:03, 39.1MB/s]
model.safetensors: 86% 766M/892M [00:20<00:03, 41.7MB/s]
model.safetensors: 73% 654M/892M [00:19<00:05, 40.2MB/s]
model.safetensors: 86% 771M/892M [00:20<00:03, 37.9MB/s]
model.safetensors: 87% 777M/892M [00:20<00:02, 42.4MB/s]
model.safetensors: 88% 782M/892M [00:20<00:02, 43.7MB/s]
model.safetensors: 74% 664M/892M [00:20<00:06, 34.3MB/s]
model.safetensors: 75% 672M/892M [00:20<00:05, 43.8MB/s]
model.safetensors: 88% 787M/892M [00:21<00:03, 34.4MB/s]
model.safetensors: 89% 793M/892M [00:21<00:02, 41.9MB/s]
model.safetensors: 90% 799M/892M [00:21<00:02, 44.6MB/s]
optimizer.pt: 46% 819M/1.78G [00:21<00:26, 36.9MB/s]
model.safetensors: 77% 685M/892M [00:20<00:05, 38.7MB/s]
optimizer.pt: 46% 825M/1.78G [00:21<00:24, 39.8MB/s]
model.safetensors: 77% 690M/892M [00:20<00:06, 33.2MB/s]
model.safetensors: 79% 702M/892M [00:20<00:03, 50.1MB/s]
optimizer.pt: 47% 832M/1.78G [00:21<00:28, 33.1MB/s]
optimizer.pt: 47% 846M/1.78G [00:21<00:18, 50.3MB/s]
model.safetensors: 79% 709M/892M [00:21<00:04, 42.4MB/s]
model.safetensors: 81% 720M/892M [00:21<00:03, 56.4MB/s]
optimizer.pt: 48% 853M/1.78G [00:22<00:21, 42.9MB/s]
optimizer.pt: 48% 864M/1.78G [00:22<00:23, 39.3MB/s]
model.safetensors: 82% 727M/892M [00:21<00:04, 34.0MB/s]
optimizer.pt: 49% 873M/1.78G [00:22<00:19, 45.6MB/s]
model.safetensors: 82% 733M/892M [00:21<00:04, 36.3MB/s]
model.safetensors: 91% 816M/892M [00:22<00:04, 18.3MB/s]
model.safetensors: 83% 738M/892M [00:22<00:05, 29.4MB/s]
optimizer.pt: 50% 885M/1.78G [00:22<00:22, 39.9MB/s]
model.safetensors: 83% 744M/892M [00:22<00:04, 34.3MB/s]
optimizer.pt: 50% 891M/1.78G [00:23<00:21, 41.9MB/s]
model.safetensors: 92% 822M/892M [00:23<00:03, 18.0MB/s]
optimizer.pt: 50% 896M/1.78G [00:23<00:26, 33.0MB/s]
model.safetensors: 93% 832M/892M [00:23<00:02, 26.5MB/s]
optimizer.pt: 51% 902M/1.78G [00:23<00:23, 37.4MB/s]
model.safetensors: 85% 759M/892M [00:22<00:03, 35.1MB/s]
optimizer.pt: 51% 909M/1.78G [00:23<00:20, 42.1MB/s]
model.safetensors: 94% 838M/892M [00:23<00:02, 24.7MB/s]
optimizer.pt: 51% 914M/1.78G [00:23<00:26, 32.4MB/s]
optimizer.pt: 52% 927M/1.78G [00:23<00:17, 49.7MB/s]
model.safetensors: 95% 848M/892M [00:24<00:01, 26.0MB/s]
model.safetensors: 96% 857M/892M [00:24<00:01, 32.1MB/s]
model.safetensors: 97% 862M/892M [00:24<00:00, 34.4MB/s]
optimizer.pt: 53% 938M/1.78G [00:24<00:21, 38.9MB/s]
model.safetensors: 88% 784M/892M [00:23<00:03, 29.1MB/s]
model.safetensors: 89% 793M/892M [00:23<00:02, 38.7MB/s]
optimizer.pt: 53% 944M/1.78G [00:24<00:24, 33.8MB/s]
optimizer.pt: 54% 957M/1.78G [00:24<00:16, 51.6MB/s]
model.safetensors: 97% 867M/892M [00:24<00:01, 22.0MB/s]
model.safetensors: 98% 873M/892M [00:24<00:00, 26.1MB/s]
model.safetensors: 91% 814M/892M [00:24<00:01, 44.3MB/s]
model.safetensors: 99% 879M/892M [00:25<00:00, 29.1MB/s]
optimizer.pt: 54% 970M/1.78G [00:25<00:19, 41.3MB/s]
model.safetensors: 100% 889M/892M [00:25<00:00, 32.8MB/s]
model.safetensors: 93% 827M/892M [00:24<00:01, 41.4MB/s]
optimizer.pt: 55% 976M/1.78G [00:25<00:22, 35.3MB/s]
optimizer.pt: 55% 983M/1.78G [00:25<00:20, 39.6MB/s]
model.safetensors: 100% 892M/892M [00:25<00:00, 34.8MB/s]
model.safetensors: 94% 842M/892M [00:24<00:01, 48.6MB/s]
optimizer.pt: 56% 992M/1.78G [00:25<00:24, 32.6MB/s]
model.safetensors: 95% 848M/892M [00:24<00:01, 40.8MB/s]
optimizer.pt: 56% 1.00G/1.78G [00:25<00:18, 42.0MB/s]
model.safetensors: 96% 856M/892M [00:25<00:00, 48.8MB/s]
Upload 11 LFS files: 9% 1/11 [00:25<04:19, 25.95s/it]
optimizer.pt: 56% 1.01G/1.78G [00:26<00:17, 44.3MB/s]
model.safetensors: 97% 862M/892M [00:25<00:00, 49.3MB/s]
optimizer.pt: 57% 1.01G/1.78G [00:26<00:19, 39.1MB/s]
model.safetensors: 97% 868M/892M [00:25<00:00, 41.6MB/s]
optimizer.pt: 57% 1.02G/1.78G [00:26<00:17, 42.6MB/s]
model.safetensors: 98% 874M/892M [00:25<00:00, 44.6MB/s]
model.safetensors: 99% 880M/892M [00:25<00:00, 37.8MB/s]
optimizer.pt: 57% 1.02G/1.78G [00:26<00:22, 33.6MB/s]
model.safetensors: 100% 889M/892M [00:25<00:00, 49.3MB/s]
optimizer.pt: 58% 1.03G/1.78G [00:26<00:18, 40.2MB/s]
model.safetensors: 100% 892M/892M [00:25<00:00, 34.3MB/s]
optimizer.pt: 59% 1.05G/1.78G [00:27<00:18, 39.9MB/s]
optimizer.pt: 59% 1.06G/1.78G [00:27<00:21, 33.9MB/s]
optimizer.pt: 60% 1.07G/1.78G [00:27<00:14, 48.8MB/s]
optimizer.pt: 60% 1.08G/1.78G [00:27<00:18, 38.0MB/s]
optimizer.pt: 61% 1.09G/1.78G [00:27<00:14, 49.0MB/s]
optimizer.pt: 61% 1.09G/1.78G [00:28<00:16, 42.4MB/s]
optimizer.pt: 62% 1.10G/1.78G [00:28<00:16, 41.0MB/s]
optimizer.pt: 63% 1.12G/1.78G [00:28<00:11, 56.8MB/s]
optimizer.pt: 63% 1.13G/1.78G [00:29<00:19, 33.8MB/s]
optimizer.pt: 64% 1.14G/1.78G [00:29<00:18, 35.4MB/s]
optimizer.pt: 65% 1.15G/1.78G [00:29<00:12, 49.1MB/s]
optimizer.pt: 65% 1.16G/1.78G [00:29<00:13, 46.1MB/s]
optimizer.pt: 65% 1.17G/1.78G [00:30<00:17, 35.1MB/s]
optimizer.pt: 66% 1.18G/1.78G [00:30<00:12, 48.2MB/s]
optimizer.pt: 67% 1.19G/1.78G [00:30<00:14, 41.5MB/s]
optimizer.pt: 67% 1.20G/1.78G [00:30<00:16, 36.1MB/s]
optimizer.pt: 68% 1.21G/1.78G [00:30<00:11, 49.6MB/s]
optimizer.pt: 69% 1.22G/1.78G [00:31<00:15, 35.9MB/s]
optimizer.pt: 69% 1.23G/1.78G [00:31<00:18, 29.2MB/s]
optimizer.pt: 70% 1.25G/1.78G [00:31<00:12, 41.6MB/s]
optimizer.pt: 70% 1.25G/1.78G [00:32<00:13, 38.2MB/s]
optimizer.pt: 71% 1.26G/1.78G [00:32<00:12, 40.6MB/s]
optimizer.pt: 72% 1.28G/1.78G [00:32<00:09, 53.3MB/s]
optimizer.pt: 72% 1.28G/1.78G [00:32<00:10, 47.6MB/s]
optimizer.pt: 73% 1.30G/1.78G [00:33<00:11, 42.7MB/s]
optimizer.pt: 73% 1.31G/1.78G [00:33<00:08, 56.5MB/s]
optimizer.pt: 74% 1.32G/1.78G [00:33<00:09, 46.9MB/s]
optimizer.pt: 74% 1.33G/1.78G [00:33<00:14, 31.7MB/s]
optimizer.pt: 75% 1.34G/1.78G [00:34<00:09, 44.1MB/s]
optimizer.pt: 76% 1.35G/1.78G [00:34<00:12, 34.9MB/s]
optimizer.pt: 76% 1.36G/1.78G [00:34<00:11, 35.3MB/s]
optimizer.pt: 77% 1.37G/1.78G [00:34<00:08, 47.7MB/s]
optimizer.pt: 77% 1.38G/1.78G [00:35<00:09, 41.2MB/s]
optimizer.pt: 78% 1.39G/1.78G [00:35<00:12, 30.8MB/s]
optimizer.pt: 79% 1.41G/1.78G [00:35<00:08, 43.1MB/s]
optimizer.pt: 79% 1.41G/1.78G [00:36<00:09, 40.6MB/s]
optimizer.pt: 80% 1.42G/1.78G [00:36<00:10, 35.8MB/s]
optimizer.pt: 81% 1.44G/1.78G [00:36<00:07, 49.1MB/s]
optimizer.pt: 81% 1.45G/1.78G [00:36<00:07, 42.8MB/s]
optimizer.pt: 82% 1.46G/1.78G [00:37<00:08, 39.9MB/s]
optimizer.pt: 82% 1.47G/1.78G [00:37<00:05, 53.8MB/s]
optimizer.pt: 83% 1.48G/1.78G [00:37<00:07, 43.4MB/s]
optimizer.pt: 83% 1.49G/1.78G [00:37<00:07, 38.8MB/s]
optimizer.pt: 84% 1.50G/1.78G [00:37<00:05, 52.5MB/s]
optimizer.pt: 85% 1.51G/1.78G [00:38<00:07, 38.2MB/s]
optimizer.pt: 85% 1.52G/1.78G [00:38<00:07, 35.6MB/s]
optimizer.pt: 86% 1.53G/1.78G [00:38<00:05, 48.5MB/s]
optimizer.pt: 86% 1.54G/1.78G [00:39<00:06, 39.2MB/s]
optimizer.pt: 87% 1.55G/1.78G [00:39<00:05, 38.9MB/s]
optimizer.pt: 88% 1.57G/1.78G [00:39<00:04, 51.7MB/s]
optimizer.pt: 88% 1.57G/1.78G [00:39<00:04, 48.2MB/s]
optimizer.pt: 89% 1.58G/1.78G [00:39<00:04, 40.3MB/s]
optimizer.pt: 90% 1.60G/1.78G [00:40<00:03, 54.1MB/s]
optimizer.pt: 90% 1.61G/1.78G [00:40<00:03, 47.2MB/s]
optimizer.pt: 91% 1.62G/1.78G [00:40<00:04, 36.7MB/s]
optimizer.pt: 91% 1.63G/1.78G [00:40<00:03, 49.6MB/s]
optimizer.pt: 92% 1.64G/1.78G [00:41<00:04, 31.0MB/s]
optimizer.pt: 92% 1.65G/1.78G [00:41<00:04, 30.7MB/s]
optimizer.pt: 93% 1.66G/1.78G [00:41<00:02, 42.7MB/s]
optimizer.pt: 94% 1.67G/1.78G [00:42<00:02, 39.4MB/s]
optimizer.pt: 94% 1.68G/1.78G [00:42<00:02, 47.8MB/s]
optimizer.pt: 95% 1.69G/1.78G [00:42<00:02, 47.7MB/s]
optimizer.pt: 95% 1.70G/1.78G [00:42<00:02, 41.4MB/s]
optimizer.pt: 96% 1.71G/1.78G [00:42<00:01, 56.8MB/s]
optimizer.pt: 96% 1.72G/1.78G [00:43<00:01, 47.8MB/s]
optimizer.pt: 97% 1.73G/1.78G [00:43<00:01, 40.4MB/s]
optimizer.pt: 98% 1.74G/1.78G [00:43<00:00, 54.0MB/s]
optimizer.pt: 98% 1.75G/1.78G [00:43<00:00, 46.2MB/s]
optimizer.pt: 99% 1.76G/1.78G [00:44<00:00, 43.0MB/s]
optimizer.pt: 99% 1.77G/1.78G [00:44<00:00, 56.2MB/s]
optimizer.pt: 100% 1.78G/1.78G [00:44<00:00, 39.8MB/s]
Upload 11 LFS files: 100% 11/11 [00:45<00:00, 4.11s/it]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK
INFO Running jobs: [7899]
INFO Killing PID: 7899
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK
INFO Running jobs: []
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO Running jobs: []
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK
INFO Running jobs: []
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO Running jobs: []
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK
INFO Running jobs: []
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO Running jobs: []
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK
INFO Running jobs: []
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO Running jobs: []
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK
INFO Running jobs: []
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO Running jobs: []
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO Running jobs: []
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK
INFO Running jobs: []
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK
INFO Running jobs: []
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK
INFO Running jobs: []
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO Running jobs: []
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
from autotrain-advanced.
I tried several things in a fork, https://github.com/tombenj/autotrain-advanced/commits/length/, such as adding max_new_tokens to the model generation call:
tombenj@c9741b9
as suggested here:
https://www.markhneedham.com/blog/2023/06/19/huggingface-max-length-generation-length-deprecated/
But the responses are still cut off at 20 tokens.
@abhishekkrthakur can you point me in a direction to resolve this?
from autotrain-advanced.
Ohh, those are the default parameters. You can change the default params: https://huggingface.co/docs/hub/models-widgets#example-outputs
from autotrain-advanced.
@abhishekkrthakur it has nothing to do with the default params. Training Facebook's BART results in good output; training T5 models gives outputs of at most 20 tokens.
from autotrain-advanced.
can you share the trained model repo?
from autotrain-advanced.
@abhishekkrthakur yep here is an example:
https://huggingface.co/tombenj/tuple-1k-t5
Getting only 20 tokens as output.
from autotrain-advanced.
Does changing the params here have no effect? https://huggingface.co/tombenj/tuple-1k-t5/blob/main/config.json#L29
from autotrain-advanced.
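For context on the question above: in Hugging Face model configs, length-related settings can appear either at the top level or nested under `task_specific_params`, and this thread does not establish which of them a given inference path actually reads. The sketch below is a hypothetical helper (`find_length_settings` is not part of any library) that lists every length-related key in a config dictionary, so it is at least clear what was set where. The sample config is made up for illustration and is not copied from the repository linked above.

```python
import json
from typing import Any, Dict

# Keys that commonly control output length in Hugging Face configs.
LENGTH_KEYS = ("max_length", "max_new_tokens", "min_length")


def find_length_settings(config: Dict[str, Any]) -> Dict[str, Any]:
    """Collect length-related keys from a config dict, including nested task params."""
    found: Dict[str, Any] = {}
    # Top-level settings, e.g. {"max_length": 256}.
    for key in LENGTH_KEYS:
        if key in config:
            found[key] = config[key]
    # Nested settings, e.g. {"task_specific_params": {"summarization": {...}}}.
    task_params = config.get("task_specific_params") or {}
    for task, params in task_params.items():
        for key in LENGTH_KEYS:
            if key in params:
                found[f"task_specific_params.{task}.{key}"] = params[key]
    return found


# Illustrative config fragment (an assumption, not the real config.json).
sample = json.loads(
    '{"model_type": "t5", "task_specific_params": '
    '{"summarization": {"max_length": 200, "min_length": 30}}}'
)
print(find_length_settings(sample))
```

To check a real model, the same function can be applied to the parsed contents of a downloaded `config.json`.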
@abhishekkrthakur I changed them here and still get the same max 20-token output:
https://huggingface.co/tombenj/tuple-1k-t5/commit/0868248619d5a457bc52a13af26af94d93a436b1
https://huggingface.co/tombenj/tuple-1k-t5/commit/6823fe355c7fd90a9fd0bfa6b72e8784bebb0b16
from autotrain-advanced.
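One way to check whether the 20-token cutoff comes from generation settings rather than from training is to load the model locally and pass an explicit length limit to `generate()`. In the `transformers` library, when neither `max_new_tokens` nor `max_length` is supplied and the checkpoint's configuration does not set one, generation falls back to a default `max_length` of 20. The sketch below assumes `transformers` and PyTorch are installed; the helper names `generation_kwargs` and `generate_text` and the value 256 are illustrative choices, not something taken from autotrain-advanced.

```python
from typing import Any, Dict


def generation_kwargs(max_new_tokens: int = 256) -> Dict[str, Any]:
    """Build explicit length settings to pass to model.generate()."""
    if max_new_tokens <= 0:
        raise ValueError("max_new_tokens must be positive")
    return {"max_new_tokens": max_new_tokens}


def generate_text(model_id: str, prompt: str, max_new_tokens: int = 256) -> str:
    """Load a seq2seq model from the Hub and generate with an explicit length limit."""
    # Imported lazily so generation_kwargs() works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Tokenize the prompt into PyTorch tensors.
    inputs = tokenizer(prompt, return_tensors="pt")

    # The explicit max_new_tokens overrides the default length limit.
    output_ids = model.generate(**inputs, **generation_kwargs(max_new_tokens))
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

For example, `generate_text("tombenj/tuple-1k-t5", "your input text")` would run the model shared in this thread. If the output is longer than 20 tokens when called this way, the limit is likely coming from the generation defaults used at inference time rather than from the trained weights.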
@abhishekkrthakur any updates on this?
from autotrain-advanced.
This issue is stale because it has been open for 15 days with no activity.
from autotrain-advanced.
This issue was closed because it has been inactive for 2 days since being marked as stale.
from autotrain-advanced.