Comments (16)

tombenj commented on September 22, 2024

So it seems that Seq2Seq training with Google models such as T5, mT5, etc. is limited to 20 output tokens due to this? I.e., the required params are not passed through?

abhishekkrthakur commented on September 22, 2024

No, that's just validation. Inference doesn't matter.

tombenj commented on September 22, 2024

> No, that's just validation. Inference doesn't matter.

So any idea why the inference output is always 20 tokens, while when I train using BART I get 256+?
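
For context: when nothing sets a generation length, transformers falls back to the model-agnostic default of max_length=20, which matches the 20-token outputs described here. A minimal sketch of overriding it at plain-transformers inference time (outside the autotrain inference path):

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("google-t5/t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base")

inputs = tok("summarize: The quick brown fox jumped over the lazy dog.", return_tensors="pt")

# Without an explicit limit, generate() uses generation_config.max_length,
# which defaults to 20 tokens; max_new_tokens lifts that cap.
out = model.generate(**inputs, max_new_tokens=256)
print(tok.decode(out[0], skip_special_tokens=True))
```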

tombenj commented on September 22, 2024

Seems as though the Seq2Seq args aren't passing through (especially for Google models).

abhishekkrthakur commented on September 22, 2024

taking a look!

tombenj commented on September 22, 2024

Full trace attached. The model is still generating a max of only 20 tokens:
```
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data] Package punkt is already up-to-date!

WARNING Parameters not supplied by user and set to default: push_to_hub, model_ref, auto_find_batch_size, add_eos_token, data_path, lr, project_name, disable_gradient_checkpointing, logging_steps, optimizer, token, seed, lora_dropout, lora_r, rejected_text_column, batch_size, prompt_text_column, model_max_length, weight_decay, max_grad_norm, merge_adapter, gradient_accumulation, use_flash_attention_2, scheduler, valid_split, trainer, text_column, username, repo_id, lora_alpha, model, save_strategy, warmup_ratio, evaluation_strategy, save_total_limit, train_split, dpo_beta
WARNING Parameters not supplied by user and set to default: batch_size, epochs, log, weight_decay, max_grad_norm, auto_find_batch_size, max_seq_length, gradient_accumulation, scheduler, data_path, lr, valid_split, text_column, username, project_name, target_column, repo_id, logging_steps, optimizer, token, model, save_strategy, seed, warmup_ratio, save_total_limit, evaluation_strategy, push_to_hub, train_split
WARNING Parameters not supplied by user and set to default: batch_size, epochs, log, weight_decay, max_grad_norm, auto_find_batch_size, gradient_accumulation, scheduler, data_path, lr, username, valid_split, image_column, project_name, repo_id, target_column, logging_steps, optimizer, token, model, save_strategy, seed, warmup_ratio, save_total_limit, evaluation_strategy, push_to_hub, train_split
WARNING Parameters supplied but not used: target_modules
WARNING Parameters not supplied by user and set to default: epochs, auto_find_batch_size, data_path, lr, project_name, logging_steps, token, optimizer, seed, lora_dropout, lora_r, target_modules, max_target_length, batch_size, weight_decay, max_grad_norm, max_seq_length, gradient_accumulation, scheduler, username, valid_split, text_column, target_column, repo_id, lora_alpha, model, save_strategy, warmup_ratio, evaluation_strategy, save_total_limit, train_split, peft, push_to_hub, quantization
WARNING Parameters not supplied by user and set to default: id_column, categorical_columns, num_trials, numerical_columns, data_path, username, valid_split, repo_id, project_name, task, token, target_columns, model, seed, train_split, push_to_hub, time_limit
WARNING Parameters not supplied by user and set to default: resume_from_checkpoint, epochs, lr_power, tokenizer_max_length, validation_images, adam_beta1, num_cycles, num_class_images, project_name, token, pre_compute_text_embeddings, sample_batch_size, allow_tf32, xl, num_validation_images, seed, scale_lr, validation_epochs, checkpoints_total_limit, class_prompt, revision, rank, adam_weight_decay, prior_preservation, class_image_path, prior_loss_weight, max_grad_norm, adam_epsilon, scheduler, username, tokenizer, text_encoder_use_attention_mask, image_path, dataloader_num_workers, repo_id, class_labels_conditioning, prior_generation_precision, model, adam_beta2, validation_prompt, local_rank, checkpointing_steps, center_crop, push_to_hub, logging, warmup_steps
WARNING Parameters not supplied by user and set to default: tags_column, batch_size, epochs, log, tokens_column, weight_decay, max_grad_norm, auto_find_batch_size, max_seq_length, gradient_accumulation, scheduler, data_path, lr, valid_split, username, repo_id, project_name, logging_steps, optimizer, token, model, save_strategy, seed, warmup_ratio, save_total_limit, evaluation_strategy, push_to_hub, train_split
INFO AutoTrain Public URL: NgrokTunnel: "https://b320-34-72-237-89.ngrok-free.app/" -> "http://localhost:7860/"
INFO Please wait for the app to load...
INFO ***
INFO: Started server process [7599]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://127.0.0.1:7860/ (Press CTRL+C to quit)
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET / HTTP/1.1" 200 OK
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /logo.png HTTP/1.1" 200 OK
INFO Task: llm:sft
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /params/llm%3Asft HTTP/1.1" 200 OK
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /model_choices/llm%3Asft HTTP/1.1" 200 OK
INFO Running jobs: []
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /favicon.ico HTTP/1.1" 404 Not Found
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK
INFO Running jobs: []
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO Task: seq2seq
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /params/seq2seq HTTP/1.1" 200 OK
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /model_choices/seq2seq HTTP/1.1" 200 OK
INFO Running jobs: []
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK
INFO Running jobs: []
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK
INFO Running jobs: []
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO Running jobs: []
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK
INFO Running jobs: []
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO hardware: Local
INFO Running jobs: []
INFO Task: seq2seq
INFO Column mapping: {'text': 'text', 'label': 'target'}
INFO Dataset: autotrain-gsxqu-k795g (seq2seq)
Train data: [<tempfile.SpooledTemporaryFile object at 0x7e16bada5b40>]
Valid data: []
Column mapping: {'text': 'text', 'label': 'target'}

Saving the dataset (1/1 shards): 100% 800/800 [00:00<00:00, 238160.49 examples/s]
Saving the dataset (1/1 shards): 100% 200/200 [00:00<00:00, 86072.32 examples/s]

WARNING Parameters not supplied by user and set to default: train_split
WARNING Parameters supplied but not used: model_max_length, max_length, max_new_tokens
INFO Starting local training...
INFO {"data_path":"autotrain-gsxqu-k795g/autotrain-data","model":"google-t5/t5-base","username":"tombenj","seed":42,"train_split":"train","valid_split":"validation","project_name":"autotrain-gsxqu-k795g","token":"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","push_to_hub":true,"text_column":"autotrain_text","target_column":"autotrain_label","repo_id":"tombenj/autotrain-gsxqu-k795g","lr":0.00005,"epochs":1,"max_seq_length":1024,"max_target_length":1024,"batch_size":8,"warmup_ratio":0.1,"gradient_accumulation":1,"optimizer":"adamw_torch","scheduler":"linear","weight_decay":0.0,"max_grad_norm":1.0,"logging_steps":-1,"evaluation_strategy":"epoch","auto_find_batch_size":false,"mixed_precision":"fp16","save_total_limit":1,"save_strategy":"epoch","peft":false,"quantization":null,"lora_r":16,"lora_alpha":32,"lora_dropout":0.05,"target_modules":["all-linear"]}
INFO ['accelerate', 'launch', '--num_machines', '1', '--num_processes', '1', '--mixed_precision', 'fp16', '-m', 'autotrain.trainers.seq2seq', '--training_config', 'autotrain-gsxqu-k795g/training_params.json']
INFO Training PID: 7899
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "POST /create_project HTTP/1.1" 200 OK
The following values were not passed to accelerate launch and had defaults used instead:
--dynamo_backend was set to a value of 'no'
To avoid this warning pass in values for each of the problematic parameters or run accelerate config.
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
🚀 INFO | 2024-03-13 09:15:12 | main🚋45 - Starting training...
🚀 INFO | 2024-03-13 09:15:12 | main🚋46 - Training config: {'data_path': 'autotrain-gsxqu-k795g/autotrain-data', 'model': 'google-t5/t5-base', 'username': 'tombenj', 'seed': 42, 'train_split': 'train', 'valid_split': 'validation', 'project_name': 'autotrain-gsxqu-k795g', 'token': '*****', 'push_to_hub': True, 'text_column': 'autotrain_text', 'target_column': 'autotrain_label', 'repo_id': 'tombenj/autotrain-gsxqu-k795g', 'lr': 5e-05, 'epochs': 1, 'max_seq_length': 1024, 'max_target_length': 1024, 'batch_size': 8, 'warmup_ratio': 0.1, 'gradient_accumulation': 1, 'optimizer': 'adamw_torch', 'scheduler': 'linear', 'weight_decay': 0.0, 'max_grad_norm': 1.0, 'logging_steps': -1, 'evaluation_strategy': 'epoch', 'auto_find_batch_size': False, 'mixed_precision': 'fp16', 'save_total_limit': 1, 'save_strategy': 'epoch', 'peft': False, 'quantization': None, 'lora_r': 16, 'lora_alpha': 32, 'lora_dropout': 0.05, 'target_modules': ['all-linear']}
🚀 INFO | 2024-03-13 09:15:12 | main🚋53 - loading dataset from disk
🚀 INFO | 2024-03-13 09:15:12 | main🚋64 - loading dataset from disk
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
/usr/local/lib/python3.10/dist-packages/transformers/models/t5/tokenization_t5_fast.py:171: FutureWarning: This tokenizer was incorrectly instantiated with a model max length of 512 which will be corrected in Transformers v5.
For now, this behavior is kept to avoid breaking backwards compatibility when padding/encoding with truncation is True.

  • Be aware that you SHOULD NOT rely on google-t5/t5-base automatically truncating your input to 512 when padding/encoding.
  • If you want to encode/pad to sequences longer than 512 you can either instantiate this tokenizer with model_max_length or pass max_length when encoding/padding.
  • To avoid this warning, please instantiate this tokenizer with model_max_length set to your preferred value.
    warnings.warn(

INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK
{'loss': 10.2023, 'grad_norm': 29.899457931518555, 'learning_rate': 1.5e-05, 'epoch': 0.05}
8% 8/100 [00:04<00:39, 2.33it/s]
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
{'loss': 6.8607, 'grad_norm': 57.67634963989258, 'learning_rate': 3.5e-05, 'epoch': 0.1}
{'loss': 2.1809, 'grad_norm': 4.584061622619629, 'learning_rate': 4.888888888888889e-05, 'epoch': 0.15}
{'loss': 1.2057, 'grad_norm': 3.255554437637329, 'learning_rate': 4.6111111111111115e-05, 'epoch': 0.2}
20% 20/100 [00:09<00:34, 2.35it/s]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
{'loss': 0.8056, 'grad_norm': 2.723172426223755, 'learning_rate': 4.3333333333333334e-05, 'epoch': 0.25}
{'loss': 0.5849, 'grad_norm': 1.7794570922851562, 'learning_rate': 4.055555555555556e-05, 'epoch': 0.3}
32% 32/100 [00:14<00:28, 2.36it/s]
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
{'loss': 0.418, 'grad_norm': 1.8489391803741455, 'learning_rate': 3.777777777777778e-05, 'epoch': 0.35}
{'loss': 0.3179, 'grad_norm': 1.1671098470687866, 'learning_rate': 3.5e-05, 'epoch': 0.4}
44% 44/100 [00:19<00:23, 2.36it/s]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
{'loss': 0.2652, 'grad_norm': 1.281832218170166, 'learning_rate': 3.222222222222223e-05, 'epoch': 0.45}
{'loss': 0.2405, 'grad_norm': 0.9086970686912537, 'learning_rate': 2.9444444444444448e-05, 'epoch': 0.5}
{'loss': 0.2329, 'grad_norm': 1.1303473711013794, 'learning_rate': 2.6666666666666667e-05, 'epoch': 0.55}
55% 55/100 [00:24<00:19, 2.32it/s]
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
{'loss': 0.2199, 'grad_norm': 1.0673601627349854, 'learning_rate': 2.3888888888888892e-05, 'epoch': 0.6}
{'loss': 0.2137, 'grad_norm': 0.8874663710594177, 'learning_rate': 2.111111111111111e-05, 'epoch': 0.65}
67% 67/100 [00:29<00:14, 2.35it/s]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
{'loss': 0.1876, 'grad_norm': 0.7264275550842285, 'learning_rate': 1.8333333333333333e-05, 'epoch': 0.7}
{'loss': 0.1834, 'grad_norm': 0.8036168217658997, 'learning_rate': 1.5555555555555555e-05, 'epoch': 0.75}
78% 78/100 [00:34<00:09, 2.30it/s]
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
{'loss': 0.1952, 'grad_norm': 0.6779347062110901, 'learning_rate': 1.2777777777777777e-05, 'epoch': 0.8}
{'loss': 0.1827, 'grad_norm': 0.7915838360786438, 'learning_rate': 1e-05, 'epoch': 0.85}
{'loss': 0.187, 'grad_norm': 0.8215489387512207, 'learning_rate': 7.222222222222222e-06, 'epoch': 0.9}
{'loss': 0.1734, 'grad_norm': 0.7938928604125977, 'learning_rate': 4.444444444444445e-06, 'epoch': 0.95}
{'loss': 0.172, 'grad_norm': 0.6198846697807312, 'learning_rate': 1.6666666666666667e-06, 'epoch': 1.0}
100% 100/100 [00:43<00:00, 2.36it/s]
/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py:1178: UserWarning: Using the model-agnostic default max_length (=20) to control the generation length. We recommend setting max_new_tokens to control the maximum length of the generation.
warnings.warn(

0% 0/13 [00:00<?, ?it/s]
15% 2/13 [00:00<00:05, 2.07it/s]
23% 3/13 [00:02<00:07, 1.36it/s]
31% 4/13 [00:02<00:07, 1.25it/s]
38% 5/13 [00:03<00:06, 1.23it/s]
46% 6/13 [00:04<00:05, 1.20it/s]
54% 7/13 [00:05<00:05, 1.19it/s]
62% 8/13 [00:06<00:04, 1.18it/s]
69% 9/13 [00:07<00:03, 1.17it/s]
77% 10/13 [00:08<00:02, 1.16it/s]
85% 11/13 [00:09<00:01, 1.15it/s]
92% 12/13 [00:09<00:00, 1.15it/s]

{'eval_loss': 0.1557321548461914, 'eval_rouge1': 14.9506, 'eval_rouge2': 12.1047, 'eval_rougeL': 14.938, 'eval_rougeLsum': 14.9251, 'eval_gen_len': 19.0, 'eval_runtime': 12.0496, 'eval_samples_per_second': 16.598, 'eval_steps_per_second': 1.079, 'epoch': 1.0}
100% 100/100 [00:55<00:00, 2.36it/s]
100% 13/13 [00:11<00:00, 1.25it/s]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK

INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
There were missing keys in the checkpoint model loaded: ['encoder.embed_tokens.weight', 'decoder.embed_tokens.weight', 'lm_head.weight'].
{'train_runtime': 78.2901, 'train_samples_per_second': 10.218, 'train_steps_per_second': 1.277, 'train_loss': 1.2514769697189332, 'epoch': 1.0}
100% 100/100 [01:18<00:00, 1.28it/s]
🚀 INFO | 2024-03-13 09:16:38 | main🚋204 - Finished training, saving model...
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK
/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py:1178: UserWarning: Using the model-agnostic default max_length (=20) to control the generation length. We recommend setting max_new_tokens to control the maximum length of the generation.
warnings.warn(
15% 2/13 [00:01<00:05, 1.92it/s]
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
54% 7/13 [00:05<00:05, 1.19it/s]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
100% 13/13 [00:11<00:00, 1.18it/s]
🚀 INFO | 2024-03-13 09:16:54 | main🚋218 - Pushing model to hub...
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO Running jobs: [7899]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
model.safetensors: 0% 0.00/892M [00:00<?, ?B/s]
rng_state.pth: 0% 0.00/14.2k [00:00<?, ?B/s]

optimizer.pt: 0% 0.00/1.78G [00:00<?, ?B/s]

spiece.model: 0% 0.00/792k [00:00<?, ?B/s]

Upload 11 LFS files: 0% 0/11 [00:00<?, ?it/s]

scheduler.pt: 0% 0.00/1.06k [00:00<?, ?B/s]

[... several hundred interleaved upload-progress lines trimmed; the nine smaller files (scheduler.pt, rng_state.pth, spiece.model, training_args.bin, event logs) finish within seconds, while model.safetensors and optimizer.pt continue uploading ...]
model.safetensors: 100% 892M/892M [00:25<00:00, 34.3MB/s]
optimizer.pt: 100% 1.78G/1.78G [00:44<00:00, 39.8MB/s]

Upload 11 LFS files: 100% 11/11 [00:45<00:00, 4.11s/it]
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK

INFO Running jobs: [7899]
INFO Killing PID: 7899
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /accelerators HTTP/1.1" 200 OK
INFO Running jobs: []
INFO: 2a02:14f:1f5:eba7:6821:2e7d:a0aa:87b4:0 - "GET /is_model_training HTTP/1.1" 200 OK
[... idle polling of /is_model_training and /accelerators repeats until the session ends ...]
```

tombenj commented on September 22, 2024

Tried several things in a fork https://github.com/tombenj/autotrain-advanced/commits/length/ such as adding max_new_tokens to the model generation:
tombenj@c9741b9

As suggested here:
https://www.markhneedham.com/blog/2023/06/19/huggingface-max-length-generation-length-deprecated/
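
The fix described there amounts to persisting an explicit generation length with the model. A rough sketch of the idea (not the exact fork diff):

```python
from transformers import AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base")

# Raise the default generation length before saving, so it is written to
# generation_config.json and picked up by later generate() calls.
model.generation_config.max_new_tokens = 256
model.save_pretrained("my-finetuned-t5")  # hypothetical output dir
```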

But still getting responses cut off at 20 tokens:
[Screenshot: Screen Shot 2024-03-14 at 13 58 12, output truncated at 20 tokens]

@abhishekkrthakur can you point to a direction how to resolve this?

abhishekkrthakur commented on September 22, 2024

Ohh, those are the default parameters. You can change the default params: https://huggingface.co/docs/hub/models-widgets#example-outputs
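
Concretely, the Inference API widget reads its defaults from the model card metadata. Assuming the widget honors inference.parameters for this task, one way to set it programmatically is huggingface_hub's metadata_update:

```python
from huggingface_hub import metadata_update

# Bump the widget's generation length for the repo (value is illustrative).
metadata_update(
    "tombenj/tuple-1k-t5",
    {"inference": {"parameters": {"max_new_tokens": 256}}},
    overwrite=True,
)
```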

tombenj commented on September 22, 2024

@abhishekkrthakur it has nothing to do with the default params. Training Facebook's BART results in good output; training T5 gives max 20-token lengths.

abhishekkrthakur commented on September 22, 2024

Can you share the trained model repo?

tombenj commented on September 22, 2024

@abhishekkrthakur yep, here is an example:
https://huggingface.co/tombenj/tuple-1k-t5

Getting only 20 tokens as output.

abhishekkrthakur commented on September 22, 2024

Changing the params here has no effect: https://huggingface.co/tombenj/tuple-1k-t5/blob/main/config.json#L29?

tombenj commented on September 22, 2024

@abhishekkrthakur changed it here and am still getting the same max-20-token output:
https://huggingface.co/tombenj/tuple-1k-t5/commit/0868248619d5a457bc52a13af26af94d93a436b1
https://huggingface.co/tombenj/tuple-1k-t5/commit/6823fe355c7fd90a9fd0bfa6b72e8784bebb0b16
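
A plausible reason these config.json edits have no effect: newer transformers versions save generation defaults to a separate generation_config.json at training time, and generate() prefers that file over legacy fields in config.json. A hedged sketch of overriding that file directly:

```python
from transformers import GenerationConfig

gen = GenerationConfig(max_new_tokens=256)

# Writes/overwrites generation_config.json in the repo, which generate()
# consults before any max_length left in config.json.
gen.push_to_hub("tombenj/tuple-1k-t5")
```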

tombenj commented on September 22, 2024

@abhishekkrthakur any updates on this?

github-actions commented on September 22, 2024

This issue is stale because it has been open for 15 days with no activity.

github-actions commented on September 22, 2024

This issue was closed because it has been inactive for 2 days since being marked as stale.
