Comments (7)
@abhishekkrthakur are you able to give any insight on this? It seems like a major bug.
Does it mean that increasing the model_max_length (or block_size) while keeping the data length the same will affect the fine-tuning process?
from autotrain-advanced.
@abhishekkrthakur sorry, any insights on this?
It seems that when I increase the block_size / model_max_length during fine-tuning to be much greater than the input token length, the model is no longer able to learn from the fine-tuning (even though it's severely overfitted).
Please be patient, @jackshiwl. Many times, an immediate response is not possible :)
If your sentences are short and you are using a large max length, there will be too many padding tokens, which may account for the model not learning properly. Given your data, you should choose the hyperparameters best suited to the model you are training. This is not a bug.
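To put a number on the padding argument, here is a minimal sketch that estimates how much of a batch would be padding. It assumes every sample is padded (or truncated) to the block size, which may not be what AutoTrain actually does with its default settings; `padding_fraction` is a hypothetical helper written for illustration, not part of any library.

```python
def padding_fraction(token_lengths, block_size):
    """Estimate the share of positions that are padding.

    Assumes each sample is padded or truncated to exactly block_size.
    This is an illustrative assumption, not confirmed AutoTrain behavior.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    if not token_lengths:
        return 0.0
    # Tokens beyond block_size are truncated, so they do not count as real.
    real_tokens = sum(min(n, block_size) for n in token_lengths)
    total_positions = block_size * len(token_lengths)
    return 1.0 - real_tokens / total_positions


# Hypothetical dataset: 42 copies of one sample of about 500 tokens.
lengths = [500] * 42
for block_size in (1024, 2048, 4096):
    share = padding_fraction(lengths, block_size)
    print(f"block_size={block_size}: {share:.1%} padding")
```

Under these assumptions, roughly half of the positions are padding at 1024 and close to nine tenths at 4096, which is the kind of imbalance the reply above is pointing at.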
- For the padding args, I have left them at the default, so padding is 'none'. But it doesn't work with padding=right or padding=left.
- Even if there is padding, I have tried to overfit severely by setting lots of epochs, etc. The loss goes down to an abysmally small value, but the model is still not able to recall the sample dataset (it is a single sample repeated 42 times).
- I began testing at 1024, 2048, ... and it all works. But once it hits 4096, it totally stops recalling, even when I increase the epochs drastically.
I am just worried whether it is an issue to use padding=none (the default) in my fine-tuning process, because I have samples that are about 500 tokens, but some are also 4096 (all are trimmed to 4096 max). I am not sure if this will be a problem for fine-tuning. Do you use padding for your own fine-tuning?
I would appreciate it if you could elaborate a little on which padding side you use for your own fine-tuning, and also for inference.
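For readers unsure what the padding side changes, the following sketch pads a toy batch on either side and builds matching attention masks and labels. It is plain Python for illustration only, not AutoTrain's or any tokenizer's implementation. `pad_batch` and the token ids are hypothetical. The only outside fact relied on is that PyTorch's `CrossEntropyLoss` ignores label value -100 by default, so padded positions marked this way do not contribute to the loss.

```python
IGNORE_INDEX = -100  # default ignore_index of PyTorch's CrossEntropyLoss


def pad_batch(sequences, pad_id, side="right"):
    """Pad token-id lists to the longest length in the batch.

    Returns input_ids, attention_mask (1 = real token, 0 = padding) and
    labels in which padded positions are set to IGNORE_INDEX.
    """
    if side not in ("left", "right"):
        raise ValueError("side must be 'left' or 'right'")
    width = max(len(seq) for seq in sequences)
    batch = {"input_ids": [], "attention_mask": [], "labels": []}
    for seq in sequences:
        seq = list(seq)
        n_pad = width - len(seq)
        ids_pad = [pad_id] * n_pad
        mask_pad = [0] * n_pad
        label_pad = [IGNORE_INDEX] * n_pad
        real_mask = [1] * len(seq)
        if side == "right":
            batch["input_ids"].append(seq + ids_pad)
            batch["attention_mask"].append(real_mask + mask_pad)
            batch["labels"].append(seq + label_pad)
        else:
            batch["input_ids"].append(ids_pad + seq)
            batch["attention_mask"].append(mask_pad + real_mask)
            batch["labels"].append(label_pad + seq)
    return batch


# Toy token ids, chosen only for illustration.
right = pad_batch([[5, 6, 7], [8]], pad_id=0, side="right")
left = pad_batch([[5, 6, 7], [8]], pad_id=0, side="left")
print(right["input_ids"])  # [[5, 6, 7], [8, 0, 0]]
print(left["input_ids"])   # [[5, 6, 7], [0, 0, 8]]
```

With left padding the real tokens sit at the end of each row, which is why it is often preferred for batched generation with decoder-only models. Whether that choice matters for this particular issue is not established in the thread.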
This issue is stale because it has been open for 15 days with no activity.
This issue was closed because it has been inactive for 2 days since being marked as stale.