Comments (7)

jackshiwl avatar jackshiwl commented on May 27, 2024

@abhishekkrthakur, are you able to give any insight on this? It seems like a major bug.

Does it mean that increasing model_max_length (or block_size) while keeping the data length the same will affect the fine-tuning process?

from autotrain-advanced.

jackshiwl avatar jackshiwl commented on May 27, 2024

@abhishekkrthakur sorry, any insights on this?

It seems that when I increase the block-size / model_max_length during fine-tuning to be much greater than the input token length, the model can no longer learn from the fine-tuning (even though it is severely overfitted).

abhishekkrthakur avatar abhishekkrthakur commented on May 27, 2024

Please be patient, @jackshiwl. Many times, an immediate response is not possible :)
If your sentences are short and you are using a large max length, there will be too many padding tokens, which may account for the model not learning properly. Given your data, you should choose the hyperparameters best suited to the model you are training. This is not a bug.
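
To put a rough number on the padding point, here is a minimal sketch in plain Python. It assumes every sample is padded to the full block size, which this thread does not confirm for autotrain-advanced. The helper names `pad_to_block` and `padding_fraction`, and the pad id `0`, are hypothetical and chosen only for illustration.

```python
def pad_to_block(token_ids, block_size, pad_id=0):
    """Right-pad (or truncate) one sequence to block_size.

    Returns (input_ids, attention_mask). pad_id=0 is an assumed placeholder;
    a real tokenizer defines its own pad token id.
    """
    ids = list(token_ids)[:block_size]
    mask = [1] * len(ids)
    n_pad = block_size - len(ids)
    return ids + [pad_id] * n_pad, mask + [0] * n_pad


def padding_fraction(lengths, block_size):
    """Share of positions that are padding when every sample is padded to block_size."""
    total = len(lengths) * block_size
    real = sum(min(n, block_size) for n in lengths)
    return 1 - real / total


if __name__ == "__main__":
    # One ~500-token sample at different block sizes.
    for block_size in (1024, 2048, 4096):
        print(block_size, round(padding_fraction([500], block_size), 3))
```

Under that assumption, the share of padded positions for a 500-token sample grows with each doubling of the block size, so at 4096 most positions in the sequence carry no training signal.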

jackshiwl avatar jackshiwl commented on May 27, 2024

@abhishekkrthakur,

  1. For the padding args, I have it set as default, so it is 'none'. But it doesn't work with padding=right or padding=left.
  2. Even if there is padding, I have tried to overfit it severely by setting lots of epochs, etc. The loss goes down to an abysmally small value, but the model is still not able to recall the sample dataset (there is only 1 sample, repeated 42 times).
  3. I began testing from 1024, 2048, ... and it all works. But once it hits 4096, it totally stops recalling, even when I increase the epochs drastically, etc.

I am just worried about whether this is an issue if I use padding=none (default) in my fine-tuning process, because I have samples that are about 500 tokens, but some are also 4096 (all are trimmed to 4096 max). I'm not sure if this will be a problem for fine-tuning. Do you use padding for your own fine-tuning?
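
For mixed lengths like these (about 500 up to 4096 tokens), one common approach is to pad each batch only to its longest sample and to exclude padded positions from the loss. The sketch below illustrates that idea in plain Python. It is not taken from autotrain-advanced, and whether the library does this internally is not established in this thread. The function name `collate_dynamic` and the pad id `0` are hypothetical; `-100` is the default `ignore_index` of PyTorch's `CrossEntropyLoss`.

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def collate_dynamic(batch, pad_id=0, padding_side="right"):
    """Pad each sequence only up to the longest sequence in this batch.

    Padded positions get attention_mask 0 and label IGNORE_INDEX so that
    they would not contribute to a cross-entropy loss.
    pad_id=0 is an assumed placeholder, not a real tokenizer's pad id.
    """
    if padding_side not in ("right", "left"):
        raise ValueError("padding_side must be 'right' or 'left'")

    width = max(len(seq) for seq in batch)
    out = {"input_ids": [], "attention_mask": [], "labels": []}
    for seq in batch:
        seq = list(seq)
        n_pad = width - len(seq)
        pads = [pad_id] * n_pad
        zeros = [0] * n_pad
        ignored = [IGNORE_INDEX] * n_pad
        ones = [1] * len(seq)
        if padding_side == "right":
            ids, mask, labels = seq + pads, ones + zeros, seq + ignored
        else:
            ids, mask, labels = pads + seq, zeros + ones, ignored + seq
        out["input_ids"].append(ids)
        out["attention_mask"].append(mask)
        out["labels"].append(labels)
    return out


if __name__ == "__main__":
    print(collate_dynamic([[11, 12, 13], [21]], padding_side="left"))
```

With this kind of collation, a batch of 500-token samples is never stretched to 4096, and the padding side only changes where the ignored positions sit.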

jackshiwl avatar jackshiwl commented on May 27, 2024

I would appreciate it if you could elaborate a little on which padding side you use for your own fine-tuning, and also for inference.

github-actions avatar github-actions commented on May 27, 2024

This issue is stale because it has been open for 15 days with no activity.

github-actions avatar github-actions commented on May 27, 2024

This issue was closed because it has been inactive for 2 days since being marked as stale.
