Comments (8)

nicolas-ivanov commented on August 26, 2024

@Spring1212, the vocabulary you are trying to use contains only 26 tokens, which is far too small. Apparently you overwrote the default vocabulary file by running prepare_index_files.py. See the corresponding section of our Readme for instructions on how to fix this: https://github.com/lukalabs/cakechat#training-the-model-from-scratch
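
A quick sanity check along these lines can confirm whether an index file was rebuilt from a tiny corpus. This is only a sketch: it assumes the token index is stored as a JSON object or list, and the function names, the throwaway demo file, and the 1000-token threshold are all hypothetical rather than part of cakechat.

```python
import json
import os
import tempfile


def vocabulary_size(index_path):
    """Return the number of entries in a JSON token index file (assumed format)."""
    with open(index_path, encoding="utf-8") as f:
        index = json.load(f)
    return len(index)


def looks_overwritten(index_path, min_tokens=1000):
    """Heuristic: a very small vocabulary suggests the index was rebuilt
    from a toy corpus. The threshold is an arbitrary assumption."""
    return vocabulary_size(index_path) < min_tokens


# Demo on a throwaway 26-entry index, mimicking the situation in this thread.
demo_index = {str(i): chr(ord("a") + i) for i in range(26)}
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as tmp:
    json.dump(demo_index, tmp)
demo_path = tmp.name

demo_size = vocabulary_size(demo_path)
demo_flag = looks_overwritten(demo_path)
os.remove(demo_path)
print(demo_size, demo_flag)
```

Pointing `vocabulary_size` at the real index file would show whether it still has the size shipped with the pretrained model.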

from cakechat.

Spring12111 commented on August 26, 2024

Is this due to too little training data?

Spring12111 commented on August 26, 2024

Can adding training data avoid this problem?

nicolas-ivanov commented on August 26, 2024

Is this due to too little training data?

You got different index files because you apparently ran prepare_index_files.py, which overwrote the original index files. Even if you had training data in abundance, running this script would still overwrite the index files, so you wouldn't be able to use our pretrained model.

If you just want to use our pretrained model without fine-tuning it on your own data, you don't need to run the prepare_index_files.py script. Restore the default index files by following the link I sent you in the previous message, then run the server with our pretrained model.

And if you do want to fine-tune our model or train your own from scratch, please read the corresponding sections of our Readme carefully: https://github.com/lukalabs/cakechat#training-the-model

Spring12111 commented on August 26, 2024

I followed every step in the "Training the model from scratch" section of the Readme, including adding the training data to the file.

Spring12111 commented on August 26, 2024

Is this still the pretrained model?

nicolas-ivanov commented on August 26, 2024

I think I've found the problem: the server is trying to use a reverse model for the sampling_reranking mode of generating answers and, I assume, you didn't train one, since it's not mentioned in this section of our Readme.

Option 1: change sampling_reranking to sampling here: https://github.com/lukalabs/cakechat/blob/master/cakechat/api/config.py#L4

Option 2: run python tools/train.py -r to train the reverse model.

In either case, please note that you'll need a larger training set with a larger vocabulary to get any meaningful results after training.
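
The choice between the two options can be pictured as a simple fallback. The sketch below is not how cakechat selects its mode: the function name and the idea of checking for a reverse model file on disk are assumptions made for illustration. Only the two mode names, sampling and sampling_reranking, come from the discussion above.

```python
import os


def choose_prediction_mode(reverse_model_path):
    """Pick a mode name based on whether a trained reverse model file exists.

    Hypothetical helper: reranking is assumed to need the reverse model,
    so fall back to plain sampling when the file is missing.
    """
    if reverse_model_path and os.path.isfile(reverse_model_path):
        return "sampling_reranking"
    return "sampling"
```

With no reverse model on disk this returns "sampling" (Option 1). After training the reverse model (Option 2) and passing its path, it would return "sampling_reranking".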

stale commented on August 26, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
