Comments (6)

Anthony-Sun-S commented on May 30, 2024

For ERNIE training I am using applications/neural_search/ranking/ernie_matching.

from paddlenlp.

w5688414 commented on May 30, 2024

Which model-merging method were you using before?

PaddleNLP has not open-sourced any model-merging technique so far. Contributions from developers are welcome!

Anthony-Sun-S commented on May 30, 2024

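I used BAAI's LM_Cocktail to merge bge models. Have you ever run into what I am seeing, where the program terminates abruptly? There is no error message: training runs normally and then the process suddenly ends. All I can see is a "pod failed" status, and the exit code is either -9 or -15. Because I never managed to solve this, I stopped training ERNIE for a while. With a small dataset there is no problem, but with Microsoft's mMarco it fails.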
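For readers decoding the exit codes mentioned above: in the convention used by Python's subprocess module, a negative exit code -N means the process was terminated by signal N. The sketch below assumes the pod reports codes in that convention; the helper name describe_exit_code is hypothetical and not part of PaddleNLP.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Return a readable description of a process exit code.

    Assumes the subprocess-style convention: a negative value -N
    means the process was terminated by signal N.
    """
    if code >= 0:
        return f"exited normally with status {code}"
    try:
        name = signal.Signals(-code).name
    except ValueError:
        name = f"unknown signal {-code}"
    return f"terminated by {name}"


# The two codes reported in this thread.
for code in (-9, -15):
    print(code, describe_exit_code(code))
```

Under that assumption, -9 is SIGKILL and -15 is SIGTERM. SIGKILL cannot be caught by the process and is what the Linux out-of-memory killer sends, which would fit a failure that only appears on large datasets. This is something to check, not a confirmed diagnosis.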

w5688414 commented on May 30, 2024

There is currently no development plan for model merging. You can set breakpoints with Python's pdb to debug, or provide a minimal reproducible example.

Anthony-Sun-S commented on May 30, 2024

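The code is almost unchanged. I only switched the learning-rate schedule to cosine:
lr_scheduler = CosineDecayWithWarmup(learning_rate=args.learning_rate, warmup=warmup_step, total_steps=num_training_steps, with_hard_restarts=True, num_cycles=100.0, last_epoch=-1, verbose=False)
However, I have also tried training without any modification, and the problem still appears when the dataset is large. The image is the paddlepaddle image pulled from Docker Hub.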
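To make the effect of with_hard_restarts=True and num_cycles=100.0 concrete, here is a pure-Python sketch of the cosine-with-hard-restarts formulation commonly used in transformer libraries. It is an assumption that PaddleNLP's CosineDecayWithWarmup follows the same formula; the function name and the step counts are hypothetical and chosen only for illustration.

```python
import math


def cosine_hard_restart_factor(step, warmup_steps, total_steps, num_cycles):
    """Multiplier applied to the base learning rate at a given step.

    Linear warmup from 0 to 1, then num_cycles cosine decays from 1
    towards 0, each one restarting at 1 (hard restarts).
    """
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    if progress >= 1.0:
        return 0.0
    cycle_position = (num_cycles * progress) % 1.0
    return max(0.0, 0.5 * (1.0 + math.cos(math.pi * cycle_position)))


# Hypothetical values: 100 warmup steps, then 100 cycles of 100 steps each.
warmup_steps, total_steps, num_cycles = 100, 10_100, 100.0
for step in (0, 50, 100, 150, 199):
    factor = cosine_hard_restart_factor(step, warmup_steps, total_steps, num_cycles)
    print(step, round(factor, 4))
```

A schedule of this kind only changes the learning-rate value at each step, so on its own it would not be expected to get the process killed. That matches the observation that the unmodified code fails on large datasets as well.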

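w5688414 commented on May 30, 2024

Please check whether your GPU memory or host memory is sufficient. If that is the cause, you can reduce batch_size or switch to a smaller, lightweight model.

If the problem persists, please provide minimal reproducible code and data so that we can locate the cause.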
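One way to act on the memory check suggested above is to log the process's peak host memory during training and see whether it climbs towards the pod's limit before the failure. The sketch below uses only the Python standard library and works on Unix systems; the helper names, the logging interval, and the stand-in loop are hypothetical and would need to be adapted to the real training script.

```python
import resource
import sys


def peak_rss_mib() -> float:
    """Peak resident set size of this process in MiB (Unix only).

    ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    """
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def log_memory(step: int, every: int = 100) -> None:
    """Print peak host memory every `every` steps."""
    if step % every == 0:
        print(f"step {step}: peak RSS {peak_rss_mib():.1f} MiB", flush=True)


# Stand-in for a training loop; the growing list imitates data held in memory.
held = []
for step in range(300):
    held.append(bytearray(64 * 1024))  # hypothetical per-step allocation
    log_memory(step)
```

This covers host RAM only. GPU memory has to be watched separately, for example with nvidia-smi while the job is running.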
