Comments (9)

jekbradbury commented on May 12, 2024

I don't think it does, but I also haven't run any comparison tests

from opennmt-py.

magic282 commented on May 12, 2024

@vene But with option -extra-shuffle, I guess things will be different.

nelson-liu commented on May 12, 2024

Anecdotally, I ran an informal comparison and it made almost no difference. As @vene said, my dataset was large enough and the batch size small enough that the majority of batches had no padding.
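
For anyone who wants to check this on their own data, here is a minimal sketch that estimates how many batches contain padding when sentences are sorted by length before batching. The function name `padded_batch_fraction` and the toy length distribution are hypothetical and not taken from OpenNMT-py; the real data loader may group sentences differently.

```python
def padded_batch_fraction(lengths, batch_size):
    """Fraction of batches that need padding after sorting by length.

    Hypothetical helper: assumes batches are consecutive slices of the
    length-sorted corpus, which may differ from the actual data loader.
    """
    ordered = sorted(lengths)
    # Cut the sorted lengths into consecutive batches.
    batches = [ordered[i:i + batch_size]
               for i in range(0, len(ordered), batch_size)]
    # A batch needs padding whenever its sentences differ in length.
    padded = sum(1 for batch in batches if min(batch) != max(batch))
    return padded / len(batches)


# Toy corpus: 64 sentences each of length 5, 6 and 7 (made-up numbers).
toy_lengths = [5] * 64 + [6] * 64 + [7] * 64

small_batches = padded_batch_fraction(toy_lengths, batch_size=16)
large_batches = padded_batch_fraction(toy_lengths, batch_size=48)
```

With a batch size of 16 every batch falls inside a single length bin, so nothing is padded. With a batch size of 48, two of the four batches straddle a bin boundary and do contain padding.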

donglixp commented on May 12, 2024

The mask is used in: https://github.com/OpenNMT/OpenNMT-py/blob/master/onmt/Translator.py#L130

magic282 commented on May 12, 2024

It seems that this mask is not applied during training.

donglixp commented on May 12, 2024

@magic282 There's no mask during training in the implementation. I'm not sure whether it would make a huge difference.

vene commented on May 12, 2024

I assumed that since the sentences are sorted by length, with small enough batches and large enough datasets, most training batches would contain no padding. Now I'm not sure anymore...

vene commented on May 12, 2024

Thanks for checking @nelson-liu, that makes sense!

I wonder if skipping the masking really saves a lot of time during training. With -extra-shuffle it indeed seems like this is a bug, as @magic282 points out. Even with sorted batches and a huge number of sentences in each length bin, there will be some unfortunate batches with one sentence of length d+1 and N-1 sentences of length d. For those batches the code does not correctly reflect the intended model.
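
To make the d versus d+1 case concrete, here is a minimal, framework-free sketch of what masking changes for a short sentence in a padded batch. The function `masked_softmax` and the score values are hypothetical; this is not the OpenNMT-py code, only an illustration of attention weights with and without a padding mask.

```python
import math


def masked_softmax(scores, length):
    """Softmax over the first `length` scores; padded slots get weight 0.

    Hypothetical helper for illustration. Assumes length >= 1.
    """
    # Replace padded positions with -inf so exp() maps them to exactly 0.
    masked = [s if i < length else float("-inf")
              for i, s in enumerate(scores)]
    peak = max(masked)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in masked]
    total = sum(exps)
    return [e / total for e in exps]


# Batch padded to d+1 = 4 source positions; this sentence has length d = 3,
# so the last score belongs to a padding token (values are made up).
scores = [0.5, 1.0, -0.2, 0.3]

with_mask = masked_softmax(scores, length=3)
without_mask = masked_softmax(scores, length=len(scores))  # no masking

# Attention mass that leaks onto the padding position without a mask.
leaked = without_mask[3]
```

Without the mask, the padded position receives a nonzero share of the attention, so the weights over the three real tokens no longer sum to one. With the mask, the padded position gets exactly zero weight.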

vince62s commented on May 12, 2024

Old thread. If someone is motivated to implement this, just reopen.
