Unable to predict gap tags (openkiwi, closed)

unbabel commented on September 5, 2024
Unable to predict gap tags

from openkiwi.

Comments (5)

captainvera commented on September 5, 2024

Hey @zouharvi,
I've managed to reproduce your issue. Let me take a deeper look into this and I'll get back to you!

captainvera commented on September 5, 2024

I've found the cause of this bug. Unfortunately, the current version of OpenKiwi does not support learning gap tags and target tags at the same time with the NuQE model.

The crash itself happens because, in the forward pass of NuQE, we define model_out as:

        outputs = OrderedDict()

        if self.config.predict_target:
            outputs[const.TARGET_TAGS] = h
        if self.config.predict_gaps:
            outputs[const.GAP_TAGS] = h
        if self.config.predict_source:
            outputs[const.SOURCE_TAGS] = h

        return outputs

We assign the same hidden state to every output that is enabled. Then, when calculating the loss (where your error message comes from), we do the following:

        if self.config.predict_source:
            output_name = const.SOURCE_TAGS
        elif self.config.predict_gaps:
            output_name = const.GAP_TAGS
        else:
            output_name = const.TARGET_TAGS

Again, this assumes there is only one type of output.
This means that if you want to predict both word and gap tags, you'll have to train two separate models.
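
To make the behaviour concrete, here is a minimal, self-contained sketch that mirrors the two snippets above without importing OpenKiwi. The `Config` class, the constant values, and the function names are assumptions made for illustration, not the library's actual API. It only shows that when target and gap prediction are both enabled, both outputs point at the same hidden state and the loss looks at just one of them.

```python
from collections import OrderedDict
from dataclasses import dataclass

# Hypothetical stand-ins for OpenKiwi's const.* names (values assumed).
TARGET_TAGS, GAP_TAGS, SOURCE_TAGS = "tags", "gap_tags", "source_tags"


@dataclass
class Config:
    """Hypothetical config holding only the three flags discussed above."""
    predict_target: bool = True
    predict_gaps: bool = False
    predict_source: bool = False


def forward_outputs(config, h):
    """Mirror of the forward-pass snippet: every enabled output gets the same h."""
    outputs = OrderedDict()
    if config.predict_target:
        outputs[TARGET_TAGS] = h
    if config.predict_gaps:
        outputs[GAP_TAGS] = h
    if config.predict_source:
        outputs[SOURCE_TAGS] = h
    return outputs


def loss_output_name(config):
    """Mirror of the loss snippet: exactly one output name is ever selected."""
    if config.predict_source:
        return SOURCE_TAGS
    elif config.predict_gaps:
        return GAP_TAGS
    return TARGET_TAGS


config = Config(predict_target=True, predict_gaps=True)
h = object()  # placeholder for the hidden-state tensor
outputs = forward_outputs(config, h)

print(list(outputs))                               # both outputs are present
print(outputs[TARGET_TAGS] is outputs[GAP_TAGS])   # they share one hidden state
print(loss_output_name(config))                    # only the gap output is scored
```

With both flags on, the target-tag output is never scored, which is consistent with the advice to train one model per tag type.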

However, the good news is that we are planning to make public several changes we've made to the OpenKiwi architecture. They include support for this, among many other things such as multi-threaded data loading and multi-GPU training.
We will open an issue soon™️ to detail these changes and a tentative release date.

zouharvi commented on September 5, 2024

Hi @captainvera, thank you for your response.

Can you give us a hint on how to train the gap-only model? When we drop every (2n+1)th tag (leaving only the tags for gaps), the dimension error is still there (this time the difference is 1024). The config contains predict-gaps: true and predict-target: false.

ValueError: Expected input batch_size (2112) to match target batch_size (1088).

In your 2019 paper (Table 1, WMT18 benchmarks), did you also have two separate models for MT and gaps? We are aiming at replicating your results, including training the models.

captainvera commented on September 5, 2024

Kiwi automatically knows which tags are gap tags. Just setting predict-target: false and wmt18-format: true should be enough!
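
For readers wondering what "knowing which tags are gap tags" amounts to, here is a rough sketch. It assumes the WMT18 layout, in which a target sentence of N tokens carries 2N+1 tags that alternate gap, word, gap, ..., gap. The function name `split_wmt18_tags` is made up for this example and is not OpenKiwi's own code.

```python
def split_wmt18_tags(tags):
    """Split an interleaved WMT18-style tag sequence into (gap_tags, word_tags).

    Assumes 2N+1 tags for N target tokens, starting and ending with a gap tag.
    """
    if len(tags) % 2 == 0:
        raise ValueError(
            f"expected an odd number of tags (2N+1), got {len(tags)}"
        )
    gap_tags = tags[0::2]   # 0-based even positions: N+1 gap tags
    word_tags = tags[1::2]  # 0-based odd positions: N word tags
    return gap_tags, word_tags


# Three target tokens, so 7 interleaved tags.
line = "OK OK OK BAD OK OK BAD"
gap_tags, word_tags = split_wmt18_tags(line.split())

print(gap_tags)   # ['OK', 'OK', 'OK', 'BAD']
print(word_tags)  # ['OK', 'BAD', 'OK']
```

Under this assumption the tags file can stay in its original interleaved form, so there should be no need to strip tags by hand before training a gap-only model.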

If you are talking about the OpenKiwi paper, then yes. In our submission to the QE shared task, we were already using a slightly different version of Kiwi in which we implemented target + gaps training.

zouharvi commented on September 5, 2024

Thank you, predicting gaps now works as expected. We're looking forward to the upcoming changes.
