
Comments (7)

Vibha111094 commented on May 27, 2024

Set your warmup steps to 10 percent of the total number of training iterations. In my case 15,000 helped, but please check what works for your setup.
Also make sure you include the delimiter, i.e. [SEP], in the target as the signal to stop decoding. That is, when creating the TFRecord:
labels_tgt = input_ids_tgt[1:]
input_ids_tgt = input_ids_tgt[:-1]
input_mask_src = [1] * len(input_ids_src)
input_mask_tgt = [1] * len(input_ids_tgt)
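
The slicing above is the usual teacher-forcing shift: the decoder input drops the last token and the labels drop the first, so the final label is [SEP] and the model can learn when to stop. A minimal sketch of that step, assuming the target sequence is already wrapped as [CLS] ... [SEP]. The helper name build_target_features and the word token ids are illustrative and not taken from the repository; 101 and 102 are the [CLS] and [SEP] ids in the standard BERT uncased vocabulary.

```python
CLS_ID, SEP_ID = 101, 102  # [CLS]/[SEP] ids in the standard BERT uncased vocab


def build_target_features(input_ids_src, input_ids_tgt):
    """Shift the target by one position for teacher forcing.

    Hypothetical helper: input_ids_tgt is assumed to be
    [CLS] + summary tokens + [SEP].
    """
    labels_tgt = input_ids_tgt[1:]      # what the decoder must predict; ends with [SEP]
    decoder_input = input_ids_tgt[:-1]  # what the decoder is fed; starts with [CLS]
    return {
        "input_ids_src": input_ids_src,
        "input_mask_src": [1] * len(input_ids_src),
        "input_ids_tgt": decoder_input,
        "input_mask_tgt": [1] * len(decoder_input),
        "labels_tgt": labels_tgt,
    }


# Made-up ids for a short source and summary.
features = build_target_features(
    input_ids_src=[CLS_ID, 2023, 2003, 1037, 2466, SEP_ID],
    input_ids_tgt=[CLS_ID, 1037, 12654, SEP_ID],
)
print(features["input_ids_tgt"])  # [101, 1037, 12654]
print(features["labels_tgt"])     # [1037, 12654, 102]
```

If [SEP] is missing from the target before the shift, the labels never contain a stop token, which is one possible cause of repetitive, never-ending output at inference time.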

from abstractive-summarization-with-transfer-learning.

ishurironaldinho commented on May 27, 2024

I'm running your code on the CNN/Dailymail dataset.

However, training never ends. It keeps displaying:

Batch #X

with X growing without bound. I waited a long time, then killed the process.

But now, when I run the inference code, the produced summary is very bad. Example:

the two - year - year - year - old cate - old cat was found in the animal .

What did I do wrong? Has anyone in the same situation managed to fix the code? (@Vibha111094)

I ran the inference code, but I don't know how to produce the summary.

Should I post the original story through Postman so that it returns a summary?


thatianafernandes commented on May 27, 2024

Set your warmup steps to 10 percent of the total number of training iterations. In my case 15,000 helped, but please check what works for your setup.

Where exactly can I set that?


Vibha111094 commented on May 27, 2024

In config.py you would have:
lr = {
    'learning_rate_schedule': 'constant.linear_warmup.rsqrt_decay.rsqrt_depth',
    'lr_constant': 2 * (hidden_dim ** -0.5),
    'static_lr': 1e-3,
    'warmup_steps': 10000,
}
You could increase warmup_steps to around 15000-20000.
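
To see what changing warmup_steps does, here is a rough sketch of a schedule with linear warmup followed by inverse-square-root decay, which is what the schedule name suggests. It is an assumption, not the repository's actual code: the function names sketch_lr and warmup_from_total are hypothetical, hidden_dim = 768 assumes a BERT-base sized model, and the depth factor is assumed to be already folded into lr_constant as in the config shown.

```python
import math

hidden_dim = 768  # assumption: BERT-base hidden size; use your model's value

lr_config = {
    "lr_constant": 2 * (hidden_dim ** -0.5),
    "warmup_steps": 10000,
}


def warmup_from_total(total_steps, fraction=0.1):
    """Warmup length as a fraction of the planned training steps."""
    return round(total_steps * fraction)


def sketch_lr(step, config):
    """Linear warmup to a peak, then inverse-square-root decay (a sketch)."""
    warmup = config["warmup_steps"]
    step = max(step, 1)  # avoid a zero learning rate at step 0
    return (config["lr_constant"]
            * min(1.0, step / warmup)
            / math.sqrt(max(step, warmup)))


# Apply the "10 percent of total iterations" advice to a 150,000-step run.
lr_config["warmup_steps"] = warmup_from_total(150000)

for step in (1000, 15000, 60000):
    print(step, sketch_lr(step, lr_config))
```

Under these assumptions the learning rate peaks at step == warmup_steps, and a longer warmup both delays and lowers that peak, which may be why a larger value helped stabilize training here.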


mishrachinmaya689 commented on May 27, 2024

When I set low values (steps = 10, warmup steps = 10, max eval = 10), the iteration count still goes past 150 for epoch 0. Could you help clarify how these numbers are interlinked?


xieyxclack commented on May 27, 2024

Set your warmup steps to 10 percent of the total number of training iterations. In my case 15,000 helped, but please check what works for your setup.
Also make sure you include the delimiter, i.e. [SEP], in the target as the signal to stop decoding. That is, when creating the TFRecord:
labels_tgt = input_ids_tgt[1:]
input_ids_tgt = input_ids_tgt[:-1]
input_mask_src = [1] * len(input_ids_src)
input_mask_tgt = [1] * len(input_ids_tgt)

Hello, I adopted the default settings and obtained ROUGE-1/2/L of 39.29/17.30/27.10. The ROUGE-L result in particular is terrible. I trained on 1 GPU for 3 days, 170,000 (17w) steps in total with batch size = 32.
Could you share your results on the CNN/Dailymail dataset, or do you know what is wrong?
Many thanks! @Vibha111094


Shanzaay commented on May 27, 2024

I am following the default settings, but after the second epoch training is taking too long. Has anyone else faced the same problem?

