Comments (7)
Set your warmup steps to 10 percent of the total number of iterations required. In my case 15,000 helped, but please check.
Also, please make sure you are sending the delimiter, i.e. [SEP], as an indicator to stop decoding, i.e.
labels_tgt = input_ids_tgt[1:]
input_ids_tgt = input_ids_tgt[:-1]
input_mask_src = [1] * len(input_ids_src)
input_mask_tgt = [1] * len(input_ids_tgt)
while creating the TFRecord file.
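For context, a minimal sketch of how that shift might fit into the TFRecord-writing step is below, assuming the source and target have already been converted to BERT token ids and that input_ids_tgt ends with the [SEP] id; the feature keys here mirror the variable names above but may differ from the repo's actual preprocessing script.

import tensorflow as tf

def _int64_feature(values):
    # Wrap a list of integer ids as a tf.train.Feature.
    return tf.train.Feature(int64_list=tf.train.Int64List(value=list(values)))

def write_example(writer, input_ids_src, input_ids_tgt):
    # input_ids_tgt is expected to end with [SEP], so the decoder learns to
    # emit [SEP] as its stop signal.
    labels_tgt = input_ids_tgt[1:]        # decoder targets: shifted left by one
    input_ids_tgt = input_ids_tgt[:-1]    # decoder inputs: drop the final token
    input_mask_src = [1] * len(input_ids_src)
    input_mask_tgt = [1] * len(input_ids_tgt)
    features = {
        'input_ids_src': _int64_feature(input_ids_src),
        'input_mask_src': _int64_feature(input_mask_src),
        'input_ids_tgt': _int64_feature(input_ids_tgt),
        'input_mask_tgt': _int64_feature(input_mask_tgt),
        'labels_tgt': _int64_feature(labels_tgt),
    }
    example = tf.train.Example(features=tf.train.Features(feature=features))
    writer.write(example.SerializeToString())

# Usage: writer = tf.io.TFRecordWriter('train.tfrecord'), then call
# write_example(...) once per (source, target) pair and close the writer.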
from abstractive-summarization-with-transfer-learning.
I'm running your code on the CNN/DailyMail dataset.
However, training never ends; it just keeps displaying:
Batch #X
with X growing larger and larger. I waited a long time and then killed the process.
But now, when I run the inference code, the produced summary is very bad. Example:
the two - year - year - year - old cate - old cat was found in the animal .
What did I do wrong? Has anyone in the same situation managed to fix it? (@Vibha111094)
I ran the inference code, but I don't know how to produce the summary.
Should I post the original story through Postman, so that it gives back a summary?
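In case the inference script exposes an HTTP endpoint (which is what using Postman implies), a minimal sketch of posting a story from Python is below; the URL, port, route, and JSON field names are assumptions, so check the inference/server script in the repo for the actual ones.

import requests

story = 'Full CNN/DailyMail article text goes here ...'

resp = requests.post(
    'http://localhost:5000/summarize',   # hypothetical host/port/route
    json={'text': story},                # hypothetical request field
    timeout=60,
)
resp.raise_for_status()
print(resp.json())                       # the summary is expected in the response body

Posting the same JSON body through Postman should behave identically.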
from abstractive-summarization-with-transfer-learning.
Set your warmup steps to 10 percent of the total number of iterations required. In my case 15,000 helped, but please check.
Where exactly can I set that?
from abstractive-summarization-with-transfer-learning.
In config.py you would have:
lr = {
    'learning_rate_schedule': 'constant.linear_warmup.rsqrt_decay.rsqrt_depth',
    'lr_constant': 2 * (hidden_dim ** -0.5),
    'static_lr': 1e-3,
    'warmup_steps': 10000,
}
You could increase warmup_steps to around 15,000-20,000.
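For intuition, here is a sketch of how a constant + linear-warmup + rsqrt-decay schedule of this kind is typically computed (as in Texar's Transformer example); increasing warmup_steps stretches the linear ramp-up and lowers the peak learning rate, which is why it should scale with the total number of training iterations.

import math

def get_lr(step, lr_config):
    # Linear warmup for the first `warmup_steps` steps, then reciprocal
    # square-root decay; `lr_constant` already folds in the depth scaling
    # (hidden_dim ** -0.5) from config.py.
    warmup_steps = lr_config['warmup_steps']
    return (lr_config['lr_constant']
            * min(1.0, step / warmup_steps)
            * (1.0 / math.sqrt(max(step, warmup_steps))))

# Example: with warmup_steps = 15000, the peak learning rate is reached at
# step 15000 and decays roughly as 1/sqrt(step) afterwards.
# lr_config = {'lr_constant': 2 * (768 ** -0.5), 'warmup_steps': 15000}
# print(get_lr(15000, lr_config))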
from abstractive-summarization-with-transfer-learning.
When I set low numbers (steps = 10, warmup steps = 10, max eval = 10), the iteration count for epoch 0 still goes past 150. Could you help clarify how those numbers are interlinked?
from abstractive-summarization-with-transfer-learning.
Set your warmup steps to 10 percent of the total number of iterations required. In my case 15,000 helped, but please check.
Also, please make sure you are sending the delimiter, i.e. [SEP], as an indicator to stop decoding, i.e.
labels_tgt = input_ids_tgt[1:]
input_ids_tgt = input_ids_tgt[:-1]
input_mask_src = [1] * len(input_ids_src)
input_mask_tgt = [1] * len(input_ids_tgt)
while creating the TFRecord file.
Hello, I adopted the default settings and obtained ROUGE-1/2/L of 39.29/17.30/27.10; the ROUGE-L result in particular is poor. I trained on one GPU for 3 days, about 170,000 steps in total, with batch size = 32.
Could you share your results on the CNN/DailyMail dataset, or do you know what is wrong?
Many thanks! @Vibha111094
from abstractive-summarization-with-transfer-learning.
I am following the default settings, but after the second epoch training is taking too long. Does anyone else face the same problem?
from abstractive-summarization-with-transfer-learning.
Related Issues (20)
- ValueError during the init of pretrained BERT HOT 4
- How can I get an abstract quickly?
- Taking way too long for Training HOT 2
- Is there an error inside the _eval_epoch function? HOT 6
- Facing memory exhausted while running inference HOT 12
- The generated summary has always been one, without any change? HOT 1
- ImportError: cannot import name 'gfile' from 'tensorflow' HOT 1
- Can you make a demo data of this file ?
- The Result on CNN and Daily Mail HOT 1
- AssertionError: model name:bert/encoder/layer_0/ffn/intermediate/bias not exists! HOT 1
- NameError: name 'bert_pretrain_dir' is not defined
- batch size problem HOT 2
- Getting error module 'texar_repo.examples.bert.utils.model_utils' has no attribute 'transform_bert_to_texar_config'
- Requirements file missing HOT 1
- Hi, Can i use your code for Chinese task? HOT 1
- Can't load save_path when it is None.
- ValueError: Dimensions must be equal, but are 768 and 512 for 'bert/transformer_encoder_1/layer_0/add' HOT 2
- got an unexpected keyword argument 'embedding'
- Setup error
- ValueError: Unknown hyperparameter: position_embedder_type. Only hyperparameters named 'kwargs' hyperparameters can contain new entries undefined in default hyperparameters.