santhoshkolloju / Abstractive-Summarization-With-Transfer-Learning
Abstractive summarisation using BERT as the encoder and a Transformer decoder
Hello, I got this error. How can I fix it?
```
Traceback (most recent call last):
  File "preprocess.py", line 5, in <module>
    from config import *
  File "E:\project\Abstractive-Summarization-With-Transfer-Learning\config.py", line 1, in <module>
    import texar as tx
  File "texar_repo\texar\__init__.py", line 24, in <module>
    from texar.module_base import *
  File "texar_repo\texar\module_base.py", line 26, in <module>
    from texar.utils.exceptions import TexarError
  File "texar_repo\texar\utils\__init__.py", line 31, in <module>
    from texar.utils.utils_io import *
  File "texar_repo\texar\utils\utils_io.py", line 32, in <module>
    from tensorflow import gfile
ImportError: cannot import name 'gfile' from 'tensorflow' (C:\Users\Admin\Anaconda3\lib\site-packages\tensorflow\__init__.py)
```

Followed by this error:

```
  File "texar_repo\texar\core\layers.py", line 628, in <module>
    class _ReducePooling1D(tf.layers.Layer):
AttributeError: module 'tensorflow.python.layers.layers' has no attribute 'Layer'
```
What should I do? Thanks!
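(For what it's worth, both tracebacks look like a TensorFlow version mismatch: `tf.gfile` and `tf.layers.Layer` exist in TF 1.x but were removed in 2.x, which texar 0.2.x predates. A minimal check, assuming that diagnosis:)

```python
# Quick diagnosis sketch: confirm which TensorFlow is installed.
import tensorflow as tf

print(tf.__version__)  # texar 0.2.x was written against TF 1.x

# If this prints a 2.x version, downgrading in the active environment
# is the simplest workaround (the exact pin is an assumption):
#   pip install "tensorflow==1.15.*"
```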
Is there an error inside the _eval_epoch function?
After debugging, I found that this function enters an endless loop: nothing breaks out of it, and the batch-by-batch data acquisition never seems to be exhausted. Is there any mistake in my setup?
Looking forward to your reply!
The summarization network is not optimised, so the loss stays high and does not decrease much.
Hi. While executing model.py I am getting the following error at line 109:

```
AssertionError: model name:bert/encoder/layer_0/ffn/intermediate/bias not exists!
```

I am stuck here. What should I do to remove this error?
I also added reuse=tf.AUTO_REUSE to the tf.variable_scope at line 75, to remove this earlier error:

```
ValueError: Variable bert/word_embeddings/w already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope? Originally defined at:
```

I mention this in case it has anything to do with the AssertionError.
Please help me solve this issue. I am stuck here.
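(A debugging sketch for this kind of AssertionError, assuming the standard downloaded BERT checkpoint: list the variable names actually stored in it and compare them against the name the assertion complains about. The path below is a placeholder.)

```python
import tensorflow as tf

# Placeholder path to the downloaded BERT checkpoint.
ckpt_path = 'uncased_L-12_H-768_A-12/bert_model.ckpt'

# NewCheckpointReader exposes every variable saved in the checkpoint,
# showing whether 'bert/encoder/layer_0/ffn/intermediate/bias' is
# really absent or just stored under a different name.
reader = tf.train.NewCheckpointReader(ckpt_path)
for name in sorted(reader.get_variable_to_shape_map()):
    print(name)
```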
Can you add a requirements.txt file for this project? I am getting various issues related to incompatible versions of different modules.
I've partially trained the model, but when I went to test it and ran Inference.py, with a static story and summary in the script, TensorFlow raised an out-of-memory error:

```
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[512,10,50,512] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
```

I am using tensorflow 1.10.0 and texar 0.2.1.
Please help.
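(One mitigation sketch, assuming the TF 1.x sessions this repo uses: stop TensorFlow from pre-allocating the whole GPU, and shrink the inference batch size or beam width if the OOM persists, since the offending tensor's shape scales with both.)

```python
import tensorflow as tf

# allow_growth makes the allocator claim GPU memory on demand instead
# of grabbing everything up front, which sometimes avoids OOMs.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True

with tf.Session(config=config) as sess:
    ...  # run inference here, ideally with a smaller batch / beam
```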
I am trying to train with 10k documents, plus an additional 1k documents for the eval cycle.
Even for this small number of documents, it is projecting around 4 days of training time on a Tesla M60 GPU.
I changed the config to 10 docs per step, with max steps of 10,000 for 10 epochs. It takes around 34 seconds per step, which gives around 4 days of training time.
Am I doing something wrong?
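(For what it's worth, the numbers are internally consistent: 10,000 steps × 34 s/step = 340,000 s ≈ 3.9 days, so the projection follows directly from the per-step time; the lever to pull is the seconds per step or the number of steps, not the estimate itself.)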
ImportError: DLL load failed while importing _pywrap_tensorflow_internal: The specified module could not be found.
This happens when I try running preprocess.py.
In model.py, line 114:

```python
decoder = tx.modules.TransformerDecoder(embedding=tgt_embedding,
                                        hparams=dcoder_config)
```

I am getting a TypeError that an unexpected keyword argument `embedding` was passed to TransformerDecoder. How did people resolve this? I see that TransformerDecoder takes (vocab_size=None, output_layer=None, hparams=None), so I'm not sure what the embedding refers to here.
Any guidance would be appreciated.
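(A sketch of the later texar-TF API, assuming the installed texar is newer than the repo expects: the `embedding` argument was dropped in favor of `vocab_size` plus an `output_layer` that can be tied to the embedding matrix. The names `tgt_embedding`, `vocab_size`, and `dcoder_config` are taken from the surrounding code, so treat this as a hypothesis, not the repo's official fix.)

```python
import tensorflow as tf
import texar as tx

# Hypothetical adaptation for newer texar releases: tie the output
# projection to the (transposed) target embedding matrix.
decoder = tx.modules.TransformerDecoder(
    vocab_size=vocab_size,
    output_layer=tf.transpose(tgt_embedding, (1, 0)),
    hparams=dcoder_config)
```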
How did you get the data txt files?
How did you process them? Did you tokenize them?
Hello, thanks for providing the Transformer-based s2s models for abstractive text summarization; it helps me a lot.
I ran it on the CNN/Daily Mail dataset and obtained these results:
ROUGE-1/2/L: 39.29/17.30/27.10
I adopted the default settings but find the results far from those reported in previous work. For example (ROUGE-1/2/L):
In "Text Summarization with Pretrained Encoders": TransformerABS - 40.21; 17.76; 37.09
In fact, the ROUGE-L result is terrible compared with the others, so I suspect I made some mistake during training. I trained on 1 GPU for 3 days, 170,000 steps in total with batch size = 32.
Has anyone obtained these results on CNN/Daily Mail, or does anyone know what went wrong during training?
Many thanks!
Thank you. I used your code to migrate to a Chinese dataset, but there is a problem in the prediction phase: the generated summary is always the same one, without any variation. How can I solve this problem? (Translated examples:)

Reference abstract: Central bank: accelerate interest-rate liberalization and establish a deposit insurance system
Generated abstract: Jinan public rental housing offers free trial on its first day
Reference abstract: Tianjin free-trade zone awaits final approval; will become a Beijing-Tianjin-Hebei integration pilot zone
Generated abstract: Jinan public rental housing offers free trial on its first day
Reference abstract: Korean media: for every 10% drop in oil prices, China's GDP grows 0.15%
Generated abstract: Jinan public rental housing offers free trial on its first day
I observed that with the CNN/Daily Mail dataset, many of the articles are longer than 512 words.
So although I have trained for over 200,000 steps, the BLEU score seems to be around 0.4.
Any ideas on how to overcome this problem?
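(One common workaround, sketched under the assumption that the limit is BERT's 512-token maximum: truncate each article before adding the special tokens, so nothing beyond position 512 is fed in.)

```python
# Minimal truncation sketch; max_seq_length is assumed to be 512.
max_seq_length = 512
tokens = tokens[:max_seq_length - 2]   # reserve room for the specials
tokens = ['[CLS]'] + tokens + ['[SEP]']
```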
I'm running your code on the CNN/Daily Mail dataset.
However, training never ends; it keeps displaying:
Batch #X
with X growing and growing. I waited a long time, then killed the process.
But now, when I run the inference code, the produced summary is very bad. Example:
the two - year - year - year - old cate - old cat was found in the animal .
What did I do wrong? Has anyone in the same situation managed to fix the code? (@Vibha111094)
The earlier README had good code showing how to see the verbal output.
It would be really helpful if you could include it again.
Getting a 500 error when using Postman on /results.

```
/preprocess.py", line 170, in convert_single_example
    tokens_a = tokenizer.tokenize(example.src_txt)
AttributeError: 'dict' object has no attribute 'src_txt'
```

Is this because I'm using Python 3?
A little help here, please.
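(A hedged guess at a fix: the traceback says `example` is a plain dict in this code path, so key lookup should work where attribute access fails; that the key is spelled `'src_txt'` like the attribute is an assumption.)

```python
# Hypothetical one-line patch inside convert_single_example:
tokens_a = tokenizer.tokenize(example['src_txt'])
```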
What specific value should be given to test_batch_size? Could anyone suggest one?
@santhoshkolloju Hi, I'm using your code to train on my own data, but I find that the BLEU score in your code is multiplied by 100, and I am wondering why. Could you give me some clue about this? Thanks.
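(For context: BLEU is conventionally reported on a 0-100 scale rather than 0-1, so multiplying the raw fraction by 100 simply matches how most papers and toolkits quote the score.)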
I cloned the code and ran main.py without any code changes, and got:

```
ValueError: Unknown hyperparameter: position_embedder_type. Only hyperparameters named 'kwargs' hyperparameters can contain new entries undefined in default hyperparameters.
```

raised from model.py, line 90:

```python
encoder = tx.modules.TransformerEncoder(hparams=bert_config.encoder)
```

What's wrong? What should I do next?
Thanks.
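(This reads like a texar version mismatch: the generated `bert_config.encoder` contains an hparam, `position_embedder_type`, that the installed texar release does not define, so running against the texar version vendored in texar_repo rather than a separately installed one may resolve it. That is an inference from the error text, not a verified fix.)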
-train_story.txt
-train_summ.txt
-eval_story.txt
-eval_summ.txt
I tried to run the code with a batch size of 8, however I got this error:

```
InvalidArgumentError (see above for traceback): Incompatible shapes at component 0: expected [1,512] but got [8,512].
```
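(A hedged reading: one of the dataset configs, most likely the eval or test one, still pins `batch_size` to 1 while the training pipeline now emits batches of 8; making the `batch_size` fields consistent across the data configs should remove the shape mismatch.)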
I am getting a module-has-no-attribute error while running:

```python
bert_config = model_utils.transform_bert_to_texar_config(
    os.path.join(bert_pretrain_dir, 'bert_config.json'))
```

```
AttributeError: module 'texar_repo.examples.bert.utils.model_utils' has no attribute 'transform_bert_to_texar_config'
```

Please help.
I got this error when I ran main.py:

```
  File "main.py", line 146, in <module>
    step = _train_epoch(sess, epoch, step, smry_writer)
  File "main.py", line 55, in _train_epoch
    _eval_epoch(sess, epoch, mode='eval')
  File "main.py", line 66, in _eval_epoch
    'inferred_ids': inferred_ids,
NameError: name 'inferred_ids' is not defined
```
What additional changes are required in the code so that I can see the actual summary produced for a given paragraph?
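(A minimal decoding sketch, with `inferred_ids`, `feed_dict`, `sess`, and the BERT `tokenizer` assumed from the surrounding code: fetch the predicted ids and map them back to wordpieces to read the generated summary.)

```python
# Hypothetical snippet: turn predicted token ids into readable text.
hyps = sess.run(inferred_ids, feed_dict=feed_dict)
for ids in hyps:
    tokens = tokenizer.convert_ids_to_tokens(ids.tolist())
    text = ' '.join(tokens).replace(' ##', '')  # undo wordpiece splits
    print(text)
```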
```python
# Creates segment embeddings for each type of tokens.
segment_embedder = tx.modules.WordEmbedder(
    vocab_size=bert_config.type_vocab_size,
    hparams=bert_config.segment_embed)
segment_embeds = segment_embedder(src_segment_ids)

input_embeds = word_embeds + segment_embeds
```

As per the BERT paper, the input embeddings are the sum of the token embedding lookup, the segment embedding, and the position embedding. As we can see in `input_embeds = word_embeds + segment_embeds`, the position embedding is missing.
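(A sketch of adding the missing term, modeled on the texar BERT example; the `position_size`/`position_embed` config fields and the `batch_size`/`src_input_ids` tensors are assumptions carried over from that example, not this repo's exact code.)

```python
# Hypothetical completion: add position embeddings so the input is
# the word + segment + position sum described in the BERT paper.
position_embedder = tx.modules.PositionEmbedder(
    position_size=bert_config.position_size,
    hparams=bert_config.position_embed)

seq_length = tf.ones([batch_size], tf.int32) * tf.shape(src_input_ids)[1]
pos_embeds = position_embedder(sequence_length=seq_length)

input_embeds = word_embeds + segment_embeds + pos_embeds
```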
I was training the model and it was taking longer than expected, so I killed the process.
However, when I now run inference.py, I get an error:

```
Traceback (most recent call last):
  File "inference.py", line 92, in <module>
    saver.restore(sess, tf.train.latest_checkpoint(model_dir))
  File "C:\Users\AKHIL\Anaconda3\lib\site-packages\tensorflow_core\python\training\saver.py", line 1277, in restore
    raise ValueError("Can't load save_path when it is None.")
ValueError: Can't load save_path when it is None.
```

Can anyone please look into it?
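(A defensive sketch: `tf.train.latest_checkpoint` returns None when `model_dir` contains no checkpoint files, and passing None to `Saver.restore` raises exactly this ValueError, so killing training before the first checkpoint was saved would explain it.)

```python
# Guard against an empty model_dir before restoring.
ckpt = tf.train.latest_checkpoint(model_dir)
if ckpt is None:
    raise RuntimeError('No checkpoint found in %s; train until at '
                       'least one checkpoint is saved.' % model_dir)
saver.restore(sess, ckpt)
```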
Hello!
I tried your code in Google Colab and encountered a problem I wasn't able to solve.
During the initialization of the BERT encoder in your notebook:
https://github.com/santhoshkolloju/Abstractive-Summarization-With-Transfer-Learning/blob/master/BERT_SUMM.ipynb
in cell 15, the following error occurs:

```
ValueError                                Traceback (most recent call last)
     35 init_checkpoint = os.path.join(bert_pretrain_dir, 'bert_model.ckpt')
     36 #init_checkpoint = "gs://cloud-tpu-checkpoints/bert/uncased_L-12_H-768_A-12/bert_model.ckpt"
---> 37 model_utils.init_bert_checkpoint(init_checkpoint)

/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/checkpoint_utils.py in _init_from_checkpoint(ckpt_dir_or_file, assignment_map)
    344         "Assignment map with scope only name {} should map to scope only "
    345         "{}. Should be 'scope/': 'other_scope/'.".format(
--> 346             scopes, tensor_name_in_ckpt))
    347   # If scope to scope mapping was provided, find all variables in the scope
    348   # and create variable to variable mapping.

ValueError: Assignment map with scope only name bert/position_embeddings should map to scope only bert/embeddings/position_embeddings. Should be 'scope/': 'other_scope/'.
```

I already checked the texar GitHub repo and found this post:
asyml/texar#127
Basically, the code for the encoder and decoder changed in the newer version of texar, but I don't know how to adjust the code.
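(Per the linked issue, the embedding variable scopes were renamed between texar releases, so the assignment map built by `init_bert_checkpoint` no longer lines up with the checkpoint; pinning texar to the version bundled in texar_repo is probably the path of least resistance, though that is an inference rather than a tested fix.)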
I observed that although I use

```python
feed_dict = {
    iterator.handle: iterator.get_handle(sess, 'eval'),
    tx.global_mode(): tf.estimator.ModeKeys.EVAL,
}
```

in the _eval_epoch method, it uses a few examples from the train dataset as well.
Is this the desired behavior, given that we use FeedableDataIterator, which is supposed to iterate through multiple datasets and switch between them?
If so, could you please explain why such behavior is necessary.
```python
#tx.utils.maybe_create_dir(model_dir)
#logging_file = os.path.join(model_dir, 'logging.txt')
model_dir = "gs://bert_summ/models/"  # uncased_L-12_H-768_A-12/bert_model.ckpt
logging_file = "logging.txt"
logger = utils.get_logger(logging_file)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(tf.local_variables_initializer())
    sess.run(tf.tables_initializer())

    smry_writer = tf.summary.FileWriter(model_dir, graph=sess.graph)

    if run_mode == 'train_and_evaluate':
        logger.info('Begin running with train_and_evaluate mode')
        if tf.train.latest_checkpoint(model_dir) is not None:
            logger.info('Restore latest checkpoint in %s' % model_dir)
            saver.restore(sess, tf.train.latest_checkpoint(model_dir))
        iterator.initialize_dataset(sess)
        step = 5000
        for epoch in range(max_train_epoch):
            iterator.restart_dataset(sess, 'train')
            step = _train_epoch(sess, epoch, step, smry_writer)

    elif run_mode == 'test':
        logger.info('Begin running with test mode')
        logger.info('Restore latest checkpoint in %s' % model_dir)
        saver.restore(sess, tf.train.latest_checkpoint(model_dir))
        _eval_epoch(sess, 0, mode='test')

    else:
        raise ValueError('Unknown mode: {}'.format(run_mode))
```
Running it produces:

```
PermissionDeniedError                     Traceback (most recent call last)
     10 sess.run(tf.tables_initializer())
     11
---> 12 smry_writer = tf.summary.FileWriter(model_dir, graph=sess.graph)

~/anaconda3/envs/tf-1.8/lib/python3.6/site-packages/tensorflow/python/summary/writer/writer.py in __init__(self, logdir, graph, max_queue, flush_secs, graph_def, filename_suffix)
    350
    351     event_writer = EventFileWriter(logdir, max_queue, flush_secs,
--> 352                                    filename_suffix)
    353     super(FileWriter, self).__init__(event_writer, graph, graph_def)

~/anaconda3/envs/tf-1.8/lib/python3.6/site-packages/tensorflow/python/summary/writer/event_file_writer.py in __init__(self, logdir, max_queue, flush_secs, filename_suffix)
     65     self._logdir = logdir
     66     if not gfile.IsDirectory(self._logdir):
---> 67       gfile.MakeDirs(self._logdir)
     68     self._event_queue = six.moves.queue.Queue(max_queue)
     69     self._ev_writer = pywrap_tensorflow.EventsWriter(

~/anaconda3/envs/tf-1.8/lib/python3.6/site-packages/tensorflow/python/lib/io/file_io.py in recursive_create_dir(dirname)
    372   """
    373   with errors.raise_exception_on_not_ok_status() as status:
--> 374     pywrap_tensorflow.RecursivelyCreateDir(compat.as_bytes(dirname), status)

~/anaconda3/envs/tf-1.8/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py in __exit__(self, type_arg, value_arg, traceback_arg)
    517           None, None,
    518           compat.as_text(c_api.TF_Message(self.status.status)),
--> 519           c_api.TF_GetCode(self.status.status))

PermissionDeniedError: Error executing an HTTP request (HTTP response code 401, error code 0, error message ''), response '{
  "error": {
    "errors": [
      {
        "domain": "global",
        "reason": "required",
        "message": "Anonymous caller does not have storage.objects.get access to bert_summ/models/.",
        "locationType": "header",
        "location": "Authorization"
      }
    ],
    "code": 401,
    "message": "Anonymous caller does not have storage.objects.get access to bert_summ/models/."
  }
}
' when reading metadata of gs://bert_summ/models/
```
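(A hedged workaround: the 401 means the session has no Google Cloud credentials for the `gs://` bucket, which belongs to the author. Either authenticate the environment or point `model_dir` at a local path; the values below are illustrative.)

```python
# Option 1: write summaries/checkpoints to a local directory instead
# of the author's private gs:// bucket.
model_dir = './models/'

# Option 2 (shell): provide GCS credentials before launching, e.g.
#   export GOOGLE_APPLICATION_CREDENTIALS=/path/to/key.json
```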
How can I generate a summary quickly? Right now it takes a long time and a lot of CPU resources. Is there a faster solution?
I really like what you've done here.
I have a BERT model fine-tuned for NER and would like to implement it using your architecture here.
My intention is to bypass the fine-tuning section where you use stories and directly use my fine-tuned model in its place.
Do you have any tips?
Can you provide the train.tf_record file?

```
NotFoundError: Error executing an HTTP request: HTTP response code 404 with body '{ "error": { "errors": [ { "domain": "global", "reason": "notFound", "message": "No such object: bert_summarization/train.tf_record" } ], "code": 404, "message": "No such object: bert_summarization/train.tf_record" } } ' when reading metadata of gs://bert_summarization/train.tf_record
  [[node IteratorGetNext_1 (defined at texar_repo/texar/data/data/data_iterators.py:401) ]]
```
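(The `gs://bert_summarization/` bucket is the author's; `train.tf_record` is the output of the preprocessing step, so running preprocess.py on your own data and pointing the data paths at the locally generated file should avoid the remote lookup entirely. This is inferred from the repo layout, not a confirmed answer.)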
Creating a requirements.txt file might help users with dependencies :)
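(An unofficial guess at a pin set, based only on versions mentioned in these issues; a starting point, not a tested requirements.txt:)

```
tensorflow-gpu==1.10.0
texar==0.2.1
```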
```
ValueError: Dimensions must be equal, but are 768 and 512 for 'bert/transformer_encoder_1/layer_0/add' (op: 'AddV2') with input shapes: [?,?,768], [?,?,512].
```

Can you please help me resolve this dimension error? Thanks.
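(768 is BERT-base's hidden size and 512 is the texar Transformer default, so this usually means one of the encoder/decoder configs was left at dim 512 while the BERT side emits 768-dimensional outputs; setting the model dimension to 768 throughout should reconcile the shapes, assuming that is the mismatch here.)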
Thanks for sharing such a helpful repo!
I want to double-check with the author about the initialization part.
According to my understanding, the Encoder is initialized with pre-trained BERT, and the Decoder is initialized from scratch.
@santhoshkolloju When you train this model, do you use BERT embeddings for the abstract as well, i.e. do both the article and the abstract go through BERT while training this summarization model?