santhoshkolloju / Abstractive-Summarization-With-Transfer-Learning
Abstractive summarisation using BERT as the encoder and a Transformer decoder
Hello, I got this error. How can I fix it?
```
Traceback (most recent call last):
  File "preprocess.py", line 5, in <module>
    from config import *
  File "E:\project\Abstractive-Summarization-With-Transfer-Learning\config.py", line 1, in <module>
    import texar as tx
  File "texar_repo\texar\__init__.py", line 24, in <module>
    from texar.module_base import *
  File "texar_repo\texar\module_base.py", line 26, in <module>
    from texar.utils.exceptions import TexarError
  File "texar_repo\texar\utils\__init__.py", line 31, in <module>
    from texar.utils.utils_io import *
  File "texar_repo\texar\utils\utils_io.py", line 32, in <module>
    from tensorflow import gfile
ImportError: cannot import name 'gfile' from 'tensorflow' (C:\Users\Admin\Anaconda3\lib\site-packages\tensorflow\__init__.py)
```

Followed by this error:

```
  File "texar_repo\texar\core\layers.py", line 628, in <module>
    class _ReducePooling1D(tf.layers.Layer):
AttributeError: module 'tensorflow.python.layers.layers' has no attribute 'Layer'
```
What should I do? Thanks!
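(For what it's worth, both tracebacks look like a TensorFlow version mismatch: `tf.gfile` and `tf.layers.Layer` exist in TF 1.x but were removed in 2.x, which texar 0.2.x predates. A minimal check, assuming that diagnosis:)

```python
# Quick diagnosis sketch: confirm which TensorFlow is installed.
import tensorflow as tf

print(tf.__version__)  # texar 0.2.x was written against TF 1.x

# If this prints a 2.x version, downgrading in the active environment
# is the simplest workaround (the exact pin is an assumption):
#   pip install "tensorflow==1.15.*"
```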
Is there an error inside the _eval_epoch function?
After debugging, I found that this function enters an endless loop: nothing breaks out of it, and the batch-by-batch data acquisition never seems to be exhausted. Is there any mistake in my setup?
Looking forward to your reply!
The summarization network is not optimised, so the loss stays high and does not decrease much.
Hi. While executing model.py I am getting the following error at line 109:

```
AssertionError: model name:bert/encoder/layer_0/ffn/intermediate/bias not exists!
```

I am stuck here. What should I do to remove this error?
I also added reuse=tf.AUTO_REUSE to the tf.variable_scope at line 75, to remove this earlier error:

```
ValueError: Variable bert/word_embeddings/w already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope? Originally defined at:
```

I mention this in case it has anything to do with the AssertionError.
Please help me solve this issue. I am stuck here.
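(A debugging sketch for this kind of AssertionError, assuming the standard downloaded BERT checkpoint: list the variable names actually stored in it and compare them against the name the assertion complains about. The path below is a placeholder.)

```python
import tensorflow as tf

# Placeholder path to the downloaded BERT checkpoint.
ckpt_path = 'uncased_L-12_H-768_A-12/bert_model.ckpt'

# NewCheckpointReader exposes every variable saved in the checkpoint,
# showing whether 'bert/encoder/layer_0/ffn/intermediate/bias' is
# really absent or just stored under a different name.
reader = tf.train.NewCheckpointReader(ckpt_path)
for name in sorted(reader.get_variable_to_shape_map()):
    print(name)
```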
Can you add a requirements.txt file for this project? I am getting various issues related to incompatible versions of different modules.
I've partially trained the model, but when I went to test it and ran Inference.py, with a static story and summary in the script, TensorFlow raised an out-of-memory error:

```
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[512,10,50,512] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
```

I am using tensorflow 1.10.0 and texar 0.2.1.
Please help.
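(One mitigation sketch, assuming the TF 1.x sessions this repo uses: stop TensorFlow from pre-allocating the whole GPU, and shrink the inference batch size or beam width if the OOM persists, since the offending tensor's shape scales with both.)

```python
import tensorflow as tf

# allow_growth makes the allocator claim GPU memory on demand instead
# of grabbing everything up front, which sometimes avoids OOMs.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True

with tf.Session(config=config) as sess:
    ...  # run inference here, ideally with a smaller batch / beam
```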
I am trying to train with 10k documents, plus an additional 1k documents for the eval cycle.
Even for this small number of documents, it is projecting around 4 days of training time on a Tesla M60 GPU.
I changed the config to 10 docs per step, with max steps of 10,000 for 10 epochs. It takes around 34 seconds per step, which gives around 4 days of training time.
Am I doing something wrong?
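(For what it's worth, the numbers are internally consistent: 10,000 steps × 34 s/step = 340,000 s ≈ 3.9 days, so the projection follows directly from the per-step time; the lever to pull is the seconds per step or the number of steps, not the estimate itself.)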
ImportError: DLL load failed while importing _pywrap_tensorflow_internal: The specified module could not be found.
This happens when I try running preprocess.py.
In model.py, line 114:

```python
decoder = tx.modules.TransformerDecoder(embedding=tgt_embedding,
                                        hparams=dcoder_config)
```

I am getting a TypeError that an unexpected keyword argument `embedding` was passed to TransformerDecoder. How did people resolve this? I see that TransformerDecoder takes (vocab_size=None, output_layer=None, hparams=None), so I'm not sure what the embedding refers to here.
Any guidance would be appreciated.
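(A sketch of the later texar-TF API, assuming the installed texar is newer than the repo expects: the `embedding` argument was dropped in favor of `vocab_size` plus an `output_layer` that can be tied to the embedding matrix. The names `tgt_embedding`, `vocab_size`, and `dcoder_config` are taken from the surrounding code, so treat this as a hypothesis, not the repo's official fix.)

```python
import tensorflow as tf
import texar as tx

# Hypothetical adaptation for newer texar releases: tie the output
# projection to the (transposed) target embedding matrix.
decoder = tx.modules.TransformerDecoder(
    vocab_size=vocab_size,
    output_layer=tf.transpose(tgt_embedding, (1, 0)),
    hparams=dcoder_config)
```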
How did you get the data txt files?
How did you process them? Did you tokenize them?
Hello, thanks for providing the Transformer-based s2s models for abstractive text summarization; it helps me a lot.
I ran it on the CNN/Daily Mail dataset and obtained these results:
ROUGE-1/2/L: 39.29/17.30/27.10
I adopted the default settings but find the results far from those reported in previous work. For example (ROUGE-1/2/L):
In "Text Summarization with Pretrained Encoders": TransformerABS - 40.21; 17.76; 37.09
In fact, the ROUGE-L result is terrible compared with the others, so I suspect I made some mistake during training. I trained on 1 GPU for 3 days, 170,000 steps in total with batch size = 32.
Has anyone obtained these results on CNN/Daily Mail, or does anyone know what went wrong during training?
Many thanks!
Thank you. I used your code to migrate to a Chinese dataset, but there is a problem in the prediction phase: the generated summary is always the same one, without any variation. How can I solve this problem? (Translated examples:)

Reference abstract: Central bank: accelerate interest-rate liberalization and establish a deposit insurance system
Generated abstract: Jinan public rental housing offers free trial on its first day
Reference abstract: Tianjin free-trade zone awaits final approval; will become a Beijing-Tianjin-Hebei integration pilot zone
Generated abstract: Jinan public rental housing offers free trial on its first day
Reference abstract: Korean media: for every 10% drop in oil prices, China's GDP grows 0.15%
Generated abstract: Jinan public rental housing offers free trial on its first day
I observed that with the CNN/Daily Mail dataset, many of the articles are longer than 512 words.
So although I have trained for over 200,000 steps, the BLEU score seems to be around 0.4.
Any ideas on how to overcome this problem?
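(One common workaround, sketched under the assumption that the limit is BERT's 512-token maximum: truncate each article before adding the special tokens, so nothing beyond position 512 is fed in.)

```python
# Minimal truncation sketch; max_seq_length is assumed to be 512.
max_seq_length = 512
tokens = tokens[:max_seq_length - 2]   # reserve room for the specials
tokens = ['[CLS]'] + tokens + ['[SEP]']
```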
I'm running your code on the CNN/Daily Mail dataset.
However, training never ends; it keeps displaying:
Batch #X
with X growing and growing. I waited a long time, then killed the process.
But now, when I run the inference code, the produced summary is very bad. Example:
the two - year - year - year - old cate - old cat was found in the animal .
What did I do wrong? Has anyone in the same situation managed to fix the code? (@Vibha111094)
The earlier README had good code showing how to see the verbal output.
It would be really helpful if you could include it again.
Getting a 500 error when using Postman on /results.

```
/preprocess.py", line 170, in convert_single_example
    tokens_a = tokenizer.tokenize(example.src_txt)
AttributeError: 'dict' object has no attribute 'src_txt'
```

Is this because I'm using Python 3?
A little help here, please.
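(A hedged guess at a fix: the traceback says `example` is a plain dict in this code path, so key lookup should work where attribute access fails; that the key is spelled `'src_txt'` like the attribute is an assumption.)

```python
# Hypothetical one-line patch inside convert_single_example:
tokens_a = tokenizer.tokenize(example['src_txt'])
```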
What specific value should be given to test_batch_size? Could anyone suggest one?
@santhoshkolloju Hi, I'm using your code to train on my own data, but I find that the BLEU score in your code is multiplied by 100, and I am wondering why. Could you give me some clue about this? Thanks.
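(For context: BLEU is conventionally reported on a 0-100 scale rather than 0-1, so multiplying the raw fraction by 100 simply matches how most papers and toolkits quote the score.)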
I cloned the code and ran main.py without any code changes, and got:

```
ValueError: Unknown hyperparameter: position_embedder_type. Only hyperparameters named 'kwargs' hyperparameters can contain new entries undefined in default hyperparameters.
```

raised from model.py, line 90:

```python
encoder = tx.modules.TransformerEncoder(hparams=bert_config.encoder)
```

What's wrong? What should I do next?
Thanks.
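(This reads like a texar version mismatch: the generated `bert_config.encoder` contains an hparam, `position_embedder_type`, that the installed texar release does not define, so running against the texar version vendored in texar_repo rather than a separately installed one may resolve it. That is an inference from the error text, not a verified fix.)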
-train_story.txt
-train_summ.txt
-eval_story.txt
-eval_summ.txt
I tried to run the code with a batch size of 8, however I got this error:

```
InvalidArgumentError (see above for traceback): Incompatible shapes at component 0: expected [1,512] but got [8,512].
```
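(A hedged reading: one of the dataset configs, most likely the eval or test one, still pins `batch_size` to 1 while the training pipeline now emits batches of 8; making the `batch_size` fields consistent across the data configs should remove the shape mismatch.)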
I am getting a module-has-no-attribute error while running:

```python
bert_config = model_utils.transform_bert_to_texar_config(
    os.path.join(bert_pretrain_dir, 'bert_config.json'))
```

```
AttributeError: module 'texar_repo.examples.bert.utils.model_utils' has no attribute 'transform_bert_to_texar_config'
```

Please help.
I got this error when I ran main.py:

```
  File "main.py", line 146, in <module>
    step = _train_epoch(sess, epoch, step, smry_writer)
  File "main.py", line 55, in _train_epoch
    _eval_epoch(sess, epoch, mode='eval')
  File "main.py", line 66, in _eval_epoch
    'inferred_ids': inferred_ids,
NameError: name 'inferred_ids' is not defined
```
What additional changes are required in the code so that I can see the actual summary produced for a given paragraph?
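(A minimal decoding sketch, with `inferred_ids`, `feed_dict`, `sess`, and the BERT `tokenizer` assumed from the surrounding code: fetch the predicted ids and map them back to wordpieces to read the generated summary.)

```python
# Hypothetical snippet: turn predicted token ids into readable text.
hyps = sess.run(inferred_ids, feed_dict=feed_dict)
for ids in hyps:
    tokens = tokenizer.convert_ids_to_tokens(ids.tolist())
    text = ' '.join(tokens).replace(' ##', '')  # undo wordpiece splits
    print(text)
```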
```python
# Creates segment embeddings for each type of tokens.
segment_embedder = tx.modules.WordEmbedder(
    vocab_size=bert_config.type_vocab_size,
    hparams=bert_config.segment_embed)
segment_embeds = segment_embedder(src_segment_ids)

input_embeds = word_embeds + segment_embeds
```

As per the BERT paper, the input embeddings are the sum of the token embedding lookup, the segment embedding, and the position embedding. As we can see in `input_embeds = word_embeds + segment_embeds`, the position embedding is missing.
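(A sketch of adding the missing term, modeled on the texar BERT example; the `position_size`/`position_embed` config fields and the `batch_size`/`src_input_ids` tensors are assumptions carried over from that example, not this repo's exact code.)

```python
# Hypothetical completion: add position embeddings so the input is
# the word + segment + position sum described in the BERT paper.
position_embedder = tx.modules.PositionEmbedder(
    position_size=bert_config.position_size,
    hparams=bert_config.position_embed)

seq_length = tf.ones([batch_size], tf.int32) * tf.shape(src_input_ids)[1]
pos_embeds = position_embedder(sequence_length=seq_length)

input_embeds = word_embeds + segment_embeds + pos_embeds
```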
I was training the model and it was taking longer than expected, so I killed the process.
However, when I now run inference.py, I get an error:

```
Traceback (most recent call last):
  File "inference.py", line 92, in <module>
    saver.restore(sess, tf.train.latest_checkpoint(model_dir))
  File "C:\Users\AKHIL\Anaconda3\lib\site-packages\tensorflow_core\python\training\saver.py", line 1277, in restore
    raise ValueError("Can't load save_path when it is None.")
ValueError: Can't load save_path when it is None.
```

Can anyone please look into it?
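(A defensive sketch: `tf.train.latest_checkpoint` returns None when `model_dir` contains no checkpoint files, and passing None to `Saver.restore` raises exactly this ValueError, so killing training before the first checkpoint was saved would explain it.)

```python
# Guard against an empty model_dir before restoring.
ckpt = tf.train.latest_checkpoint(model_dir)
if ckpt is None:
    raise RuntimeError('No checkpoint found in %s; train until at '
                       'least one checkpoint is saved.' % model_dir)
saver.restore(sess, ckpt)
```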
Hello!
I tried your code in Google Colab and encountered a problem I wasn't able to solve.
During the initialization of the BERT encoder in your notebook:
https://github.com/santhoshkolloju/Abstractive-Summarization-With-Transfer-Learning/blob/master/BERT_SUMM.ipynb
in cell 15, the following error occurs:

```
ValueError                                Traceback (most recent call last)
     35 init_checkpoint = os.path.join(bert_pretrain_dir, 'bert_model.ckpt')
     36 #init_checkpoint = "gs://cloud-tpu-checkpoints/bert/uncased_L-12_H-768_A-12/bert_model.ckpt"
---> 37 model_utils.init_bert_checkpoint(init_checkpoint)

/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/checkpoint_utils.py in _init_from_checkpoint(ckpt_dir_or_file, assignment_map)
    344         "Assignment map with scope only name {} should map to scope only "
    345         "{}. Should be 'scope/': 'other_scope/'.".format(
--> 346             scopes, tensor_name_in_ckpt))
    347   # If scope to scope mapping was provided, find all variables in the scope
    348   # and create variable to variable mapping.

ValueError: Assignment map with scope only name bert/position_embeddings should map to scope only bert/embeddings/position_embeddings. Should be 'scope/': 'other_scope/'.
```

I already checked the texar GitHub repo and found this post:
asyml/texar#127
Basically, the code for the encoder and decoder changed in the newer version of texar, but I don't know how to adjust the code.
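(Per the linked issue, the embedding variable scopes were renamed between texar releases, so the assignment map built by `init_bert_checkpoint` no longer lines up with the checkpoint; pinning texar to the version bundled in texar_repo is probably the path of least resistance, though that is an inference rather than a tested fix.)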
I observed that although I use

```python
feed_dict = {
    iterator.handle: iterator.get_handle(sess, 'eval'),
    tx.global_mode(): tf.estimator.ModeKeys.EVAL,
}
```

in the _eval_epoch method, it uses a few examples from the train dataset as well.
Is this the desired behavior, given that we use FeedableDataIterator, which is supposed to iterate through multiple datasets and switch between them?
If so, could you please explain why such behavior is necessary.
```python
#tx.utils.maybe_create_dir(model_dir)
#logging_file = os.path.join(model_dir, 'logging.txt')
model_dir = "gs://bert_summ/models/"  # uncased_L-12_H-768_A-12/bert_model.ckpt
logging_file = "logging.txt"
logger = utils.get_logger(logging_file)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(tf.local_variables_initializer())
    sess.run(tf.tables_initializer())

    smry_writer = tf.summary.FileWriter(model_dir, graph=sess.graph)

    if run_mode == 'train_and_evaluate':
        logger.info('Begin running with train_and_evaluate mode')
        if tf.train.latest_checkpoint(model_dir) is not None:
            logger.info('Restore latest checkpoint in %s' % model_dir)
            saver.restore(sess, tf.train.latest_checkpoint(model_dir))
        iterator.initialize_dataset(sess)
        step = 5000
        for epoch in range(max_train_epoch):
            iterator.restart_dataset(sess, 'train')
            step = _train_epoch(sess, epoch, step, smry_writer)

    elif run_mode == 'test':
        logger.info('Begin running with test mode')
        logger.info('Restore latest checkpoint in %s' % model_dir)
        saver.restore(sess, tf.train.latest_checkpoint(model_dir))
        _eval_epoch(sess, 0, mode='test')

    else:
        raise ValueError('Unknown mode: {}'.format(run_mode))
```
Running it produces:

```
PermissionDeniedError                     Traceback (most recent call last)
     10 sess.run(tf.tables_initializer())
     11
---> 12 smry_writer = tf.summary.FileWriter(model_dir, graph=sess.graph)

~/anaconda3/envs/tf-1.8/lib/python3.6/site-packages/tensorflow/python/summary/writer/writer.py in __init__(self, logdir, graph, max_queue, flush_secs, graph_def, filename_suffix)
    350
    351     event_writer = EventFileWriter(logdir, max_queue, flush_secs,
--> 352                                    filename_suffix)
    353     super(FileWriter, self).__init__(event_writer, graph, graph_def)

~/anaconda3/envs/tf-1.8/lib/python3.6/site-packages/tensorflow/python/summary/writer/event_file_writer.py in __init__(self, logdir, max_queue, flush_secs, filename_suffix)
     65     self._logdir = logdir
     66     if not gfile.IsDirectory(self._logdir):
---> 67       gfile.MakeDirs(self._logdir)
     68     self._event_queue = six.moves.queue.Queue(max_queue)
     69     self._ev_writer = pywrap_tensorflow.EventsWriter(

~/anaconda3/envs/tf-1.8/lib/python3.6/site-packages/tensorflow/python/lib/io/file_io.py in recursive_create_dir(dirname)
    372   """
    373   with errors.raise_exception_on_not_ok_status() as status:
--> 374     pywrap_tensorflow.RecursivelyCreateDir(compat.as_bytes(dirname), status)

~/anaconda3/envs/tf-1.8/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py in __exit__(self, type_arg, value_arg, traceback_arg)
    517           None, None,
    518           compat.as_text(c_api.TF_Message(self.status.status)),
--> 519           c_api.TF_GetCode(self.status.status))

PermissionDeniedError: Error executing an HTTP request (HTTP response code 401, error code 0, error message ''), response '{
  "error": {
    "errors": [
      {
        "domain": "global",
        "reason": "required",
        "message": "Anonymous caller does not have storage.objects.get access to bert_summ/models/.",
        "locationType": "header",
        "location": "Authorization"
      }
    ],
    "code": 401,
    "message": "Anonymous caller does not have storage.objects.get access to bert_summ/models/."
  }
}
' when reading metadata of gs://bert_summ/models/
```
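(A hedged workaround: the 401 means the session has no Google Cloud credentials for the `gs://` bucket, which belongs to the author. Either authenticate the environment or point `model_dir` at a local path; the values below are illustrative.)

```python
# Option 1: write summaries/checkpoints to a local directory instead
# of the author's private gs:// bucket.
model_dir = './models/'

# Option 2 (shell): provide GCS credentials before launching, e.g.
#   export GOOGLE_APPLICATION_CREDENTIALS=/path/to/key.json
```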
How can I generate a summary quickly? Right now it takes a long time and a lot of CPU resources. Is there a faster solution?
I really like what you've done here.
I have a BERT model fine-tuned for NER and would like to implement it using your architecture here.
My intention is to bypass the fine-tuning section where you use stories and directly use my fine-tuned model in its place.
Do you have any tips?
Can you provide the train.tf_record file?

```
NotFoundError: Error executing an HTTP request: HTTP response code 404 with body '{ "error": { "errors": [ { "domain": "global", "reason": "notFound", "message": "No such object: bert_summarization/train.tf_record" } ], "code": 404, "message": "No such object: bert_summarization/train.tf_record" } } ' when reading metadata of gs://bert_summarization/train.tf_record
  [[node IteratorGetNext_1 (defined at texar_repo/texar/data/data/data_iterators.py:401) ]]
```
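(The `gs://bert_summarization/` bucket is the author's; `train.tf_record` is the output of the preprocessing step, so running preprocess.py on your own data and pointing the data paths at the locally generated file should avoid the remote lookup entirely. This is inferred from the repo layout, not a confirmed answer.)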
Creating a requirements.txt file might help users with dependencies :)
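(An unofficial guess at a pin set, based only on versions mentioned in these issues; a starting point, not a tested requirements.txt:)

```
tensorflow-gpu==1.10.0
texar==0.2.1
```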
```
ValueError: Dimensions must be equal, but are 768 and 512 for 'bert/transformer_encoder_1/layer_0/add' (op: 'AddV2') with input shapes: [?,?,768], [?,?,512].
```

Can you please help me resolve this dimension error? Thanks.
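(768 is BERT-base's hidden size and 512 is the texar Transformer default, so this usually means one of the encoder/decoder configs was left at dim 512 while the BERT side emits 768-dimensional outputs; setting the model dimension to 768 throughout should reconcile the shapes, assuming that is the mismatch here.)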
Thanks for sharing such a helpful repo!
I want to double-check with the author about the initialization part.
According to my understanding, the Encoder is initialized with pre-trained BERT, and the Decoder is initialized from scratch.
@santhoshkolloju When you train this model, do you use BERT embeddings for the abstract as well, i.e. do both the article and the abstract go through BERT while training this summarization model?