
jrc1995 / abstractive-summarization

167 stars · 10 watchers · 59 forks · 338 KB

Implementation of abstractive summarization using LSTM in the encoder-decoder architecture with local attention.

License: MIT

Language: Jupyter Notebook (100.00%)
Topics: abstractive-text-summarization, nlp, deep-learning, encoder-decoder, attention-mechanism, local-attention, tensorflow, lstm

abstractive-summarization's People

Contributors

jrc1995

abstractive-summarization's Issues

Pretrained model

Actually, I was studying the implementation but could not get the pretrained model. I need the Seq2seq_summarization.ckpt file. I downloaded the zip from the repository, but the pretrained model was not in it. Please, I need it.

How to get context?

Please help me understand how the next word to be predicted is selected, and how the context for it is obtained.
Thank you
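
The repository's description says local attention is used, so (as a non-authoritative sketch) the context is plausibly a Luong-style local-attention context vector, and the next word the argmax of the decoder's output distribution. All names below (`local_attention_context`, `encoder_states`, `decoder_state`, `p_t`, `D`) are illustrative, not the repo's actual variables:

```python
import numpy as np

def local_attention_context(encoder_states, decoder_state, p_t, D=2):
    """Sketch of Luong-style local attention.

    encoder_states: (seq_len, hidden) encoder outputs
    decoder_state:  (hidden,) current decoder hidden state
    p_t:            centre position of the attention window
    D:              half-width of the window
    """
    seq_len = encoder_states.shape[0]
    lo, hi = max(0, p_t - D), min(seq_len, p_t + D + 1)
    window = encoder_states[lo:hi]              # at most 2*D + 1 states

    scores = window @ decoder_state             # dot-product alignment
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                    # softmax over the window

    return weights @ window                     # context vector, (hidden,)

# Toy usage: the context is concatenated with the decoder state, projected,
# and softmaxed; the next word is then typically chosen greedily (argmax).
enc = np.random.randn(12, 8)
dec = np.random.randn(8)
ctx = local_attention_context(enc, dec, p_t=6)
```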

Word2vec function and GloVe

The word2vec function is used to find the vector representation of a word using GloVe, right? So what happens if a particular word is not found in GloVe? Is a nearest-neighbor search used to find the closest vector? Is GloVe only used during preprocessing? During training, are we substituting the actual word with its closest match?
Thank you
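
This page doesn't confirm exactly how the repo handles OOV words, but a common pattern matching the question is sketched below: a GloVe lookup during preprocessing, a fallback vector for misses, and a nearest-neighbor search to map vectors back to words. All names are hypothetical:

```python
import numpy as np

def word2vec(word, vocab, embedding):
    """Sketch: look up a GloVe vector, with a fallback for OOV words."""
    if word in vocab:
        return embedding[vocab.index(word)]
    # One possible OOV strategy: a fixed vector (zeros or random) that
    # effectively plays the role of <UNK>.
    return np.zeros(embedding.shape[1])

def nearest_word(vector, vocab, embedding):
    """Sketch: map an arbitrary vector to its closest vocabulary word."""
    dists = np.linalg.norm(embedding - vector, axis=1)  # Euclidean distance
    return vocab[int(np.argmin(dists))]

# Toy usage:
vocab = ["the", "cat", "sat"]
embedding = np.random.randn(3, 50)
print(nearest_word(word2vec("cat", vocab, embedding), vocab, embedding))  # cat
```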

Don't know where to get these files

Please tell me where I can get these files:

vec_summaries
vec_texts
vocab_limit
embd_limit

Thanks! I'm really eager to test this fully...
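
For what it's worth, these look like pickled outputs of the preprocessing notebook rather than files shipped with the repo. Once preprocessing has been run, they would be loaded along these lines (file names from the question; contents assumed):

```python
import pickle

for name in ["vec_summaries", "vec_texts", "vocab_limit", "embd_limit"]:
    with open(name, "rb") as fp:   # assumes preprocessing already wrote it
        obj = pickle.load(fp)
    print(name, type(obj))
```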

vec_summaries_reduced and vec_texts_reduced returning blank list []

When I print the imported summaries from the pickled file, everything is fine: they are fully printed in their encoded form. vec_texts also prints fine. But vec_summaries_reduced and vec_texts_reduced return a blank list, so train_len is printed as 0.
Also, the notebook reports "Percentage of the dataset with text length less than window size: 99.848". Isn't that bad, since 99% of the texts are shorter than the window size and therefore get reduced away? With D = 1 (window size 3) it is 44%, yet still no data comes out of vec_summaries_reduced. Please tell me what to do.
P.S. I don't know if it has any significance, but I had not properly implemented the clean function from your code (it gave some errors), so I only lowercased everything, nothing else.

Update: when I set maxlen_summary to 80 instead of 7, train_len = 14400. But is setting it to 80 wrong?
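
A hypothetical sketch of the kind of length-based reduction step being described (not the repo's exact code): if the local-attention window (2*D + 1) is wider than most texts, or maxlen_summary is very small (e.g. 7), nearly every pair is filtered out and the reduced lists come back empty, which matches the 99.848% figure and train_len = 0 above:

```python
# Dummy stand-ins for the pickled token-id sequences:
vec_texts = [[1] * 40, [2] * 5, [3] * 60]
vec_summaries = [[9] * 6, [8] * 90, [7] * 10]

D = 10
window_size = 2 * D + 1      # texts shorter than this get dropped
maxlen_summary = 80          # raising this from 7 keeps far more pairs

vec_texts_reduced, vec_summaries_reduced = [], []
for text, summary in zip(vec_texts, vec_summaries):
    if len(text) >= window_size and len(summary) <= maxlen_summary:
        vec_texts_reduced.append(text)
        vec_summaries_reduced.append(summary)

train_len = len(vec_texts_reduced)   # 0 means everything was filtered out
print(train_len)
```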

Testing

In your code there is no testing yet, but I see the testing data already exists. How do I implement it? Is it the same as validation?
Can you help me with the steps I need to follow?

ZeroDivisionError

Trying your method on a different dataset, I am getting a ZeroDivisionError in the Training and Validation section. I assume that something is not loading properly, because there should be no zero values.
Here is the code:

```python
import pickle
import random

import tensorflow as tf

# Note: the graph ops (cross_entropy, outputs, accuracy, train_op), the
# placeholders (tf_text, tf_embd, tf_summary, tf_true_summary_len, tf_train),
# epochs, embd, idx2vocab and the batch lists are defined earlier in the
# notebook.

with tf.Session() as sess:  # Start TensorFlow session
    display_step = 100
    patience = 5

    load = input("\nLoad checkpoint? y/n: ")
    print("")
    saver = tf.train.Saver()

    if load.lower() == 'y':

        print('Loading pre-trained weights for the model...')

        saver.restore(sess, r'C:\Users\james\Desktop\Title Generation - SENG 6245\Dataset250K.csv')
        sess.run(tf.global_variables())
        sess.run(tf.tables_initializer())

        with open(r'C:\Users\james\Desktop\Title Generation - SENG 6245\Dataset250K.csv', 'rb') as fp:
            train_data = pickle.load(fp)

        covered_epochs = train_data['covered_epochs']
        best_loss = train_data['best_loss']
        impatience = 0

        print('\nRESTORATION COMPLETE\n')

    else:
        best_loss = 2**30
        impatience = 0
        covered_epochs = 0

        init = tf.global_variables_initializer()
        sess.run(init)
        sess.run(tf.tables_initializer())

    epoch = 0
    while (epoch + covered_epochs) < epochs:

        print("\n\nSTARTING TRAINING\n\n")

        batches_indices = [i for i in range(0, len(train_batches_text))]
        random.shuffle(batches_indices)

        total_train_acc = 0
        total_train_loss = 0

        for i in range(0, len(train_batches_text)):

            j = int(batches_indices[i])

            cost, prediction, \
                acc, _ = sess.run([cross_entropy,
                                   outputs,
                                   accuracy,
                                   train_op],
                                  feed_dict={tf_text: train_batches_text[j],
                                             tf_embd: embd,
                                             tf_summary: train_batches_summary[j],
                                             tf_true_summary_len: train_batches_true_summary_len[j],
                                             tf_train: True})

            total_train_acc += acc
            total_train_loss += cost

            if i % display_step == 0:
                print("Iter " + str(i) + ", Cost= " +
                      "{:.3f}".format(cost) + ", Acc = " +
                      "{:.2f}%".format(acc * 100))

            if i % 500 == 0:

                idx = random.randint(0, len(train_batches_text[j]) - 1)

                text = " ".join([idx2vocab.get(vec, "<UNK>") for vec in train_batches_text[j][idx]])
                predicted_summary = [idx2vocab.get(vec, "<UNK>") for vec in prediction[idx]]
                actual_summary = [idx2vocab.get(vec, "<UNK>") for vec in train_batches_summary[j][idx]]

                print("\nSample Text\n")
                print(text)
                print("\nSample Predicted Summary\n")
                for word in predicted_summary:
                    if word == '<EOS>':
                        break
                    else:
                        print(word, end=" ")
                print("\n\nSample Actual Summary\n")
                for word in actual_summary:
                    if word == '<EOS>':
                        break
                    else:
                        print(word, end=" ")
                print("\n\n")

        print("\n\nSTARTING VALIDATION\n\n")

        total_val_loss = 0
        total_val_acc = 0

        for i in range(0, len(val_batches_text)):

            if i % 100 == 0:
                print("Validating data # {}".format(i))

            cost, prediction, \
                acc = sess.run([cross_entropy,
                                outputs,
                                accuracy],
                               feed_dict={tf_text: val_batches_text[i],
                                          tf_embd: embd,
                                          tf_summary: val_batches_summary[i],
                                          tf_true_summary_len: val_batches_true_summary_len[i],
                                          tf_train: False})

            total_val_loss += cost
            total_val_acc += acc

        # The issue starts here
        try:
            avg_val_loss = total_val_loss / len(val_batches_text)
        except ZeroDivisionError:
            avg_val_loss = 0

        print("\n\nEpoch: {}\n\n".format(epoch + covered_epochs))
        print("Average Training Loss: {:.3f}".format(total_train_loss / len(train_batches_text)))
        print("Average Training Accuracy: {:.2f}".format(100 * total_train_acc / len(train_batches_text)))
        print("Average Validation Loss: {:.3f}".format(avg_val_loss))
        print("Average Validation Accuracy: {:.2f}".format(100 * total_val_acc / len(val_batches_text)))

        if avg_val_loss < best_loss:
            best_loss = avg_val_loss
            save_data = {'best_loss': best_loss, 'covered_epochs': covered_epochs + epoch + 1}
            impatience = 0
            with open('Model_Backup/Seq2seq_summarization.pkl', 'wb') as fp:
                pickle.dump(save_data, fp)
            saver.save(sess, 'Model_Backup/Seq2seq_summarization.ckpt')
            print("\nModel saved\n")

        else:
            impatience += 1

        if impatience > patience:
            break

        epoch += 1
```

I can get rid of the error with exception handling, but I was wondering if you had an idea of why it's not working in the first place.
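
For reference, the only division inside the try is total_val_loss / len(val_batches_text), so the exception means the validation batch list is empty, i.e. the filtering/batching step upstream dropped every example. A minimal pre-flight check, assuming the notebook's batch lists are in scope:

```python
# Note: the validation-accuracy print below the try/except divides by
# len(val_batches_text) again, unguarded, so it would fail the same way.
assert len(train_batches_text) > 0, "no training batches were created"
assert len(val_batches_text) > 0, "no validation batches were created"
```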

Where does the abstraction come from?

Hi, I looked at the Amazon Fine Food Reviews data. It doesn't seem that the review articles are accompanied by summaries. Are they?

Is the abstractive summary part based on any paper, such as Abigail See's pointer-generator? I only see references to LSTM papers.

There is no mention of how you dealt with OOV words. Can you explain how you handled them?

ValueError: The passed save_path is not a valid checkpoint:

I'm trying to run this in Google Colab (GPU runtime) with TensorFlow 2.0. It shows the error at the following line:

saver.restore(sess, "/content/drive/My Drive/Colab Notebooks/Data/release/Model_Backup/Seq2seq_summarization.ckpt")

Here is the code:
```python
with tf.Session() as sess:  # Start TensorFlow session
    display_step = 100
    patience = 5

    load = input("\nLoad checkpoint? y/n: ")
    print("")
    saver = tf.train.Saver(tf.global_variables())

    if load.lower() == 'y':

        print('Loading pre-trained weights for the model...')
        saver.restore(sess, "/content/drive/My Drive/Colab Notebooks/Data/release/Model_Backup/Seq2seq_summarization.ckpt")
        sess.run(tf.global_variables())
        sess.run(tf.tables_initializer())
```
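
One likely complication: tf.Session and tf.train.Saver are TensorFlow 1.x APIs, so on Colab with TensorFlow 2.0 this graph-mode code needs the compatibility layer (or a pinned TF 1.x install). A minimal sketch:

```python
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()   # restores Sessions, placeholders, v1 Savers

# saver.restore also expects the checkpoint *prefix* written by saver.save
# ("...Seq2seq_summarization.ckpt"), with its .index and .data-* files
# alongside; "not a valid checkpoint" usually means those files are
# missing or the prefix/path is wrong.
```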

ValueError: setting an array element with a sequence

I am trying to run the code on another dataset following the instructions in the README, but I get 'ValueError: setting an array element with a sequence' when I try to train.

Do you have any idea where the problem could be?
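
This ValueError typically appears when a ragged batch (sequences of unequal length) is handed to NumPy or a TensorFlow feed. A minimal reproduction and the usual fix, padding every sequence in a batch to one length:

```python
import numpy as np

ragged = [[1, 2, 3], [4, 5]]           # sequences of unequal length
try:
    np.array(ragged, dtype=np.int32)   # what building the feed array does
except ValueError as e:
    print(e)                           # setting an array element with a sequence

maxlen = max(len(s) for s in ragged)
padded = np.array([s + [0] * (maxlen - len(s)) for s in ragged],
                  dtype=np.int32)
print(padded)                          # [[1 2 3] [4 5 0]]
```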

Using the model, Complete Training and Optimize Hyperparameters

First, I want to thank you for your work.
Reading the future work to be done I got a lil bit confused: Complete Training and Optimize Hyperparameters isn't training already done? am seeing that U have already used the validation dataset but as a training dataset. Can U tell a lil bit more about the last part of the code "Training and Validation".

I want to try the model on a new article, however am new to tensorflow. I didn't get how to use the model generated on a new input. I used a usual command line model.predict (the one used for sklearn for example) but didn't get a result. Can U lead me to the right command to use to model files generated?
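
Since this is a TF1 graph model, there is no model.predict; inference is a sess.run of the output op with tf_train: False, mirroring the validation loop in the issue above. A hedged sketch, assuming the notebook's graph, placeholders, embd and idx2vocab are in scope and that `batch_text` is a padded batch of token ids for the new article (depending on the graph, tf_summary and tf_true_summary_len may also need dummy values in feed_dict):

```python
with tf.Session() as sess:
    saver = tf.train.Saver()
    saver.restore(sess, 'Model_Backup/Seq2seq_summarization.ckpt')
    sess.run(tf.tables_initializer())

    prediction = sess.run(outputs,
                          feed_dict={tf_text: batch_text,
                                     tf_embd: embd,
                                     tf_train: False})

    # Map the first sample's predicted ids back to words.
    summary_words = [idx2vocab.get(idx, "<UNK>") for idx in prediction[0]]
    print(" ".join(summary_words))
```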
