adamklec / copynet
An implementation of CopyNet
Home Page: https://arxiv.org/abs/1603.06393
Sorry to bother you, but what is the "pt" file and where can I download it? Thank you very much!
Thanks for sharing the code. When I try to run it, I create the datasets following your description: each file should have two lines of text, the first being the input sequence and the second the target output sequence. However, I get an error saying that the file cleaned_first_names.txt is missing. Could you please provide the data files so that we can rerun the code? Thanks again!
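The data format described above (two lines per file: input sequence, then target output sequence) can be illustrated with a minimal sketch. The file name and sentences here are hypothetical examples, not files from the repository:

```python
# Hypothetical example of the described two-line data file format:
# line 1 = input sequence, line 2 = target output sequence.
sample = "the quick brown fox\nthe fox\n"
with open("example_pair.txt", "w") as f:
    f.write(sample)

# Reading it back recovers the (input, target) pair.
with open("example_pair.txt") as f:
    input_seq, target_seq = f.read().splitlines()
print(input_seq)   # the quick brown fox
print(target_seq)  # the fox
```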
Excuse me, after running your code on my own dataset, I found that the validation loss goes up while the training loss goes down. Have you ever encountered this situation when working with your own dataset?
Thanks for sharing
Code starts here
transformed_hidden2 = self.copy_W(output).view(batch_size, self.hidden_size, 1)
copy_score_seq = torch.bmm(encoder_outputs, transformed_hidden2)  # NOTE: this is linear; an activation function should be applied before multiplying
copy_scores = torch.bmm(torch.transpose(copy_score_seq, 1, 2), one_hot_input_seq).squeeze(1)  # [b, vocab_size + seq_length]
missing_token_mask = (one_hot_input_seq.sum(dim=1) == 0)  # tokens not present in the input sequence
missing_token_mask[:, 0] = 1  # token index 0 is not part of any sequence
copy_scores = copy_scores.masked_fill(missing_token_mask, -1000000.0)
gen_scores = self.out(output.squeeze(1))  # [b, vocab_size]
gen_scores[:, 0] = -1000000.0  # penalize token index 0 in generate mode too
I have some issues with your above computation of copy_scores and gen_scores. Please let me know if I am wrong anywhere.
1.) In the computation of copy_scores, the paper says to multiply encoder_outputs by a weight matrix, apply an activation function, and then multiply by the decoder RNN's hidden state. Your code does something quite different: it multiplies the weight matrix by the output of the decoder RNN and then multiplies the result with encoder_outputs. There is no non-linearity here.
2.) In the gen_scores computation, your code multiplies the output by a weight matrix, whereas the paper says to compute it the way it's done in the attentional RNN encoder-decoder, but between the one-hot encoding of the word and the decoder RNN's hidden state. This is quite different from your implementation.
Can you please let me know if I misunderstood anything?
Thanks in advance!
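For reference, the paper's copy-score formulation described in point 1 (project the encoder states, apply a tanh non-linearity, then take the dot product with the decoder state) can be sketched roughly as follows. All shapes and tensor names here are hypothetical stand-ins, not the repo's actual variables:

```python
import torch

torch.manual_seed(0)
batch_size, seq_len, hidden_size = 2, 5, 8

# Hypothetical stand-ins for the model's activations:
encoder_outputs = torch.randn(batch_size, seq_len, hidden_size)  # h_j for each source position
decoder_state = torch.randn(batch_size, hidden_size, 1)          # s_t, the decoder hidden state

copy_W = torch.nn.Linear(hidden_size, hidden_size, bias=False)   # W_c in the paper

# Paper's order of operations: tanh(h_j W_c) . s_t
# -- the non-linearity is applied to the projected encoder states
#    *before* the dot product with the decoder state.
projected = torch.tanh(copy_W(encoder_outputs))                  # [b, seq_len, hidden]
copy_scores = torch.bmm(projected, decoder_state).squeeze(2)     # [b, seq_len]
print(copy_scores.shape)  # torch.Size([2, 5])
```

This yields one copy score per source position, which would then be scattered into the extended vocabulary, as in the snippet above.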
Hi
The code is not working at all; I get several errors in evaluate. Could you please provide some data and put together a minimal working example?
thanks
Hi
I am training this model and I see the BLEU score going up and down all the time, and the validation loss as well. Does this code work at all? Thanks.
Hi, thanks for your code. Have you ever tested the model on any dataset and achieved the same performance as the paper?
Emmm, thank you.
I wonder if this implementation is intended for text summarization? Could you share a toy dataset so that I can build my own dataset, or just a data example?