Comments (6)
@gabrer Hi, have you re-implemented this on the datasets from the original paper?
In my implementation, I can't reproduce the performance reported in the paper.
from hierarchical-attention-networks.
@superzhangxing Hi, in my implementation I only get 63.7% accuracy on the yelp2013 dev set with the same configuration as the paper. Accuracy on the test set is expected to be a bit lower, so this falls short of the performance reported in the paper. Do you have any ideas? Could those tricks reduce the gap?
@MH23333 Hi, do you use pre-trained word embeddings or randomly initialized ones? I train them with word2vec on the train and dev sets, and I believe that improves accuracy. I use the same configuration as the paper, for example SGD with momentum as the optimizer and a momentum of 0.9, and I get around 67% accuracy on the dev and test sets. The only trick I can think of is aligning the sentence length within each batch to speed up training, and that is also mentioned in the paper.
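The batching trick mentioned above can be sketched roughly as follows. This is a minimal illustration, not code from the repository: it assumes documents are already lists of token ids, and the helper names `length_bucketed_batches` and `pad_batch` are hypothetical.

```python
from typing import List, Sequence, Tuple


def length_bucketed_batches(docs: Sequence[Sequence[int]],
                            batch_size: int) -> List[List[int]]:
    """Group document indices so each batch holds documents of similar length.

    Hypothetical helper: sorts indices by length, then slices them into batches.
    """
    order = sorted(range(len(docs)), key=lambda i: len(docs[i]))
    return [order[i:i + batch_size] for i in range(0, len(order), batch_size)]


def pad_batch(batch: Sequence[Sequence[int]],
              pad_id: int = 0) -> Tuple[List[List[int]], List[int]]:
    """Pad every sequence to the longest one in this batch only.

    Also returns the true lengths, which a dynamic RNN needs.
    """
    max_len = max(len(seq) for seq in batch)
    padded = [list(seq) + [pad_id] * (max_len - len(seq)) for seq in batch]
    lengths = [len(seq) for seq in batch]
    return padded, lengths


docs = [[1, 2, 3, 4, 5], [1], [1, 2], [1, 2, 3, 4]]
batches = length_bucketed_batches(docs, batch_size=2)
padded, lengths = pad_batch([docs[i] for i in batches[0]])
```

Because similar lengths end up together, each batch needs little padding. In practice the batch order is usually shuffled each epoch so that training does not always see short documents first.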
@superzhangxing Thanks for your quick reply! I use word embeddings the same way you do, and I set all the hyperparameters mentioned in the paper. Maybe other hyperparameters that the paper doesn't mention have an important influence on the results. Two that may matter are the sentence length (how many words in a sentence) and the document length (how many sentences in a document).
I will do more experiments. Many thanks!
@MH23333 The max sentence length and max document length are both set to 40. Please note that I use a dynamic RNN, so I don't use a fixed sentence length or a fixed document length. I'm not sure whether that makes a difference.
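One way to read the setting above is as an upper bound: longer inputs are clipped to 40, while shorter ones keep their true length for the dynamic RNN. A minimal sketch under that assumption follows; the function name `truncate_document` is hypothetical, and the actual preprocessing in the repository may differ.

```python
from typing import List, Sequence, Tuple


def truncate_document(doc: Sequence[Sequence[int]],
                      max_sentences: int = 40,
                      max_words: int = 40
                      ) -> Tuple[List[List[int]], int, List[int]]:
    """Clip a document (list of sentences, each a list of token ids).

    Returns the clipped document, its sentence count, and per-sentence word
    counts. The counts are the sequence lengths a dynamic RNN would consume.
    """
    clipped = [list(sent[:max_words]) for sent in doc[:max_sentences]]
    sentence_lengths = [len(sent) for sent in clipped]
    return clipped, len(clipped), sentence_lengths


# A document with one over-long sentence and more than 40 sentences.
doc = [list(range(50)), [1, 2, 3]] + [[7]] * 45
clipped, n_sentences, sentence_lengths = truncate_document(doc)
```

Only the upper bound is enforced here. Nothing is padded up to 40, which matches the point that neither length is fixed.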
@superzhangxing Thanks a lot! I also use a dynamic RNN and masked attention. I will run more comparison tests. Hopefully I can get better results.
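For readers unfamiliar with masked attention, the idea is that padded positions must receive zero attention weight, so the softmax is taken over the real positions only. The sketch below is a plain-Python illustration under that assumption. `masked_softmax` is a hypothetical name, not a function from this repository or from TensorFlow.

```python
import math
from typing import List, Sequence


def masked_softmax(scores: Sequence[float], length: int) -> List[float]:
    """Softmax over the first `length` scores; padded positions get weight 0."""
    if length <= 0:
        return [0.0] * len(scores)
    valid = list(scores[:length])
    peak = max(valid)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in valid]
    total = sum(exps)
    return [e / total for e in exps] + [0.0] * (len(scores) - length)


# The third score belongs to a padding position and must be ignored,
# even though it is the largest raw score.
weights = masked_softmax([1.0, 1.0, 5.0], length=2)
```

Without the mask, the padding position in this example would take most of the attention mass, which is one way padded batches can distort results.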
Related Issues (20)
- some error in yelp_prepare.py HOT 4
- ValueError in running worker.py HOT 12
- How to make `TensorBoard Projector` work.
- Why use orthogonal_initializer ?
- Error While Running yelp_prepare.py HOT 3
- Is the embedding initialized with a pre-trained one? HOT 2
- GRU VS LSTM HOT 1
- Are uw and us global weights? just to conform. HOT 1
- Mask for attention weight
- Getting same sentence level outputs for very different documents. Can someone please help.
- Embeddings for special tokens/padding?
- dev accuracy: nan???
- en-core-web-sm needs to be installed beforehand
- Same cell for word and sentence level HOT 3
- Won't the code leads to different input shape for different batch?
- Visualize word and sentence attention weight as color coded in the paper HOT 1
- Performance on Yelp 15
- Implementation using tf.contrib.seq2seq. HOT 7
- Attention layer output HOT 5