Comments (12)

wzds2015 avatar wzds2015 commented on July 26, 2024

I see the same error when I try to run the model, with exactly the same numbers. Is this related to the TF version?

from hierarchical-attention-networks.

iloveddobboki avatar iloveddobboki commented on July 26, 2024

I just downgraded my tensorflow version to 1.1.
I found that the implementation of bidirectional_dynamic_rnn changed in 1.2.
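A TF-free sketch of what changed (assuming the 1.2+ behavior: a cell object creates its variables on first use, and any later use with a different input size trips the sharing check). `FakeCell` and its call signature are made up for illustration, not part of TensorFlow or this repo:

```python
class FakeCell:
    """Toy stand-in for a TF >= 1.2 RNN cell: it records its weight
    shape on the first call and refuses any later call whose input
    size differs, mimicking the 'Trying to share variable' ValueError."""

    def __init__(self, num_units):
        self.num_units = num_units
        self._built_input_size = None

    def __call__(self, input_size):
        if self._built_input_size is None:
            # first call: variables get created for this input size
            self._built_input_size = input_size
        elif self._built_input_size != input_size:
            raise ValueError(
                "Trying to share variable, but specified shape (%d, %d) "
                "and found shape (%d, %d)."
                % (input_size, 4 * self.num_units,
                   self._built_input_size, 4 * self.num_units))


cell = FakeCell(80)   # 4 * 80 = 320 columns, as in the traceback below
cell(200)             # word level: 200-dim embedding inputs, builds fine
try:
    cell(100)         # sentence level reuses the SAME object -> error
except ValueError as e:
    print(e)
```

This reproduces the shapes reported later in the thread: one cell object first built for 200-dim inputs, then reused where the inputs are 100-dim.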

ematvey avatar ematvey commented on July 26, 2024

Ok, I'll fix it next week (I stopped using tensorflow around 1.0).

hendriksc avatar hendriksc commented on July 26, 2024

I still encounter the same error on tensorflow 1.4.1. Is there a quick way to fix this?

ematvey avatar ematvey commented on July 26, 2024

Haven't had any time to look at it, sorry! Would be happy to merge your PR; I suspect it's something trivial.

longvtran avatar longvtran commented on July 26, 2024

Has there been a successful fix for this? I am also running into the same issue on tensorflow 1.5.0.

HearyShen avatar HearyShen commented on July 26, 2024
ValueError: Trying to share variable tcm/word/fw/multi_rnn_cell/cell_0/bn_lstm/W_xh, but specified shape (100, 320) and found shape (200, 320).

I encountered this ValueError too.

Environment:

  • tensorflow(1.6.0),
  • Python 3.6.4.

acadTags avatar acadTags commented on July 26, 2024

Exactly the same error here.

Environment:

  • tensorflow(1.6.0),
  • Python 3.6.2.

ghazi-f avatar ghazi-f commented on July 26, 2024

Based on my understanding of this issue on Stack Overflow, I modified the code so that the 'cell's passed to HANClassifierModel as sentence_cell and word_cell become functions that return a cell. I then call those functions when instantiating the bidirectional_rnn at the sentence level and the word level. The point is that each of the two cells needed for the two bidirectional RNNs has to be instantiated as a separate cell object.
So the changes are:

# Define the cell entries as functions
def cell_maker():
    # each layer needs its own cell instance in TF >= 1.2, so build the
    # list with a comprehension rather than [cell] * 5 (which would
    # repeat one object); swap in GRUCell(30) here to use GRU instead
    return MultiRNNCell([BNLSTMCell(80, is_training)  # h-h batchnorm LSTM
                         for _ in range(5)])

model = HANClassifierModel(
    vocab_size=vocab_size,
    embedding_size=200,
    classes=classes,
    word_cell=cell_maker,       # pass the function itself (without calling it)
    sentence_cell=cell_maker,   # pass the function itself (without calling it)
    word_output_size=100,
    sentence_output_size=100,
    device=args.device,
    learning_rate=args.lr,
    max_grad_norm=args.max_grad_norm,
    dropout_keep_proba=0.5,
    is_training=is_training,
)

then

word_encoder_output, _ = bidirectional_rnn(
    self.word_cell(), self.word_cell(),  # call the factory twice: two distinct cells
    word_level_inputs, word_level_lengths,
    scope=scope)

and

sentence_encoder_output, _ = bidirectional_rnn(
    self.sentence_cell(), self.sentence_cell(),  # call the factory twice: two distinct cells
    sentence_inputs, self.sentence_lengths,
    scope=scope)

It runs for me now, but I can't confirm the performance, since I don't have a GPU to run a complete train/test cycle. Can anyone try it?
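The core of the fix above is the factory pattern: every call produces a fresh object, so no two RNNs (or stacked layers) end up sharing one cell. A TF-free sketch of the difference between `[cell] * 5` and a factory (the names `Cell` and `make_stack` are illustrative, not from the repo):

```python
class Cell:
    """Placeholder for an RNN cell that owns its own variables."""


def make_stack(cell_fn, n):
    # safe: call the factory n times -> n independent cells
    return [cell_fn() for _ in range(n)]


shared = [Cell()] * 5          # one object referenced five times
fresh = make_stack(Cell, 5)    # five distinct objects

print(len({id(c) for c in shared}))  # 1
print(len({id(c) for c in fresh}))   # 5
```

With `shared`, every "layer" is the same object, which is exactly the kind of reuse that trips the variable-sharing check in newer TF versions.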

dugarsumit avatar dugarsumit commented on July 26, 2024

@Sora77 - How long did it take you to complete 1 epoch of training on CPU?
@Others - How long did it take you to train on CPU/GPU, and what hardware were you using?
For me it currently takes around 20 sec per iteration on a GeForce GTX 970 4GB, which feels pretty slow. I just wanted some idea of the speed gain I can expect with better hardware.

ghazi-f avatar ghazi-f commented on July 26, 2024

I don't have the logs anymore, but I remember it was more or less the same speed you're getting. Some implementations don't properly leverage GPU power.
I advise you to replace your tensorflow-gpu package with tensorflow and see for yourself. If it's the same, try to look for the bottleneck in the code.

dugarsumit avatar dugarsumit commented on July 26, 2024

Thanks @Sora77 for the quick response and your suggestions :)
