
attention-ocr-toy-example's Issues

Decoder input during the training phase

During training the code uses TrainingHelper, so the decoder's input is the target. This creates a problem: at test time there is no target, so the training and test conditions differ greatly. In practice I found that training accuracy is very high while test accuracy is very low. Could the decoder's own output be used as its input during training, instead of the target?
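
One common middle ground is scheduled sampling, which feeds the decoder its own predictions with some probability during training. A minimal sketch in TF 1.x, assuming hypothetical dec_embedding, train_inputs, and train_lengths tensors:

    # With probability `sampling_probability`, feed the decoder its own
    # sampled output instead of the ground-truth target token.
    helper = tf.contrib.seq2seq.ScheduledEmbeddingTrainingHelper(
        inputs=train_inputs,            # embedded target sequence
        sequence_length=train_lengths,  # per-example target lengths
        embedding=dec_embedding,        # used to embed sampled ids
        sampling_probability=0.25)      # 0.0 reproduces TrainingHelper

Ramping sampling_probability up over the course of training moves the decoder gradually from teacher forcing toward inference-time conditions.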

A question about data generation

When generating the attention data, I don't understand why train_output and target_output are defined the way they are. Also, why are Y and YY initialized to -2 at the start and then have 3 added at the end? Shouldn't an image's label simply be the digits that appear in the image? What are Y and YY for? I've only just started with attention, so I'm not very clear on this.
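
A plausible reading, offered only as an assumption about this repo: seq2seq targets usually reserve the lowest ids for special tokens, so raw digit labels get shifted upward. A sketch of that convention with hypothetical ids:

    # Hypothetical token layout; the actual ids in this repo may differ.
    PAD, GO, EOS = 0, 1, 2            # special tokens take ids 0..2
    OFFSET = 3                        # so digit d is stored as d + OFFSET
    label = [4, 0, 7]                 # digits shown in the image
    target_out = [d + OFFSET for d in label] + [EOS]  # what the decoder must emit
    train_in = [GO] + [d + OFFSET for d in label]     # what the decoder is fed

Under this reading, initializing Y and YY to -2 and later adding 3 maps the fill value onto one of the special ids; whether that is the exact intent would need checking against the generator code.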

About the CTC & attention model

Hi, @ray075hl

Have you compared the CTC and attention outputs of the joint CTC & attention model? In the related study, CTC was found to play an auxiliary role, but in my actual results the CTC output was better than the attention output. Do you have an opinion on this?

P.S. Related paper: Joint CTC-Attention Based End-to-End Speech Recognition Using Multi-task Learning
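
For reference, the cited paper's multi-task objective interpolates the two losses. A minimal sketch, assuming ctc_loss and att_loss are the two scalar losses the model already computes:

    # Joint objective: L = lambda * L_ctc + (1 - lambda) * L_attention
    lam = 0.2  # interpolation weight, a tunable hyperparameter
    joint_loss = lam * ctc_loss + (1.0 - lam) * att_loss

With a small lambda, CTC acts as the auxiliary regularizer described in the paper, which makes it notable when the CTC branch nonetheless decodes better.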

A question about the attention code

The attention mechanism should use the encoder's hidden states, but in the code enc_state is never returned, and the decoder is not initialized with enc_state either. Is this a bug?
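
For context, a minimal sketch of a standard encoder (assuming tf.nn.dynamic_rnn over hypothetical cnn_features): both return values are useful; the per-step outputs serve as the attention memory, and the final state can seed the decoder, as discussed in the next issue.

    # enc_outputs: [batch, time, units] -> attention `memory`
    # enc_state:   final hidden state   -> candidate decoder initial state
    enc_outputs, enc_state = tf.nn.dynamic_rnn(
        tf.nn.rnn_cell.LSTMCell(cfg.RNN_UNITS), cnn_features, dtype=tf.float32)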

CTC_model feature_length

Hello, why is feature_length in CTC_model fixed at 29? When I replace it with the actual length of the label sequence, I get an error instead. Also, should np.argmax(project_out, axis=1) on CTC_model's project_out give the predicted result? During training the loss drops quickly, but decoding that way, only two or three characters come out in the right order.
Looking forward to your explanation, thank you very much!
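
Two likely explanations, offered as assumptions since the thread does not confirm them: in CTC the sequence length passed to the loss is the number of feature frames the encoder emits (presumably 29 here, fixed by the CNN's output width), not the label length; and a plain argmax is not a valid CTC decode, since consecutive repeats must be collapsed and blanks removed first. A best-path decoding sketch, assuming project_out has shape [time, num_classes] with the blank as the last class:

    import numpy as np

    def ctc_best_path(project_out, blank):
        # Greedy (best-path) CTC decode: argmax per frame, collapse
        # consecutive repeats, then drop the blank label.
        path = np.argmax(project_out, axis=1)
        decoded, prev = [], blank
        for p in path:
            if p != blank and p != prev:
                decoded.append(p)
            prev = p
        return decoded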

Some questions about the code

What does memory mean in tf.contrib.seq2seq.LuongAttention(num_units=cfg.RNN_UNITS, memory=memory)? As far as I can tell, the code uses the encoder's outputs as the memory. Also, I changed the initial state in

    tf.contrib.seq2seq.BasicDecoder(
        cell=attn_cell, helper=helper,
        initial_state=attn_cell.zero_state(
            dtype=tf.float32,
            batch_size=cfg.BATCH_SIZE).clone(cell_state=enc_state[0]),
        output_layer=output_layer)

from zeros to the encoder's state, and the results got much better. Why is that?
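
A sketch of the usual wiring, assuming enc_outputs is the [batch, time, units] encoder output: memory is the set of encoder states that attention scores against at every decode step, while the encoder's final state is a natural starting point for the decoder's recurrent state, which is why the clone(cell_state=...) change helps.

    # `memory` = per-timestep encoder outputs the decoder attends over.
    attention_mechanism = tf.contrib.seq2seq.LuongAttention(
        num_units=cfg.RNN_UNITS, memory=enc_outputs)
    attn_cell = tf.contrib.seq2seq.AttentionWrapper(
        tf.nn.rnn_cell.LSTMCell(cfg.RNN_UNITS), attention_mechanism)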

I have a problem

Hi,
..
File "../model.py", line 228, in _att_decode
att_outputs, _. _ = tf.contrib.seq2seq.dynamic_decode( decoder=decoder, output_time_major=False, impute_finished=True, maximum_iterations=self.params.attention_iteration)
..
ValueError: Shape must be rank 1 but is rank 2 for '..../while/BasicDecoderStep/decoder/attention_wrapper/concat' (op: 'ConcatV2') with input shapes: [64], [64,256], [].

Do you know what causes this problem?
Thanks.
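
A hedged guess at the cause, not confirmed in the thread: AttentionWrapper concatenates each per-step decoder input with the [64, 256] attention context, so a rank-1 [64] input suggests raw integer ids reached the helper instead of embedded vectors. A sketch of the fix, with hypothetical target_ids, target_lengths, vocab_size, and embed_dim:

    # Embed the ids first so each decoder step receives a [batch, embed_dim]
    # input that can be concatenated with the attention context.
    dec_embedding = tf.get_variable('dec_embedding', [vocab_size, embed_dim])
    dec_inputs = tf.nn.embedding_lookup(dec_embedding, target_ids)
    helper = tf.contrib.seq2seq.TrainingHelper(dec_inputs, target_lengths)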
