elmo-chinese's Introduction

ELMo-chinese

Deep contextualized word representations for Chinese.

Note: this repository only outputs context-independent word embeddings.

Dependencies

  • python3
  • tensorflow >= 1.10
  • jieba

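Usage

  1. Prepare the data, following the examples in the data and vocab directories. pre_data/vocab.py can be used to build the vocabulary. (Keep each data file reasonably small, otherwise memory will run out.)
  2. Train the model: train_elmo.py
  3. Dump the model weights: dump_weights.py
  4. In options.json, change 261 to 262
  5. Write the word embeddings to an hdf5 file: usage_token.py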
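
Step 4 can also be scripted instead of edited by hand. The following is a minimal sketch, assuming the value sits under `char_cnn.n_characters` as it does in upstream bilm-tf (where 261 is used for training and 262 afterwards, because inference reserves one extra id for padding). The helper name `set_n_characters` is hypothetical and not part of this repository.

```python
import json


def set_n_characters(options_path, new_value=262):
    """Update char_cnn.n_characters in an options.json file.

    Returns the previous value so the caller can confirm it was 261.
    Assumption: the key layout matches upstream bilm-tf.
    """
    with open(options_path, encoding="utf-8") as f:
        options = json.load(f)

    old_value = options["char_cnn"]["n_characters"]
    options["char_cnn"]["n_characters"] = new_value

    # Write the file back in place with readable indentation.
    with open(options_path, "w", encoding="utf-8") as f:
        json.dump(options, f, indent=2)
    return old_value


# Example call (the path is a placeholder):
# set_n_characters("options.json")
```

Returning the old value makes it easy to notice when the file had already been changed or held something other than 261.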

Results

The embeddings look reasonable in a visualization tool, and they improve AUC by 1-2 on a textmatch task.

License

MIT

elmo-chinese's People

Contributors

guotong1988, yorkie

elmo-chinese's Issues

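Why are Chinese characters also encoded into 255 ids?

In the _convert_word_to_char_ids function of the UnicodeCharsVocabulary class, your comment says that Chinese can also be mapped to 255 ids. But there are clearly far more than 255 Chinese characters, so isn't decoding with UTF into ids inappropriate?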
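
For context, here is a small sketch of what byte-level encoding does, assuming this repository keeps the upstream bilm-tf approach of encoding each word as UTF-8 and using the byte values as ids. Under that assumption the ids are bytes rather than characters, so 256 values are enough, and each common Chinese character simply occupies three ids. The function name `word_to_byte_ids` and the default `max_word_length=50` are illustrative only.

```python
def word_to_byte_ids(word, max_word_length=50):
    """Return the UTF-8 byte values of a word as a list of ints in 0..255.

    Two positions are left free for begin-of-word and end-of-word
    markers, which is why the slice stops at max_word_length - 2.
    """
    encoded = word.encode("utf-8", "ignore")[: max_word_length - 2]
    return list(encoded)


print(word_to_byte_ids("a"))     # [97]
print(word_to_byte_ids("中"))    # [228, 184, 173]
print(word_to_byte_ids("中文"))  # [228, 184, 173, 230, 150, 135]
```

One consequence of this scheme is that a Chinese word uses up the per-word length budget about three times faster than an ASCII word of the same number of characters.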

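Question about usage

Hello!
Why does the 261 in options.json need to be changed to 262? As far as I can tell, step 5 (writing the word embeddings to an hdf5 file with usage_token.py) does not use options.json at all.
Many thanks!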

OOM occurs even though there is enough GPU memory

  76, in <module>
      main(args)
    File "train_elmo.py", line 66, in main
      train(options, data, n_gpus, tf_save_dir, tf_log_dir)
    File "/data/sde/jiaxin_hu/git_project/ELMo-chinese/bilm/training.py", line 766, in train
      allow_soft_placement=True)) as sess:
    File "/data/sde/jiaxin_hu/git_project/ELMo-chinese/bin/testenv/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1494, in __init__
      super(Session, self).__init__(target, graph, config=config)
    File "/data/sde/jiaxin_hu/git_project/ELMo-chinese/bin/testenv/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 626, in __init__
      self._session = tf_session.TF_NewSession(self._graph._c_graph, opts)
  tensorflow.python.framework.errors_impl.InternalError: Failed to create session.

Bug in get_batch

I get the following error when running ELMo on the Chinese dataset you uploaded:

Traceback (most recent call last):
  File "train_elmo.py", line 72, in <module>
    main(args)
  File "train_elmo.py", line 62, in main
    train(options, data, n_gpus, tf_save_dir, tf_log_dir)
  File "/mnt/disk2/data/wp/Word2vec/model/ELMo-chinese/bin/bilm/training.py", line 838, in train
    for batch_no, batch in enumerate(data_gen, start=1):
  File "/mnt/disk2/data/wp/Word2vec/model/ELMo-chinese/bin/bilm/data.py", line 469, in iter_batches
    num_steps, max_word_length)
  File "/mnt/disk2/data/wp/Word2vec/model/ELMo-chinese/bin/bilm/data.py", line 311, in _get_batch
    :how_many]
ValueError: could not broadcast input array from shape (18,50) into shape (19,50)

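I eventually traced it to this part of the _get_batch function in bilm/data.py:

  inputs[i, cur_pos:next_pos] = cur_stream[i][0][:how_many]
  if max_word_length is not None:
      char_inputs[i, cur_pos:next_pos] = cur_stream[i][1][:how_many]

This is the segment that raises the error.
I could not follow the logic of _get_batch well enough to fix it myself. Could you give me some pointers? Thank you.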
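
The shapes in the error, (18, 50) against (19, 50), suggest that for at least one sentence the character-id array has one row fewer than the word-id array, since the word-level assignment on the previous line succeeded with the same `how_many`. That is an inference from the traceback, not a confirmed diagnosis. A quick way to test it is to compare the two lengths for every encoded sentence before batching. The sketch below assumes each sentence is available as a `(word_ids, char_ids)` pair, as `cur_stream[i]` appears to be; `find_length_mismatches` is a hypothetical helper.

```python
def find_length_mismatches(encoded_sentences):
    """Return (index, n_word_ids, n_char_rows) for every sentence whose
    word-id sequence and char-id sequence differ in length."""
    mismatches = []
    for index, (word_ids, char_ids) in enumerate(encoded_sentences):
        if len(word_ids) != len(char_ids):
            mismatches.append((index, len(word_ids), len(char_ids)))
    return mismatches


# Toy data: the second sentence has 3 word ids but only 2 rows of char ids.
toy = [
    ([1, 5, 2], [[256], [97], [257]]),
    ([1, 7, 2], [[256], [257]]),
]
print(find_length_mismatches(toy))  # [(1, 3, 2)]
```

If the check reports any sentence, the word-level and character-level tokenization of that line disagree, and inspecting the raw text of that line is a reasonable next step.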
