
huawei-noah / pretrained-language-model

3.0K stars · 56 watchers · 622 forks · 29.69 MB

Pretrained language models and related optimization techniques developed by Huawei Noah's Ark Lab.

Languages: Dockerfile 0.05% · Python 96.19% · Shell 3.22% · C++ 0.52% · Cython 0.02%
Topics: knowledge-distillation, model-compression, quantization, pretrained-models, large-scale-distributed

pretrained-language-model's People

Contributors

dependabot[bot], ghaddarabs, gowtham1997, itachiuchihavictor, itsucks, jacobrxz, jiaxin-wen, jingmu123, jxfeb, leoeaton, mengxj08, mifei, sirily, xuqiongkai, zbravo, zwjyyc, zyy-g


pretrained-language-model's Issues

In TinyBERT, how are the per-layer weights in the task_distill loss function initialized?

Hi, in Eq. (6) of the paper, each layer's loss is multiplied by the corresponding per-layer hyper-parameter λ_m before the losses are summed, i.e. roughly L = Σ_m λ_m · L_layer(S_m, T_g(m)). My understanding is that λ_m is an initialized value between 0 and 1, and that all the λ sum to 1.
1. Is that actually how the experiments implement it? I did not find any λ-like variable in task_distill.py; it seems the per-layer losses are simply added up?
2. If such a λ_m really exists in the implementation, how was it initialized?
3. Does the initialization seed have a large effect on the final converged values of λ?

Thanks for your answer!
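
For reference, a minimal sketch of the weighting being asked about, in PyTorch; the function and variable names are hypothetical, and passing no weights reproduces the plain sum that task_distill.py appears to use:

import torch.nn.functional as F

def weighted_layer_loss(student_states, teacher_states, lambdas=None):
    # student_states / teacher_states: matched per-layer tensors of
    # shape (batch, seq_len, hidden).
    # lambdas: optional per-layer weights lambda_m; None means every
    # layer gets weight 1.0, i.e. the losses are simply summed.
    if lambdas is None:
        lambdas = [1.0] * len(student_states)
    total = 0.0
    for lam, s, t in zip(lambdas, student_states, teacher_states):
        total = total + lam * F.mse_loss(s, t)
    return total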

Release other variants of general TinyBERT

Hi,

I found in the paper that you experimented with four different variants of TinyBERT (with different numbers of layers and hidden dimensions), of which two have released general distilled models. Would it be possible to release general TinyBERT models for the other two variants (4-layer-768-dim and 6-layer-312-dim)?

Thanks

[TinyBERT] Error running task_distill during task-specific distillation for a Chinese task

The error happens during task-specific distillation; the traceback is at the end. The fine-tuned BERT model was generated with the transformers package from the bert-base-chinese model included in that package.

Is this because the released TinyBERT model was trained on a corpus without Chinese?

The fine-tuning command using transformers is as follows:

python run_glue.py --model_type bert \
  --model_name_or_path bert-base-chinese \
  --task_name sst-2 \
  --do_train \
  --do_eval \
  --do_lower_case \
  --data_dir /home/vigosser/nvidia/bert/data/final \
  --max_seq_length 128 \
  --per_gpu_train_batch_size 8 \
  --learning_rate 15e-6 \
  --num_train_epochs 3.0 \
  --output_dir /home/vigosser/TinyBERT/FT_bert

Traceback

Traceback (most recent call last):
  File "C:\Program Files\JetBrains\PyCharm 2019.2.2\helpers\pydev\pydevd.py", line 2066, in <module>
    main()
  File "C:\Program Files\JetBrains\PyCharm 2019.2.2\helpers\pydev\pydevd.py", line 2060, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "C:\Program Files\JetBrains\PyCharm 2019.2.2\helpers\pydev\pydevd.py", line 1411, in run
    return self._exec(is_module, entry_point_fn, module_name, file, globals, locals)
  File "C:\Program Files\JetBrains\PyCharm 2019.2.2\helpers\pydev\pydevd.py", line 1418, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "C:\Program Files\JetBrains\PyCharm 2019.2.2\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "D:/github/TinyBERT/task_distill.py", line 1154, in <module>
    main()
  File "D:/github/TinyBERT/task_distill.py", line 1013, in main
    teacher_logits, teacher_atts, teacher_reps = teacher_model(input_ids, segment_ids, input_mask)
  File "C:\Users\vigosser\Anaconda3\envs\vai\lib\site-packages\torch\nn\modules\module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "D:\github\TinyBERT\transformer\modeling.py", line 1133, in forward
    output_all_encoded_layers=True, output_att=True)
  File "C:\Users\vigosser\Anaconda3\envs\vai\lib\site-packages\torch\nn\modules\module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "D:\github\TinyBERT\transformer\modeling.py", line 832, in forward
    embedding_output = self.embeddings(input_ids, token_type_ids)
  File "C:\Users\vigosser\Anaconda3\envs\vai\lib\site-packages\torch\nn\modules\module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "D:\github\TinyBERT\transformer\modeling.py", line 357, in forward
    words_embeddings = self.word_embeddings(input_ids)
  File "C:\Users\vigosser\Anaconda3\envs\vai\lib\site-packages\torch\nn\modules\module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\vigosser\Anaconda3\envs\vai\lib\site-packages\torch\nn\modules\sparse.py", line 114, in forward
    self.norm_type, self.scale_grad_by_freq, self.sparse)
  File "C:\Users\vigosser\Anaconda3\envs\vai\lib\site-packages\torch\nn\functional.py", line 1484, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: index out of range: Tried to access index 21397 out of table with 21127 rows. at C:\w\1\s\tmp_conda_3.7_112106\conda\conda-bld\pytorch_1572952932150\work\aten\src\TH/generic/THTensorEvenMoreMath.cpp:418

load_tf_weights_in_bert fails during step 1 of Task-specific Distillation with the problem below

Initialize PyTorch weight ['bert', 'encoder', 'layer_9', 'attention', 'self', 'key', 'bias']
Initialize PyTorch weight ['bert', 'encoder', 'layer_9', 'attention', 'self', 'key', 'kernel']
Initialize PyTorch weight ['bert', 'encoder', 'layer_9', 'attention', 'self', 'query', 'bias']
Initialize PyTorch weight ['bert', 'encoder', 'layer_9', 'attention', 'self', 'query', 'kernel']
Initialize PyTorch weight ['bert', 'encoder', 'layer_9', 'attention', 'self', 'value', 'bias']
Initialize PyTorch weight ['bert', 'encoder', 'layer_9', 'attention', 'self', 'value', 'kernel']
Initialize PyTorch weight ['bert', 'encoder', 'layer_9', 'intermediate', 'dense', 'bias']
Initialize PyTorch weight ['bert', 'encoder', 'layer_9', 'intermediate', 'dense', 'kernel']
Initialize PyTorch weight ['bert', 'encoder', 'layer_9', 'output', 'LayerNorm', 'beta']
Initialize PyTorch weight ['bert', 'encoder', 'layer_9', 'output', 'LayerNorm', 'gamma']
Initialize PyTorch weight ['bert', 'encoder', 'layer_9', 'output', 'dense', 'bias']
Initialize PyTorch weight ['bert', 'encoder', 'layer_9', 'output', 'dense', 'kernel']
Initialize PyTorch weight ['bert', 'pooler', 'dense', 'bias']
Initialize PyTorch weight ['bert', 'pooler', 'dense', 'kernel']
Skipping cls/predictions/output_bias
Skipping cls/predictions/output_bias
Skipping cls/predictions/output_bias
Traceback (most recent call last):
  File "task_distill.py", line 1162, in <module>
    main()
  File "task_distill.py", line 927, in main
    teacher_model = TinyBertForSequenceClassification.from_pretrained(args.teacher_model, num_labels=num_labels, from_tf=True)
  File "/mnt/disk0/home/xx/project/demo/tinybert/TinyBERT/transformer/modeling.py", line 706, in from_pretrained
    return load_tf_weights_in_bert(model, weights_path)
  File "/mnt/disk0/home/xx/project/demo/tinybert/TinyBERT/transformer/modeling.py", line 119, in load_tf_weights_in_bert
    assert pointer.shape == array.shape
  File "/home/xx/install/anaconda3/envs/torch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 576, in __getattr__
    type(self).__name__, name))
AttributeError: 'TinyBertForSequenceClassification' object has no attribute 'shape'

Some errors in "run_classifier_ner.py"?

The indentation of lines 1105 to 1115 in run_classifier_ner.py looks wrong, doesn't it? Shouldn't that block be inside the else branch at line 1076?

result = estimator.evaluate(input_fn=predict_input_fn, checkpoint_path=FLAGS.init_checkpoint)
# predict
output_predict_file = os.path.join(FLAGS.output_dir, "test_results.tsv")
eval_re_path = os.path.join(FLAGS.output_dir, "eval")
if not os.path.exists(eval_re_path):
    os.mkdir(eval_re_path)
output_test_file = os.path.join(eval_re_path, "test_results.txt")
with tf.gfile.GFile(output_test_file, "w") as writer:
    tf.logging.info("***** Test results *****")
    for key in sorted(result.keys()):
        tf.logging.info("  %s = %s", key, str(result[key]))
        writer.write("%s = %s\n" % (key, str(result[key])))

Question about TinyBERT Data Augmentation ${GLOVE_EMB}$

Hi, all

In the Data Augmentation part I saw "--glove_embs ${GLOVE_EMB}$", and I am wondering what I should use to replace "${GLOVE_EMB}$".

From the code in data_augmentation.py, I noticed it refers to the GloVe embedding file, so presumably we should replace "${GLOVE_EMB}$" with the location of that file.

May I know where we can get the GloVe embedding file? Could you provide a link?
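
For anyone who lands here: GloVe vectors can be downloaded from https://nlp.stanford.edu/projects/glove/, and the file is plain text with one token and its vector per line. A minimal loading sketch, assuming that standard format (the file name below is a hypothetical example):

import numpy as np

def load_glove(path):
    # Each line is "<token> <v1> <v2> ... <vD>" in plain text.
    embs = {}
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            embs[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return embs

# glove_embs = load_glove("glove.42B.300d.txt")  # hypothetical local path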

Chinese distilled model

  1. Does the team have any recent plans to release a Chinese version of the model?
  2. Has anyone distilled a Chinese model themselves? How well does the distilled model perform on downstream tasks?

Multi-language support?

I was wondering whether it is possible to use this model for other languages (e.g. French)?

I checked vocab.txt; presumably it is English-only for now?

TypeError: string indices must be integers when run_classifier.py writes test_results.tsv

While running a text classification task, I wanted to print out the test results, so I uncommented the final block of run_classifier.py and ran it, which failed with: probabilities = prediction["probabilities"] TypeError: string indices must be integers.
It turned out I had confused it with the result used to print "test_results.txt": the line

result = estimator.evaluate(input_fn=predict_input_fn, checkpoint_path=FLAGS.init_checkpoint)

is the model-evaluation result, while printing per-example predictions should use estimator.predict:

result1 = estimator.predict(input_fn=predict_input_fn)
for (i, prediction) in enumerate(result1):
    probabilities = prediction["probabilities"]

With that change, test_results.tsv ends up under the task directory in output.
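
To make this concrete, a minimal sketch of writing test_results.tsv via estimator.predict, loosely following the commented-out block mentioned above; the helper name and paths are hypothetical:

import os
import tensorflow as tf

def write_test_results(estimator, predict_input_fn, output_dir):
    # estimator.predict yields one dict per example; "probabilities"
    # is the key produced by the classification model_fn.
    output_file = os.path.join(output_dir, "test_results.tsv")
    with tf.gfile.GFile(output_file, "w") as writer:
        for prediction in estimator.predict(input_fn=predict_input_fn):
            probabilities = prediction["probabilities"]
            writer.write("\t".join(str(p) for p in probabilities) + "\n")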

TinyBERT distillation on downstream tasks: why 2 steps?

From the paper:

In our experiments, we firstly perform intermediate layer distillation (M ≥ m ≥ 0), then perform
the prediction-layer distillation (m = M + 1).


Why is it necessary to perform the downstream-task distillation in 2 separate steps?

Is it possible to distill on the downstream task in one single step, using both losses at the same time, as in the sketch below?
Or would that make it more difficult for the model to converge?
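
For concreteness, a minimal sketch of the single-step combination being asked about, assuming the per-batch loss terms are already computed; alpha is a hypothetical trade-off weight, not something from the paper:

import torch

def combined_distill_loss(layer_losses, pred_loss, alpha=1.0):
    # layer_losses: list of intermediate-layer loss tensors
    # (attention and hidden-state MSEs); pred_loss: the
    # prediction-layer loss. The paper instead minimizes the two
    # groups in separate sequential phases.
    return alpha * torch.stack(layer_losses).sum() + pred_loss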

inference time

In the paper, the inference time is given in seconds (s); it surely isn't tens of seconds, shouldn't it be milliseconds (ms)?

Two questions about TinyBERT pre-training distillation

  1. The pre-training distillation seems to have only the attention and encoder-layer losses, with no masked-LM loss?
  2. Without a masked-LM loss, how do you directly evaluate the quality of the distilled small model?

TinyBERT general model with `cased`

Hello,

Have you done general distillation using the bert-base-cased model?
And would you have the General_TinyBERT_v2 (4layer-312dim) cased model available?

When trying python3 task_distill.py --teacher_model $FT_BERT_BASE_DIR --student_model $GENERAL_TINYBERT_DIR ... on a fine-tuned model that is 'bert-cased',
a CUDA error is thrown.

I would like to invite you to test NEZHA or TinyBERT on CLUE Benchmark!

Hi! Thank you for your great contribution. I am a founding member of CLUE (Language Understanding Evaluation benchmark for Chinese), a group aimed at promoting the development of Chinese language models.
Recently, we opened a leaderboard that includes 10 different and varied Chinese datasets. We hope you can test NEZHA or TinyBERT on these tasks on our platform and promote the development of Chinese language models together :)
Thank you again!

Our Github: https://github.com/CLUEbenchmark/CLUE
Leaderboard system: https://www.cluebenchmarks.com/

Vocabulary sizes do not match

In the Task-specific Distillation stage, the teacher is a fine-tuned BERT base and the student is General_TinyBERT; both are derived from BERT base. The BERT base vocabulary size is 21128, so why does the downloaded General_TinyBERT have a 30522-word vocabulary? How can the two be aligned?
In the task-specific distillation stage, the student's vocabulary is larger, so feeding its input to the teacher causes an out-of-range index.

student_logits, student_atts, student_reps = student_model(input_ids, segment_ids, input_mask, is_student=True)
teacher_logits, teacher_atts, teacher_reps = teacher_model(input_ids, segment_ids, input_mask)

The role of fit_size when the teacher's and the student's hidden_size differ

Suppose the teacher's and the student's hidden sizes are d and d', respectively.
When d is not equal to d', the student's fit_dense layer maps d' to the same dimension as d, so the hidden-state loss between student and teacher can be computed.
But when d and d' are equal, the hidden-state loss could be computed directly, without the fit_dense mapping. Yet the code gates this with an "if is_student" check; shouldn't it actually check whether d equals d'? (See the sketch below.)
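
As a reference for this discussion, a minimal sketch of such a projection in PyTorch; the class name is hypothetical, and the dimension check stands in for the is_student flag mentioned above:

import torch.nn as nn
import torch.nn.functional as F

class FitDense(nn.Module):
    # Projects student hidden states (width d') to the teacher
    # width (d) so the hidden-state MSE is well defined.
    def __init__(self, student_hidden, teacher_hidden):
        super().__init__()
        self.proj = nn.Linear(student_hidden, teacher_hidden)

    def hidden_loss(self, student_rep, teacher_rep):
        if student_rep.size(-1) != teacher_rep.size(-1):  # only when d' != d
            student_rep = self.proj(student_rep)
        return F.mse_loss(student_rep, teacher_rep)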

Error loading the TensorFlow fine-tuned BERT parameters when running task_distill

The load_tf_weights_in_bert method in modeling.py raises an error; the message is as follows:

Traceback (most recent call last):
  File "E:/TinyBERT/task_distill_training.py", line 1123, in <module>
    main()
  File "E:/TinyBERT/task_distill_training.py", line 888, in main
    teacher_model = TinyBertForSequenceClassification.from_pretrained(args.teacher_model, num_labels=num_labels)
  File "E:\TinyBERT\transformer\modeling.py", line 702, in from_pretrained
    return load_tf_weights_in_bert(model, weights_path)
  File "E:\TinyBERT\transformer\modeling.py", line 112, in load_tf_weights_in_bert
    pointer = pointer[num]
  File "C:\Users\ThinkPad\Anaconda3\envs\python36\lib\site-packages\torch\nn\modules\container.py", line 137, in __getitem__
    return self._modules[self._get_abs_string_index(idx)]
  File "C:\Users\ThinkPad\Anaconda3\envs\python36\lib\site-packages\torch\nn\modules\container.py", line 128, in _get_abs_string_index
    raise IndexError('index {} is out of range'.format(idx))
IndexError: index 10 is out of range

I first suspected the Python version, but I have tried both Python 3.7 and 3.6 with the same error, and torch is 1.0.1 in both cases. I also tried loading only the original, not fine-tuned, BERT parameters, but I get the same error.

Is there a plan to release a TF version of TinyBERT?

Hi!
Thanks for the great work! I'm looking for a pretrained model like BERT but with faster inference. So I wonder whether the PyTorch TinyBERT model can be used with NEZHA, or whether there is a plan to release a TF version of TinyBERT?
Thank you.

TinyBERT: why use several linear layers / activations to get the logits?

In TinyBERT, you get the prediction logits by applying a ReLU activation followed by a linear layer:

logits = self.classifier(torch.relu(pooled_output))

But why is this necessary?

Because pooled_output is already the output of a linear layer followed by an activation function:

pooled_output = self.dense(pooled_output)
pooled_output = self.activation(pooled_output)
return pooled_output

Questions about TinyBERT

After reading the TinyBERT paper, I would like to ask the following:
(1) In the pre-training distillation stage, is the student TinyBERT distilled while the teacher BERT is being pre-trained, e.g. once per epoch or on some other schedule? From the schematic figure (omitted here) I initially thought distillation happens at the same time as pre-training.
Or is it that, after BERT is pre-trained, the teacher BERT is fixed and the same pre-training corpus is fed to both the teacher BERT and the TinyBERT being distilled, optimizing the objectives one by one?
(2) The paper does not seem to report the resource consumption of the pre-training and fine-tuning stages, e.g. how much time the two stages took in total?
Thanks!

TinyBERT prediction-layer distillation loss: why not KLDiv?

For the prediction-layer distillation loss, you used the soft cross-entropy loss.

However, other distillation setups (DistilBERT, BERT-PKD) use a KLDiv loss for the prediction layer.


What is the reason behind this choice?
Did you try KLDiv and it gave worse results? Was it an empirical choice?
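
One thing worth noting when comparing the two: for a fixed teacher, KL(p_T || p_S) = H(p_T, p_S) - H(p_T), and H(p_T) is constant with respect to the student, so the two losses yield the same gradients. A minimal sketch in PyTorch (temperature omitted for brevity; function names are hypothetical):

import torch.nn.functional as F

def soft_cross_entropy(student_logits, teacher_logits):
    # H(p_T, p_S): soft cross-entropy between teacher and student.
    p_t = F.softmax(teacher_logits, dim=-1)
    return -(p_t * F.log_softmax(student_logits, dim=-1)).sum(dim=-1).mean()

def kl_distill(student_logits, teacher_logits):
    # KL(p_T || p_S) = H(p_T, p_S) - H(p_T); same student gradients.
    p_t = F.softmax(teacher_logits, dim=-1)
    return F.kl_div(F.log_softmax(student_logits, dim=-1), p_t,
                    reduction="batchmean")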

Data Augmentation

In the Data Augmentation phase, pretrained_bert_model is General_TinyBERT in data_augmentation.py, but the description calls it the "pre-trained language model BERT".

Question about the effectiveness of distillation

The original teacher network reaches roughly 93% accuracy after fine-tuning. I fed a large amount of unlabeled data through the teacher network to obtain pseudo-labeled data and used it to train a four-layer BERT as the student network, in two settings:

(1) without the intermediate-layer losses (attention, embedding, encoder, etc.), using only the student's hard-label loss: 89% accuracy;
(2) with intermediate-layer distillation losses added: 90% accuracy.
Does this mean the intermediate-layer losses have only a small influence on the student network's final accuracy? Have you measured how much adding each loss term affects TinyBERT's distillation accuracy? Or is something wrong with my distillation?

INFO:tensorflow:Error recorded from training_loop: Unsuccessful TensorSliceReader constructor: Failed to find any matching files for ../nezha/model.ckpt

The problem occurs when running the Peoples-daily-NER task: INFO:tensorflow:Error recorded from training_loop: Unsuccessful TensorSliceReader constructor: Failed to find any matching files for ../nezha/model.ckpt

(tensorflow-gpu2) [wgpu@localhost scripts]$ sh run_seq_labelling.sh
/home/wgpu/.conda/envs/tensorflow-gpu2/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
[the same numpy FutureWarning repeats for quint8, qint16, quint16, qint32 and np_resource]
INFO:tensorflow:***********label_list of this task is ['O', 'B-PER', 'I-PER', 'B-ORG', 'I-ORG', 'B-LOC', 'I-LOC', 'X', '[CLS]', '[SEP]']
WARNING:tensorflow:Estimator's model_fn (<function model_fn_builder.<locals>.model_fn at 0x7f21fb2a8ea0>) includes params argument, but params are not passed to Estimator.
INFO:tensorflow:Using config: {'_model_dir': '../output/peoples-daily-ner/', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 100, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true graph_options { rewrite_options { meta_optimizer_iterations: ONE } } , '_keep_checkpoint_max': 0, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': None, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f21fac42160>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1, '_tpu_config': TPUConfig(iterations_per_loop=1000, num_shards=8, num_cores_per_replica=None, per_host_input_for_training=3, tpu_job_name=None, initial_infeed_sleep_secs=None, input_partition_dims=None), '_cluster': None}
INFO:tensorflow:_TPUContext: eval_on_tpu True
WARNING:tensorflow:eval_on_tpu ignored because use_tpu is False.
INFO:tensorflow:Writing example 0 of 230
[per-example feature dumps (tokens, input_ids, input_mask, segment_ids, label_ids) for train-0 .. train-4 omitted]
INFO:tensorflow:***** Running training *****
INFO:tensorflow:  Num examples = 230
INFO:tensorflow:  Batch size = 16
INFO:tensorflow:  Num steps = 143
INFO:tensorflow:Writing example 0 of 50
[per-example feature dumps for dev-0 .. dev-4 omitted]
INFO:tensorflow:***** Running evaluation *****
INFO:tensorflow:  Num examples = 50
INFO:tensorflow:  Batch size = 16
INFO:tensorflow:Not using Distribute Coordinator.
INFO:tensorflow:Running training and evaluation locally (non-distributed).
INFO:tensorflow:Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps 100 or save_checkpoints_secs None.
[tensorflow deprecation warnings (colocate_with, map_and_batch, to_int32, dropout keep_prob, dense) omitted]
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Running train on CPU
INFO:tensorflow:*** Features ***
INFO:tensorflow:  name = input_ids, shape = (16, 256)
INFO:tensorflow:  name = input_mask, shape = (16, 256)
INFO:tensorflow:  name = label_ids, shape = (16, 256)
INFO:tensorflow:  name = segment_ids, shape = (16, 256)
INFO:tensorflow:use_relative_position: True
INFO:tensorflow:Error recorded from training_loop: Unsuccessful TensorSliceReader constructor: Failed to find any matching files for ../nezha/model.ckpt
INFO:tensorflow:training_loop marked as finished
WARNING:tensorflow:Reraising captured error
Traceback (most recent call last):
  File "../run_classifier_ner.py", line 1124, in <module>
    tf.app.run()
  File "/home/wgpu/.conda/envs/tensorflow-gpu2/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "../run_classifier_ner.py", line 1035, in main
    tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
  File "/home/wgpu/.conda/envs/tensorflow-gpu2/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/training.py", line 471, in train_and_evaluate
    return executor.run()
  File "/home/wgpu/.conda/envs/tensorflow-gpu2/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/training.py", line 611, in run
    return self.run_local()
  File "/home/wgpu/.conda/envs/tensorflow-gpu2/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/training.py", line 712, in run_local
    saving_listeners=saving_listeners)
  File "/home/wgpu/.conda/envs/tensorflow-gpu2/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2457, in train
    rendezvous.raise_errors()
  File "/home/wgpu/.conda/envs/tensorflow-gpu2/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/error_handling.py", line 128, in raise_errors
    six.reraise(typ, value, traceback)
  File "/home/wgpu/.conda/envs/tensorflow-gpu2/lib/python3.6/site-packages/six.py", line 696, in reraise
    raise value
  File "/home/wgpu/.conda/envs/tensorflow-gpu2/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2452, in train
    saving_listeners=saving_listeners)
  File "/home/wgpu/.conda/envs/tensorflow-gpu2/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 358, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/home/wgpu/.conda/envs/tensorflow-gpu2/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1124, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "/home/wgpu/.conda/envs/tensorflow-gpu2/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1154, in _train_model_default
    features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
  File "/home/wgpu/.conda/envs/tensorflow-gpu2/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2251, in _call_model_fn
    config)
  File "/home/wgpu/.conda/envs/tensorflow-gpu2/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1112, in _call_model_fn
    model_fn_results = self._model_fn(features=features, **kwargs)
  File "/home/wgpu/.conda/envs/tensorflow-gpu2/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2534, in _model_fn
    features, labels, is_export_mode=is_export_mode)
  File "/home/wgpu/.conda/envs/tensorflow-gpu2/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 1323, in call_without_tpu
    return self._call_model_fn(features, labels, is_export_mode=is_export_mode)
  File "/home/wgpu/.conda/envs/tensorflow-gpu2/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 1593, in _call_model_fn
    estimator_spec = self._model_fn(features=features, **kwargs)
  File "../run_classifier_ner.py", line 705, in model_fn
    ) = modeling.get_assignment_map_from_checkpoint(tvars, init_checkpoint)
  File "/home/wgpu/deep/Pretrained-Language-Model/NEZHA/modeling.py", line 336, in get_assignment_map_from_checkpoint
    init_vars = tf.train.list_variables(init_checkpoint)
  File "/home/wgpu/.conda/envs/tensorflow-gpu2/lib/python3.6/site-packages/tensorflow/python/training/checkpoint_utils.py", line 95, in list_variables
    reader = load_checkpoint(ckpt_dir_or_file)
  File "/home/wgpu/.conda/envs/tensorflow-gpu2/lib/python3.6/site-packages/tensorflow/python/training/checkpoint_utils.py", line 64, in load_checkpoint
    return pywrap_tensorflow.NewCheckpointReader(filename)
  File "/home/wgpu/.conda/envs/tensorflow-gpu2/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 326, in NewCheckpointReader
    return CheckpointReader(compat.as_bytes(filepattern), status)
  File "/home/wgpu/.conda/envs/tensorflow-gpu2/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 528, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: Unsuccessful TensorSliceReader constructor: Failed to find any matching files for ../nezha/model.ckpt
[the numpy FutureWarning lines repeat on the follow-up run of read_tf_events.py]
Traceback (most recent call last):
  File "../read_tf_events.py", line 24, in <module>
    events_name_list = os.listdir(os.path.join(args.task_output_dir, "eval"))
FileNotFoundError: [Errno 2] No such file or directory: '../output/peoples-daily-ner/eval'

Chinese TinyBERT

Is the currently released BERT English-only? When will the Chinese version be released?

Isn't TinyBERT equivalent to copying the teacher BERT's attention/hidden weights?

Looking at Eqs. 7-9 in the paper (https://arxiv.org/pdf/1909.10351.pdf), and assuming the student and teacher models have the same dimensionality (i.e. d = d'), how is TinyBERT any different from (better than) initializing a 4-layer BERT model with the hidden, embedding, and attention weights of the corresponding teacher layers? Since Eqs. 7-9 use an MSE loss, the minimum of that loss is achieved (assuming d = d', M < N) when A_i^S = A_i^T, H^S = H^T, and E^S = E^T. So what is the advantage of using TinyBERT for general distillation (GD), where you don't do the prediction-layer Hinton distillation, over simply cloning the selected teacher layers onto the student? Maybe you can clarify?
