cooelf / awesomemrc
IJCAI 2021 Tutorial & code for Retrospective Reader for Machine Reading Comprehension (AAAI 2021)
Home Page: https://arxiv.org/abs/2005.06249
Has anyone run sh_albert_cls.sh and gotten a strange score that stays at 0.499 without changing?
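What environment setup and preparation are needed to run the three sh scripts?
Thank you very much for open-sourcing the code for the paper! I ran into some problems while reproducing it and would be grateful for your help.
Following the instructions in transformers, I installed transformers v2.3.0 from source, along with the dependencies required by examples.
However, when I ran ./sh_albert_cls.sh, I got the following error: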
Traceback (most recent call last):
File "./examples/run_cls.py", line 645, in <module>
main()
File "./examples/run_cls.py", line 498, in main
raise ValueError("Task not found: %s" % (args.task_name))
ValueError: Task not found: squad
I found that sh_albert_cls.sh sets export TASK_NAME=squad and that python ./examples/run_cls.py is invoked with the argument --task_name $TASK_NAME.
But when I ran python ./examples/run_cls.py -h, I got the following output:
--task_name TASK_NAME
The name of the task to train selected in the list: cola, mnli, mnli-mm, mrpc, sst-2, sts-b, qqp, qnli, rte, wnli
So it seems that python ./examples/run_cls.py does not support --task_name squad as an argument?
I am not sure at which step I went wrong. Thanks in advance for any reply!
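One thing worth checking for the "Task not found: squad" error is which transformers package the script actually imports. A path in another report on this page (AwesomeMRC/transformer-mrc/transformers/modeling_albert.py) suggests the repository carries its own copy of the library, so an upstream v2.3.0 install may be shadowing it. That is an inference from the tracebacks, not something the authors state here. The sketch below only reports where a package would be loaded from; the helper name locate_package is made up for illustration.

```python
import importlib.util


def locate_package(name):
    """Return the file a top-level package would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    # A site-packages path here would mean the pip-installed library is used
    # instead of a copy bundled with the repository (assumption: one exists).
    print(locate_package("transformers"))
```

If the printed path is not inside the repository checkout, the list of task names shown by -h is coming from the upstream library rather than from the repository's version.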
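Hello, I read your Retrospective Reader paper and have a question about the Rear Verification section: what do \hat{y} and \bar{y} in Equation (12) refer to? I looked through your code (run_cls.py, run_squad_av.py and run_verifier.py) but could not find the code that corresponds to Equation (12).
Have you tried any Chinese training sets? What accuracy did you reach?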
Could you please explain how the task name should be specified and what it refers to?
Please help me with a step-by-step walkthrough of running the code.
It does not accept squad when I run sh_albert_cls.sh.
Problem description:
RuntimeError: Found param albert.embeddings.word_embeddings.weight with type torch.FloatTensor, expected torch.cuda.FloatTensor.
When using amp.initialize, you need to provide a model with parameters
located on a CUDA device before passing it no matter what optimization level
you chose. Use model.to('cuda') to use the default device.
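The error comes from line 90 of AwesomeMRC/transformer-mrc/examples/run_squad_av.py.
https://github.com/ThilinaRajapakse/simpletransformers/issues/32 describes the same problem and offers a fix, but it is only a single line of code. Would it work here, and where should that line be added?
If that approach does not work, is there another way?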
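The error message itself names the fix: the model has to be on a CUDA device before it is passed to amp.initialize. A minimal sketch of that ordering follows. The function name move_then_initialize is hypothetical, and the initialize parameter stands in for apex's amp.initialize so that the ordering can be shown without importing apex.

```python
def move_then_initialize(model, optimizer, device, initialize, opt_level="O1"):
    """Move the model to `device` first, then hand it to the AMP initializer.

    `initialize` stands in for apex's amp.initialize; it is passed in as an
    argument so this sketch does not depend on apex being installed.
    """
    model = model.to(device)  # parameters must be on the GPU before AMP sees them
    return initialize(model, optimizer, opt_level=opt_level)
```

With apex installed, the call would look like move_then_initialize(model, optimizer, "cuda", amp.initialize). If the script already moves the model to its configured device, the error may instead mean that this device resolved to the CPU, for example because no GPU was visible to PyTorch; that is a guess, not something the report confirms.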
I fixed every environment problem to get the code running on Colab, but even with 14 GB of GPU memory it still crashes with
RuntimeError: CUDA out of memory. Tried to allocate 192.00 MiB (GPU 0; 14.76 GiB total capacity; 13.46 GiB already allocated; 17.75 MiB free; 13.71 GiB reserved in total by PyTorch)
I moved to Colab in the first place because I thought my own VT100 might be too small. I cut the batch size all the way down to 1 and it still crashes with an out-of-memory error.
How can this code use so much memory?
How can I fix this?
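A rough back-of-the-envelope estimate may help explain why batch size 1 can still run out of memory. The output directory names elsewhere on this page mention albert-xxlarge-v2 and len512; assuming that model (64 attention heads, as far as I know) and fp32 values, the attention score matrix alone grows with the square of the sequence length. The helper below is illustrative only and ignores every other activation and the optimizer state.

```python
def attention_scores_mib(batch_size, num_heads, seq_len, bytes_per_value=4):
    """Memory for one layer's attention score matrix, in MiB."""
    values = batch_size * num_heads * seq_len * seq_len
    return values * bytes_per_value / (1024 ** 2)


if __name__ == "__main__":
    # Assumed configuration: 64 attention heads, fp32 values, batch size 1.
    for seq_len in (512, 384, 256):
        print(seq_len, attention_scores_mib(1, 64, seq_len))
```

At batch size 1 and length 512 this comes to 64 MiB per layer for the scores alone, and the value is kept for every layer during training. Halving --max_seq_length cuts this term to a quarter, so lowering the sequence length, or switching to a smaller checkpoint, is likely to help more than lowering the batch size further.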
File "./examples/run_squad_av.py", line 19, in
from transformers.data.processors.squad import SquadV1Processor, SquadV2Processor, SquadResult
ModuleNotFoundError: No module named 'transformers'
Once I install the Hugging Face transformers library, it says
2020-06-01 14:44:44.649987: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
Traceback (most recent call last):
File "./examples/run_squad_av.py", line 21, in
from examples.evaluate_official2 import eval_squad
ModuleNotFoundError: No module named 'examples.evaluate_official2'
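Both ModuleNotFoundError messages are consistent with the repository root not being on the import path. When a script is started as python ./examples/run_squad_av.py, Python puts the script's own directory (examples/) at the front of sys.path, not the directory the command was run from, so neither a bundled transformers package nor examples.evaluate_official2 can be resolved from there. The sketch below assumes the layout suggested by paths on this page (transformer-mrc/examples/ next to transformer-mrc/transformers/); the helper name add_repo_root is hypothetical.

```python
import os
import sys


def add_repo_root(script_path):
    """Put the parent of the script's directory at the front of sys.path."""
    script_dir = os.path.dirname(os.path.abspath(script_path))
    repo_root = os.path.dirname(script_dir)
    if repo_root not in sys.path:
        sys.path.insert(0, repo_root)
    return repo_root
```

Calling add_repo_root(__file__) at the top of a script in examples/ would make the root importable. Setting the PYTHONPATH environment variable to the same directory before launching has the same effect without editing any file.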
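Hello!
Regarding sequence_output = self.albert_att(context_sequence_output, sequence_output, context_attention_mask) # here context_sequence_output is the question (the first sequence)
I don't quite follow this. In the original features, the question is on the left and the context is on the right, correct?
Does "context_sequence_output is the question" mean the two were swapped by split_ques_context?
Thanks a lot!
Hello, does your code support TPU?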
I tried to run the run_verifier.py file, but I got this error.
python run_verifier.py
Traceback (most recent call last):
File "run_verifier.py", line 85, in <module>
main()
File "run_verifier.py", line 82, in main
get_score1(args)
File "run_verifier.py", line 40, in get_score1
mean_score += score
TypeError: unsupported operand type(s) for +=: 'float' and 'list'
Does anyone know what is going wrong?
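The TypeError says that score is a list at the point where a float is expected, which suggests that one of the input JSON files stores a list per question id rather than a single number. What that list contains is not visible from the report, so the sketch below is deliberately conservative: it unwraps one-element lists and refuses anything longer. Both helper names are made up for illustration and are not taken from run_verifier.py.

```python
def to_scalar(score):
    """Accept a number or a one-element list; refuse anything ambiguous."""
    if isinstance(score, (list, tuple)):
        if len(score) != 1:
            raise ValueError("expected a single value, got %d" % len(score))
        score = score[0]
    return float(score)


def mean_score(scores):
    """Average a non-empty sequence of scores that may be wrapped in lists."""
    values = [to_scalar(s) for s in scores]
    return sum(values) / len(values)
```

If the lists turn out to hold several values, for example one logit per class, the right reduction depends on how the file was produced, and it is worth checking that the file passed to the script is the one the script expects.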
Hello, could you provide the dependency environment for this project?
Can you post a requirements.txt?
Hi there, I am a lead researcher at NLMatics. I monitor the squad2 leaderboard continuously and have been following your work since late January. It is a remarkable achievement and I am eager to try it out. Would you please let me know when the code will be released? Thanks -- Yi
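When I run sh_albert_av.sh, it ends with the message "No module named 'amp_C'", and the gradients also seem to explode. What is the cause, and how can I fix it?
The output is as follows: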
Selected optimization level O1: Insert automatic casts around Pytorch functions and Tensor methods.
Defaults for this optimization level are:
enabled : True
opt_level : O1
cast_model_type : None
patch_torch_functions : True
keep_batchnorm_fp32 : None
master_weights : None
loss_scale : dynamic
Processing user overrides (additional kwargs that are not None)...
After processing overrides, optimization options are:
enabled : True
opt_level : O1
cast_model_type : None
patch_torch_functions : True
keep_batchnorm_fp32 : None
master_weights : None
loss_scale : dynamic
Warning: multi_tensor_applier fused unscale kernel is unavailable, possibly because apex was installed without --cuda_ext --cpp_ext. Using Python fallback. Original ImportError was: ModuleNotFoundError("No module named 'amp_C'",)
Gradient overflow. Skipping step, loss scaler 0 reducing loss scale to 32768.0
Gradient overflow. Skipping step, loss scaler 0 reducing loss scale to 16384.0
Gradient overflow. Skipping step, loss scaler 0 reducing loss scale to 8192.0
Gradient overflow. Skipping step, loss scaler 0 reducing loss scale to 8192.0
Gradient overflow. Skipping step, loss scaler 0 reducing loss scale to 4096.0
Gradient overflow. Skipping step, loss scaler 0 reducing loss scale to 8192.0
Gradient overflow. Skipping step, loss scaler 0 reducing loss scale to 4096.0
Gradient overflow. Skipping step, loss scaler 0 reducing loss scale to 8192.0
Gradient overflow. Skipping step, loss scaler 0 reducing loss scale to 4096.0
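The "Gradient overflow. Skipping step" lines are what dynamic loss scaling (loss_scale : dynamic in the log above) prints while it searches for a usable scale; on their own they do not necessarily mean training is diverging. The toy model below shows the mechanism: the scale is reduced on every overflow and raised again after a run of clean steps. The defaults (initial scale 2^16, factor 2, window of 2000 steps) reflect my understanding of apex's dynamic scaler and should be read as assumptions.

```python
class DynamicLossScale:
    """Toy model of dynamic loss scaling: shrink on overflow, grow after a clean run."""

    def __init__(self, init_scale=2.0 ** 16, factor=2.0, window=2000):
        self.scale = init_scale
        self.factor = factor
        self.window = window
        self.clean_steps = 0

    def update(self, overflow):
        """Return True if the optimizer step should be applied."""
        if overflow:
            self.scale /= self.factor  # the step is skipped and the scale is reduced
            self.clean_steps = 0
            return False
        self.clean_steps += 1
        if self.clean_steps == self.window:
            self.scale *= self.factor  # try a larger scale again
            self.clean_steps = 0
        return True


if __name__ == "__main__":
    scaler = DynamicLossScale()
    for overflow in (True, True, True):
        scaler.update(overflow)
        print(scaler.scale)
```

Three simulated overflows take the scale from 65536 to 32768, 16384 and 8192, the same sequence as the first three log lines. The later alternation between 8192 and 4096 fits the scale being raised after a clean run and reduced again on the next overflow. The amp_C line is a warning rather than a failure: as the message says, apex was installed without --cuda_ext --cpp_ext and falls back to a Python implementation.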
When running run_squad_av.py, many "Missing prediction" messages appear (for example "Missing prediction for 56ddde6b9a695914005b9629"), and then it fails with an error.
('exact', 100.0 * sum(exact_scores[k] for k in qid_list) / total),
KeyError: '56ddde6b9a695914005b9629'
Debugging in VS Code, I traced the problem to the return collections.OrderedDict() statement in the make_eval_dict function of evaluate_official2.py:
return collections.OrderedDict([
('exact', 100.0 * sum(exact_scores[k] for k in qid_list) / total),
('f1', 100.0 * sum(f1_scores[k] for k in qid_list) / total),
('total', total),
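qid_list has more than 5,000 entries while exact_scores has only 35, so when sum() is computed, some k has no corresponding value in exact_scores and the program raises the error.
But evaluate_official2.py appears to be the official script, so how should this be resolved? Or did I misconfigure something elsewhere? Any guidance would be appreciated!
PS: run_cls.py runs normally, so the base environment should be fine. The arguments for run_squad_av.py are as follows: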
"args":[
"--model_type","albert",
"--model_name_or_path","albert-base-v2",
"--do_train",
"--do_eval",
"--do_lower_case",
"--version_2_with_negative",
"--data_dir","/home/amax/anaconda3/AwesomeMRC/transformer-mrc/data/",
"--train_file","/home/amax/anaconda3/AwesomeMRC/transformer-mrc/data/train-v2.0.json",
"--predict_file","/home/amax/anaconda3/AwesomeMRC/transformer-mrc/data/dev-v2.0.json",
"--learning_rate","2e-5",
"--num_train_epochs","2",
"--max_seq_length","512",
"--doc_stride","128",
"--max_query_length","64",
"--per_gpu_train_batch_size","6",
"--per_gpu_eval_batch_size","8",
"--warmup_steps","814",
"--output_dir","squad/squad2_albert-base-v2_lr2e-5_len512_bs48_ep2_wm814_av_ce_fp16",
"--eval_all_checkpoints",
"--save_steps","2500",
"--n_best_size","20",
"--max_answer_length","30",
"--fp16",
"--overwrite_output_dir"
]
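The KeyError is a symptom rather than the root problem: the evaluation script expects a prediction for every question id in the dev set, and here only 35 were produced. Why so few were written is not clear from the report. A quick way to confirm the mismatch before evaluation is to compare the ids in the dataset with the keys of the predictions file. The sketch below assumes the usual SQuAD JSON layout, where dataset is the list stored under the top-level "data" key; the function name is hypothetical.

```python
def find_missing_predictions(dataset, preds):
    """Return question ids in a SQuAD-format dataset that have no prediction."""
    missing = []
    for article in dataset:
        for paragraph in article["paragraphs"]:
            for qa in paragraph["qas"]:
                if qa["id"] not in preds:
                    missing.append(qa["id"])
    return missing
```

If the returned list is long, the thing to investigate is the prediction step (for instance which examples and features were actually loaded for the dev file), not evaluate_official2.py.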
I am running sh_albert_cls.sh. It crashed with
Iteration: 0%| | 0/10860 [00:03<?, ?it/s]
Epoch: 0%| | 0/2 [00:03<?, ?it/s]
Traceback (most recent call last):
File "./examples/run_cls.py", line 645, in
main()
File "./examples/run_cls.py", line 533, in main
global_step, tr_loss = train(args, train_dataset, model, tokenizer)
File "./examples/run_cls.py", line 159, in train
outputs = model(**inputs)
File "/data/anaconda3/envs/mrc/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/data/anaconda3/envs/mrc/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 161, in forward
outputs = self.parallel_apply(replicas, inputs, kwargs)
File "/data/anaconda3/envs/mrc/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 171, in parallel_apply
return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
File "/data/anaconda3/envs/mrc/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 86, in parallel_apply
output.reraise()
File "/data/anaconda3/envs/mrc/lib/python3.7/site-packages/torch/_utils.py", line 428, in reraise
raise self.exc_type(msg)
StopIteration: Caught StopIteration in replica 0 on device 0.
Original Traceback (most recent call last):
File "/data/anaconda3/envs/mrc/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
output = module(*input, **kwargs)
File "/data/anaconda3/envs/mrc/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/bruce/AwesomeMRC/transformer-mrc/transformers/modeling_albert.py", line 688, in forward
inputs_embeds=inputs_embeds
File "/data/anaconda3/envs/mrc/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/bruce/AwesomeMRC/transformer-mrc/transformers/modeling_albert.py", line 524, in forward
extended_attention_mask = extended_attention_mask.to(dtype=next(self.parameters()).dtype) # fp16 compatibility
StopIteration
I commented out the argument
But I still get the same error.
The error messages do not tell me much about what is wrong. Any ideas?
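The traceback ends at next(self.parameters()) raising StopIteration inside a DataParallel replica. As far as I know, this pattern has been reported with newer PyTorch releases, where the replicas created by nn.DataParallel no longer yield anything from parameters(). A defensive version of the dtype lookup could look like the sketch below; the helper name and the fallback argument are assumptions, not code from the repository.

```python
def first_param_dtype(module, default):
    """Return the dtype of the module's first parameter, or `default` if it has none."""
    try:
        return next(module.parameters()).dtype
    except StopIteration:
        return default
```

Other things to try, also unverified against this repository: restrict the run to a single GPU (for example through the CUDA_VISIBLE_DEVICES environment variable) so that DataParallel is not used, or use an older PyTorch release that matches the bundled transformers code.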
Hello -
This is an amazing project. But a quick question.
Is there a straightforward way to use a pre-trained model and run predictions on a piece of text? For example, something like:
python run_squad.py --model_type="xlnet" --model_name="XLNetForQuestionAnswering" --context="There are 29 countries and 1983 states in the world" --question="How many states are there in the world?"
Or is there any other way to run the pre-trained model on text that I provide?
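Nothing on this page shows a --context or --question option, but the scripts do take a --predict_file in SQuAD JSON format (see the argument list in another report above). One workaround is therefore to wrap your own text in that format. The sketch below writes a minimal SQuAD v2.0-style file; the ids, the title and the function name are placeholders chosen for illustration.

```python
import json


def build_squad_file(context, questions, path):
    """Write one context and its questions as a SQuAD v2.0-style JSON file."""
    qas = [
        {"id": "q%d" % i, "question": q, "answers": [], "is_impossible": False}
        for i, q in enumerate(questions)
    ]
    data = {
        "version": "v2.0",
        "data": [{"title": "custom", "paragraphs": [{"context": context, "qas": qas}]}],
    }
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, ensure_ascii=False)
    return data
```

The resulting file could then be passed through --predict_file together with --do_eval. Because the answers lists are empty, any metrics printed for such a file are meaningless; only the written predictions are of interest.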
Thanks for sharing!
As mentioned in the paper, your ELECTRA code is based on the TF release, but I could not find it in this repo. Could you please share the ELECTRA experiment code, too? Thanks!
Hi, thank you for releasing your code and paper. May I ask which parameters you used for the ELECTRA (https://github.com/google-research/electra) models?
The default parameters in Google's ELECTRA release differ from those stated in the paper. Also, with the default parameters, the prediction module for the squad2 task differs from what is stated in the paper.
Hi there,
Thank you for your great code. Unfortunately, after training and evaluating, I got these results:
{
"exact": 85.48808220331846,
"f1": 88.88667867457023,
"total": 11873,
"HasAns_exact": 83.23211875843455,
"HasAns_f1": 90.03905801335617,
"HasAns_total": 5928,
"NoAns_exact": 87.73759461732548,
"NoAns_f1": 87.73759461732548,
"NoAns_total": 5945
}
As opposed to these ones you reported:
{
"exact": 87.75372694348522,
"f1": 90.91630165754992,
"total": 11873,
"HasAns_exact": 83.1140350877193,
"HasAns_f1": 89.4482539777485,
"HasAns_total": 5928,
"NoAns_exact": 92.38015138772077,
"NoAns_f1": 92.38015138772077,
"NoAns_total": 5945
}
I ran these commands:
./sh_albert_cls.sh
./sh_albert_av.sh
python3 run_verifier.py --input_null_files="squad/cls_squad2_albert-xxlarge-v2_lr2e-5_len512_bs48_ep2_wm814/cls_score.json,squad/squad2_albert-xxlarge-v2_lr2e-5_len512_bs48_ep2_wm814_av_ce_fp16/null_odds_5_len512_bs48_ep2_wm814_av_ce_fp16.json" --input_nbest_files="squad/squad2_albert-xxlarge-v2_lr2e-5_len512_bs48_ep2_wm814_av_ce_fp16/nbest_predictions_5_len512_bs48_ep2_wm814_av_ce_fp16.json"
python3 evaluate-v2.0.py data/dev-v2.0.json predictions.json
Do you have any idea what may have gone wrong? Thanks a lot.
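Converting the two result blocks back into counts of exactly matched questions shows where the gap sits. The snippet below uses only the numbers quoted above; the helper name correct_counts is made up for illustration.

```python
def correct_counts(result):
    """Convert percentage scores back into numbers of exactly matched questions."""
    has_ans = round(result["HasAns_exact"] / 100 * result["HasAns_total"])
    no_ans = round(result["NoAns_exact"] / 100 * result["NoAns_total"])
    return has_ans, no_ans


mine = {"HasAns_exact": 83.23211875843455, "HasAns_total": 5928,
        "NoAns_exact": 87.73759461732548, "NoAns_total": 5945}
reported = {"HasAns_exact": 83.1140350877193, "HasAns_total": 5928,
            "NoAns_exact": 92.38015138772077, "NoAns_total": 5945}

mine_has, mine_no = correct_counts(mine)
rep_has, rep_no = correct_counts(reported)
# Difference in correct answers: answerable questions, unanswerable questions.
print(mine_has - rep_has, mine_no - rep_no)
```

The reproduction gets 7 more answerable questions right but 276 fewer unanswerable ones, so nearly all of the difference comes from the no-answer decision. That points toward the answerability side of the pipeline (the null scores passed to run_verifier.py and whatever threshold it applies) rather than span extraction, although the numbers alone cannot say which setting differs.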
We cannot access the [model] on CodaLab for reproduction. Could you provide the trained weights? Thank you!
I want to apply this model to SQuAD-style question answering on my own text. Is there a way to do that? Could you point me in the right direction?