
voidful / bdg


Code for "A BERT-based Distractor Generation Scheme with Multi-tasking and Negative Answer Training Strategies."

Home Page: https://voidful.github.io/DG-Showcase/

Python 65.95% Jupyter Notebook 34.05%

bdg's Introduction

Sharing Knowledge and Ideas While Building a Better Future, Together.

  • 🚀 Discover my open-source projects and engage in exciting conversations!
  • 🤝 Seeking collaboration on free and open-source initiatives.

Interests:

Stay Updated with Arxiv CS.CL Daily Digest using phraseg

Huggingface | Medium


bdg's Issues

How did you preprocess the data for BART?

Hi, I'm following your wonderful work on distractor generation. May I know how you preprocessed the RACE dataset for the BART model? In the instructions in the README, you mention the race_train_updated_cqa_dsep_a_bart.csv file, but I can't find the corresponding preprocessing code in your convert_data.py script. Is it the same as race_train_updated_cqa_dsep_a.csv? Thanks.
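If both CSV files are available locally, a quick way to check whether they share the same format is to peek at the first rows of each. This is only an inspection sketch; the column layout assumed here (no header row, input text in the first column, target distractor in the second) is a guess, not confirmed against convert_data.py:

import pandas as pd

# Inspection sketch. Assumptions: both CSVs have no header row, the first column holds
# the "context </s> question </s> answer" input string, and the second column holds the
# target distractor. Adjust if the real layout differs.
plain = pd.read_csv("race_train_updated_cqa_dsep_a.csv", header=None)
bart = pd.read_csv("race_train_updated_cqa_dsep_a_bart.csv", header=None)

print(plain.shape, bart.shape)
print(plain.iloc[0, 0][:300])  # separator format in the non-BART file
print(bart.iloc[0, 0][:300])   # separator format in the BART file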

Error when running the demo

Thank you for sharing the project. I encountered an error when running the demo notebook (https://github.com/voidful/BDG/blob/main/BDG_selection.ipynb). I can't access the choices via ['result'] in the prediction, which raises a KeyError. And when I print the choices, I get the error "Attention mask should be of size (1, 1, 1, 3), but is torch.Size([1, 1, 1, 1])".

Could you please help me with this?
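For debugging the KeyError, a small defensive check around the prediction call can reveal which keys the model actually returns. This is only a sketch; the predict call and its return structure are assumptions based on the issue text, not on the notebook itself:

# Debugging sketch. Assumptions: `model.predict(...)` stands in for whatever call the
# notebook makes, and it returns a dict-like object. Printing the keys shows what is
# available when 'result' is missing.
prediction = model.predict(input_text)  # hypothetical call
if isinstance(prediction, dict) and 'result' in prediction:
    choices = prediction['result']
else:
    print("Unexpected prediction structure:", type(prediction),
          list(prediction.keys()) if isinstance(prediction, dict) else prediction)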

How can I generate multiple distractors with the pretrained model?

I have tried the pretrained models posted on Hugging Face; however, I can only get one distractor for each input sequence.
According to the paper, the model learns from previously generated distractor candidates and produces multiple distractors.

How can I achieve that?
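One way to realize the scheme described above is to feed previously generated distractors back into the input and generate again. This is only a sketch under that assumption; the exact input format the checkpoint expects (here, </s>-separated fields, as in the example input later in this thread) is not confirmed:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Sequential-generation sketch. Assumptions: the checkpoint accepts
# "context </s> question </s> answer" as input, and previously generated distractors
# can be appended with additional </s> separators. Neither is confirmed here.
tokenizer = AutoTokenizer.from_pretrained("voidful/bart-distractor-generation")
model = AutoModelForSeq2SeqLM.from_pretrained("voidful/bart-distractor-generation")

base_input = "context </s> question </s> answer"  # placeholder for the real text
distractors = []
for _ in range(3):
    text = base_input + "".join(f" </s> {d}" for d in distractors)
    ids = tokenizer(text, return_tensors="pt").input_ids
    out = model.generate(input_ids=ids, num_beams=4, max_length=64)
    distractors.append(tokenizer.decode(out[0], skip_special_tokens=True))

print(distractors)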

Unable to generate multiple distractors with pretrained model

@voidful Hi there,
I've tried to generate multiple distractors using the pretrained models posted on Hugging Face, but I'm still unable to get more than one.

Here is my code:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
tokenizer = AutoTokenizer.from_pretrained("voidful/bart-distractor-generation")

model = AutoModelForSeq2SeqLM.from_pretrained("voidful/bart-distractor-generation")
doc = " Demand The law of demand states that if all other factors remain equal, the higher the price of a good, the fewer people will demand that good. In other words, the higher the price, the lower the quantity demanded. The amount of a good that buyers purchase at a higher price is less because as the price of a good goes up, so does the opportunity cost of buying that good.As a result, people will naturally avoid buying a product that will force them to forgo the consumption of something else they value more. The chart below shows that the curve is a downward slope. Supply Like the law of demand, the law of supply demonstrates the quantities sold at a specific price. But unlike the law of demand, the supply relationship shows an upward slope. This means that the higher the price, the higher the quantity supplied. From the seller's perspective, each additional unit's opportunity cost tends to be higher and higher. Producers supply more at a higher price because the higher selling price justifies the higher opportunity cost of each additional unit sold. </s>  The higher the price of a good, the less people will demand that good? </s> the lower the quantity"
input_ids = tokenizer(doc, return_tensors="pt").input_ids
outputs = model.generate(input_ids=input_ids, num_beams=4)
print("Generated:", tokenizer.decode(outputs[0], skip_special_tokens=False))

Generated: </s>the higher the price</s>

As you can see, even when using num_beams, it's still not possible to generate multiple distractors.
Could you provide a minimal example showing how to generate multiple distractors?
Thanks!
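One workaround is to ask beam search for several candidates in a single call via num_return_sequences. A minimal sketch (the candidates are simply the top beams, so they may be near-duplicates and still need filtering or reranking):

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Beam-candidate sketch: request several sequences from one beam search.
# num_return_sequences must be <= num_beams.
tokenizer = AutoTokenizer.from_pretrained("voidful/bart-distractor-generation")
model = AutoModelForSeq2SeqLM.from_pretrained("voidful/bart-distractor-generation")

input_ids = tokenizer(doc, return_tensors="pt").input_ids  # `doc` as in the snippet above
outputs = model.generate(
    input_ids=input_ids,
    num_beams=6,
    num_return_sequences=3,   # return the top 3 beams as candidate distractors
    max_length=64,
)
for i, seq in enumerate(outputs):
    print(f"Candidate {i}:", tokenizer.decode(seq, skip_special_tokens=True))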

transformers import cached_path

Whenever I try to run the cell, it shows this error:

ImportError Traceback (most recent call last)
Cell In [5], line 6
4 from torch.distributions import Categorical
5 import itertools as it
----> 6 import nlp2go
8 tokenizer = RobertaTokenizer.from_pretrained("LIAMF-USP/roberta-large-finetuned-race")
9 model = RobertaForMultipleChoice.from_pretrained("LIAMF-USP/roberta-large-finetuned-race")

File ~/distractor/venv/lib/python3.10/site-packages/nlp2go/__init__.py:1
----> 1 from .model import Model
2 from .main import parse_args

File ~/distractor/venv/lib/python3.10/site-packages/nlp2go/model.py:5
3 import nlp2
4 import tfkit
----> 5 from transformers import pipeline, pipelines, BertTokenizer, cached_path, AutoTokenizer
7 from nlp2go.modelhub import MODELMAP
8 from nlp2go.parser import Parser

ImportError: cannot import name 'cached_path' from 'transformers' (/home/amiya/distractor/venv/lib/python3.10/site-packages/transformers/__init__.py)
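One way to narrow this down: cached_path exists only in older transformers releases, so checking the installed version (and whether the import still works) tells you whether a transformers downgrade or an nlp2go update is needed. A minimal check, assuming a version incompatibility is the cause:

import transformers

# Compatibility check sketch. Assumption: nlp2go expects an older transformers release
# that still exports cached_path; newer releases removed it.
print("transformers version:", transformers.__version__)
try:
    from transformers import cached_path  # only present in older releases
    print("cached_path is available; the nlp2go import should work")
except ImportError:
    print("cached_path has been removed from this transformers version; "
          "try an older transformers release or a newer nlp2go")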

Different results with different tfkit version

Hi, I'm trying to reproduce your fantastic results with the BART-based model. I'm using the trained model you provided:
https://github.com/voidful/BDG/releases/download/v2.0/BDG_ANPM.pt

When I use tfkit==0.7.0 (suggested by the README), I get the following result:
{'Bleu_1': 0.4116063603355367, 'Bleu_2': 0.2629480211200134, 'Bleu_3': 0.19128546675900487, 'Bleu_4': 0.1484759134861437, 'ROUGE_L': 0.2184638476496905, 'CIDEr': 0.07954905358236805}
The ROUGE_L value is much lower than the reported one, while the BLEU values are similar to those reported. Evaluation takes about half an hour.

However, when I use tfkit==0.8.1 (the latest), I get this result:
{'Bleu_1': 0.40226892712763984, 'Bleu_2': 0.2566475644205321, 'Bleu_3': 0.18535836171285228, 'Bleu_4': 0.14348238003117275, 'ROUGE_L': 0.3556143135035776, 'CIDEr': 0.6532226297900213}
These values are similar to the reported ones, but evaluation takes much longer (about 2.5 hours) on the same GPU, and tqdm doesn't show a progress bar.

I was wondering why different tfkit versions produce different results and different evaluation times. Which version should I use?
Thank you very much!

self.data_dir in the function evaluate()

Hello!
It seems like there is a typo in the code. Here is a snippet from the function evaluate():
with open(args.data_dir, "r", encoding='utf8') as f:
    inputjson = [json.loads(jline) for jline in f.readlines()]

But args.data_dir, as its name suggests, is not a file that can be opened but a directory. You probably meant to use some other variable here. Could you please clarify?
Thank you in advance!
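If args.data_dir really is a directory, the call probably meant to open a specific file inside it. A minimal sketch of that reading, with a purely hypothetical file name:

import json
import os

# Hypothetical fix sketch: join the directory with the evaluation file's name.
# "eval_data.jsonl" is a placeholder, not the actual file name used by the repo.
eval_path = os.path.join(args.data_dir, "eval_data.jsonl")
with open(eval_path, "r", encoding="utf8") as f:
    inputjson = [json.loads(jline) for jline in f]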

Trained model

Hello, I'm looking at projects to see what can be done on this topic. Do you have the trained distractor generation model available? It would be interesting to try a demo, or a Colab notebook with some results; that would be really helpful. How long did you train the model?

Thanks in advance.

the computation of BLEU score

Hi,

I'm wondering how you computed the BLEU scores in your paper. Did you take one generated distractor as the hypothesis and the three actual distractors as gold references?
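For reference, a multi-reference BLEU of the kind the question describes (one generated distractor scored against the three gold distractors) can be computed with NLTK. A minimal sketch with toy strings, not the evaluation script actually used in the paper:

from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

# Multi-reference BLEU sketch: one hypothesis (the generated distractor) is scored
# against three references (the gold distractors). Strings below are illustrative only.
references = [
    "the higher the quantity demanded".split(),
    "the higher the price of a good".split(),
    "the more people will demand that good".split(),
]
hypothesis = "the lower the quantity supplied".split()

smooth = SmoothingFunction().method1
print("BLEU-1:", sentence_bleu(references, hypothesis, weights=(1, 0, 0, 0), smoothing_function=smooth))
print("BLEU-4:", sentence_bleu(references, hypothesis, weights=(0.25, 0.25, 0.25, 0.25), smoothing_function=smooth))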
