
voidful / bdg


Code for "A BERT-based Distractor Generation Scheme with Multi-tasking and Negative Answer Training Strategies."

Home Page: https://voidful.github.io/DG-Showcase/

Python 65.95% Jupyter Notebook 34.05%

bdg's Introduction

Sharing Knowledge and Ideas While Building a Better Future, Together.

  • 🚀 Discover my open-source projects and engage in exciting conversations!
  • 🤝 Seeking collaboration on free and open-source initiatives.

Interests:

Stay Updated with Arxiv CS.CL Daily Digest using phraseg

Huggingface | Medium


bdg's Issues

How did you preprocess the data for BART?

Hi, I'm following your wonderful work on distractor generation. May I know how you preprocessed the RACE dataset for the BART model? In the instructions in the README, you mention the race_train_updated_cqa_dsep_a_bart.csv file, but I can't find the corresponding preprocessing code in your convert_data.py script. Is it the same as race_train_updated_cqa_dsep_a.csv? Thanks.
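If both CSV files are available locally, a quick way to check whether they share the same format is to peek at the first rows of each. This is only an inspection sketch; the column layout assumed here (no header row, input text in the first column, target distractor in the second) is a guess, not confirmed against convert_data.py:

import pandas as pd

# Inspection sketch. Assumptions: both CSVs have no header row, the first column holds
# the "context </s> question </s> answer" input string, and the second column holds the
# target distractor. Adjust if the real layout differs.
plain = pd.read_csv("race_train_updated_cqa_dsep_a.csv", header=None)
bart = pd.read_csv("race_train_updated_cqa_dsep_a_bart.csv", header=None)

print(plain.shape, bart.shape)
print(plain.iloc[0, 0][:300])  # separator format in the non-BART file
print(bart.iloc[0, 0][:300])   # separator format in the BART file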

Error when running the demo

Thank you for sharing the project. I encountered an error when running the demo notebook (https://github.com/voidful/BDG/blob/main/BDG_selection.ipynb). I can't access the choices via ['result'] in the prediction, which raises a KeyError. And when I print the choices, I get the error "Attention mask should be of size (1, 1, 1, 3), but is torch.Size([1, 1, 1, 1])".

Could you please help me with this?
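For debugging the KeyError, a small defensive check around the prediction call can reveal which keys the model actually returns. This is only a sketch; the predict call and its return structure are assumptions based on the issue text, not on the notebook itself:

# Debugging sketch. Assumptions: `model.predict(...)` stands in for whatever call the
# notebook makes, and it returns a dict-like object. Printing the keys shows what is
# available when 'result' is missing.
prediction = model.predict(input_text)  # hypothetical call
if isinstance(prediction, dict) and 'result' in prediction:
    choices = prediction['result']
else:
    print("Unexpected prediction structure:", type(prediction),
          list(prediction.keys()) if isinstance(prediction, dict) else prediction)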

How can I generate multiple distractors with the pretrained model?

I have tried the pretrained models posted on Hugging Face; however, I can only get one distractor for each input sequence.
According to the paper, the model learns from previously generated distractor candidates and produces multiple distractors.

How can I achieve that?
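One way to realize the scheme described above is to feed previously generated distractors back into the input and generate again. This is only a sketch under that assumption; the exact input format the checkpoint expects (here, </s>-separated fields, as in the example input later in this thread) is not confirmed:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Sequential-generation sketch. Assumptions: the checkpoint accepts
# "context </s> question </s> answer" as input, and previously generated distractors
# can be appended with additional </s> separators. Neither is confirmed here.
tokenizer = AutoTokenizer.from_pretrained("voidful/bart-distractor-generation")
model = AutoModelForSeq2SeqLM.from_pretrained("voidful/bart-distractor-generation")

base_input = "context </s> question </s> answer"  # placeholder for the real text
distractors = []
for _ in range(3):
    text = base_input + "".join(f" </s> {d}" for d in distractors)
    ids = tokenizer(text, return_tensors="pt").input_ids
    out = model.generate(input_ids=ids, num_beams=4, max_length=64)
    distractors.append(tokenizer.decode(out[0], skip_special_tokens=True))

print(distractors)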

Unable to generate multiple distractors with pretrained model

@voidful Hi there,
I've tried to generate multiple distractors using the pretrained models posted on Hugging Face, but I'm still unable to get more than one.

Here is my code:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
tokenizer = AutoTokenizer.from_pretrained("voidful/bart-distractor-generation")

model = AutoModelForSeq2SeqLM.from_pretrained("voidful/bart-distractor-generation")
doc = " Demand The law of demand states that if all other factors remain equal, the higher the price of a good, the fewer people will demand that good. In other words, the higher the price, the lower the quantity demanded. The amount of a good that buyers purchase at a higher price is less because as the price of a good goes up, so does the opportunity cost of buying that good.As a result, people will naturally avoid buying a product that will force them to forgo the consumption of something else they value more. The chart below shows that the curve is a downward slope. Supply Like the law of demand, the law of supply demonstrates the quantities sold at a specific price. But unlike the law of demand, the supply relationship shows an upward slope. This means that the higher the price, the higher the quantity supplied. From the seller's perspective, each additional unit's opportunity cost tends to be higher and higher. Producers supply more at a higher price because the higher selling price justifies the higher opportunity cost of each additional unit sold. </s>  The higher the price of a good, the less people will demand that good? </s> the lower the quantity"
input_ids = tokenizer(doc, return_tensors="pt").input_ids
outputs = model.generate(input_ids=input_ids, num_beams=4)
print("Generated:", tokenizer.decode(outputs[0], skip_special_tokens=False))

Generated: </s>the higher the price</s>

As you can see, even when using num_beams, it's still not possible to generate multiple distractors.
Could you provide a minimal example showing how to generate multiple distractors?
Thanks!
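One workaround is to ask beam search for several candidates in a single call via num_return_sequences. A minimal sketch (the candidates are simply the top beams, so they may be near-duplicates and still need filtering or reranking):

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Beam-candidate sketch: request several sequences from one beam search.
# num_return_sequences must be <= num_beams.
tokenizer = AutoTokenizer.from_pretrained("voidful/bart-distractor-generation")
model = AutoModelForSeq2SeqLM.from_pretrained("voidful/bart-distractor-generation")

input_ids = tokenizer(doc, return_tensors="pt").input_ids  # `doc` as in the snippet above
outputs = model.generate(
    input_ids=input_ids,
    num_beams=6,
    num_return_sequences=3,   # return the top 3 beams as candidate distractors
    max_length=64,
)
for i, seq in enumerate(outputs):
    print(f"Candidate {i}:", tokenizer.decode(seq, skip_special_tokens=True))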

transformers import cached_path

Whenever I try to run the cell, it shows this error:

ImportError Traceback (most recent call last)
Cell In [5], line 6
4 from torch.distributions import Categorical
5 import itertools as it
----> 6 import nlp2go
8 tokenizer = RobertaTokenizer.from_pretrained("LIAMF-USP/roberta-large-finetuned-race")
9 model = RobertaForMultipleChoice.from_pretrained("LIAMF-USP/roberta-large-finetuned-race")

File ~/distractor/venv/lib/python3.10/site-packages/nlp2go/__init__.py:1
----> 1 from .model import Model
2 from .main import parse_args

File ~/distractor/venv/lib/python3.10/site-packages/nlp2go/model.py:5
3 import nlp2
4 import tfkit
----> 5 from transformers import pipeline, pipelines, BertTokenizer, cached_path, AutoTokenizer
7 from nlp2go.modelhub import MODELMAP
8 from nlp2go.parser import Parser

ImportError: cannot import name 'cached_path' from 'transformers' (/home/amiya/distractor/venv/lib/python3.10/site-packages/transformers/__init__.py)
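One way to narrow this down: cached_path exists only in older transformers releases, so checking the installed version (and whether the import still works) tells you whether a transformers downgrade or an nlp2go update is needed. A minimal check, assuming a version incompatibility is the cause:

import transformers

# Compatibility check sketch. Assumption: nlp2go expects an older transformers release
# that still exports cached_path; newer releases removed it.
print("transformers version:", transformers.__version__)
try:
    from transformers import cached_path  # only present in older releases
    print("cached_path is available; the nlp2go import should work")
except ImportError:
    print("cached_path has been removed from this transformers version; "
          "try an older transformers release or a newer nlp2go")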

Different results with different tfkit version

Hi, I'm trying to reproduce your fantastic results with the BART-based model. I'm using the trained model you provided:
https://github.com/voidful/BDG/releases/download/v2.0/BDG_ANPM.pt

When I use tfkit==0.7.0 (suggested by the README), I get the following result:
{'Bleu_1': 0.4116063603355367, 'Bleu_2': 0.2629480211200134, 'Bleu_3': 0.19128546675900487, 'Bleu_4': 0.1484759134861437, 'ROUGE_L': 0.2184638476496905, 'CIDEr': 0.07954905358236805}
The ROUGE_L value is much lower than the reported one, while the BLEU values are similar to those reported. Evaluation takes about half an hour.

However, when I use tfkit==0.8.1 (the latest), I get this result:
{'Bleu_1': 0.40226892712763984, 'Bleu_2': 0.2566475644205321, 'Bleu_3': 0.18535836171285228, 'Bleu_4': 0.14348238003117275, 'ROUGE_L': 0.3556143135035776, 'CIDEr': 0.6532226297900213}
These values are similar to the reported ones, but evaluation takes much longer (about 2.5 hours) on the same GPU, and tqdm doesn't show a progress bar.

I was wondering why different tfkit versions produce different results and different evaluation times. Which version should I use?
Thank you very much!

self.data_dir in the function evaluate()

Hello!
It seems like there is a typo in the code. Here is a snippet from the function evaluate():
with open(args.data_dir, "r", encoding='utf8') as f:
    inputjson = [json.loads(jline) for jline in f.readlines()]

But args.data_dir, as its name suggests, is not a file that can be opened but a directory. You probably meant to use some other variable here. Could you please clarify?
Thank you in advance!
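If args.data_dir really is a directory, the call probably meant to open a specific file inside it. A minimal sketch of that reading, with a purely hypothetical file name:

import json
import os

# Hypothetical fix sketch: join the directory with the evaluation file's name.
# "eval_data.jsonl" is a placeholder, not the actual file name used by the repo.
eval_path = os.path.join(args.data_dir, "eval_data.jsonl")
with open(eval_path, "r", encoding="utf8") as f:
    inputjson = [json.loads(jline) for jline in f]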

Trained model

Hello, I'm looking at projects to see what can be done on this topic. Do you have the trained distractor generation model available? It would be interesting to try a demo, or a Colab notebook with some results; that would be really helpful. How long did you train the model?

Thanks in advance.

the computation of BLEU score

Hi,

I'm wondering how you computed the BLEU scores in your paper. Did you take one generated distractor as the hypothesis and the three actual distractors as gold references?
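For reference, a multi-reference BLEU of the kind the question describes (one generated distractor scored against the three gold distractors) can be computed with NLTK. A minimal sketch with toy strings, not the evaluation script actually used in the paper:

from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

# Multi-reference BLEU sketch: one hypothesis (the generated distractor) is scored
# against three references (the gold distractors). Strings below are illustrative only.
references = [
    "the higher the quantity demanded".split(),
    "the higher the price of a good".split(),
    "the more people will demand that good".split(),
]
hypothesis = "the lower the quantity supplied".split()

smooth = SmoothingFunction().method1
print("BLEU-1:", sentence_bleu(references, hypothesis, weights=(1, 0, 0, 0), smoothing_function=smooth))
print("BLEU-4:", sentence_bleu(references, hypothesis, weights=(0.25, 0.25, 0.25, 0.25), smoothing_function=smooth))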
