
birch's People

Contributors

dependabot[bot], emmileaf, hatianzhang, infinitecold, victor0118, zeynepakkalyoncu


birch's Issues

BERT inference latency on CPU

I'd like to get some performance figures on latency on a CPU - queries per second, latency for each individual BERT inference, etc.
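For reference, one minimal way to collect such numbers is to time repeated forward passes of a BERT sequence-classification model on CPU. The sketch below uses the Hugging Face transformers library with a stock checkpoint as a placeholder, not one of birch's released models:

```python
# Rough CPU latency sketch (not birch's evaluation code): time repeated
# forward passes of a BERT sequence-classification model on a single
# query-sentence pair. "bert-large-uncased" is a placeholder checkpoint.
import time

import torch
from transformers import BertForSequenceClassification, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-large-uncased")
model = BertForSequenceClassification.from_pretrained("bert-large-uncased")
model.eval()

query = "black bear attacks"
sentence = "A bear attacked a hiker on the trail yesterday."
inputs = tokenizer(query, sentence, return_tensors="pt", truncation=True, max_length=128)

n = 100
with torch.no_grad():
    model(**inputs)                      # warm-up pass
    start = time.perf_counter()
    for _ in range(n):
        model(**inputs)
    elapsed = time.perf_counter() - start

print(f"{1000 * elapsed / n:.1f} ms per inference, {n / elapsed:.1f} inferences/sec")
```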

Add BERT inference code

We should be able to run ranking end-to-end, so we should fold BERT inference code into this repo.

More explanation in the README

Hi, congrats on this great work!

As a new user without much experience in the IR research field, please bear with me if I ask some naive questions. For example:
python src/utils/split_docs.py --collection <robust04, core17, core18> \ --index <path/to/index> --data_path data --anserini_path <path/to/anserini/root>

  1. What does index mean here, a document index? If I want to use it on my own dataset, what kind of value should I put here?

  2. For the 'data' path, if I want to use my own dataset, what format should it be in? What should the data look like?

  3. anserini_path should be the path to the anserini folder after I execute
    !git clone https://github.com/castorini/anserini.git and !cd anserini && mvn clean package appassembler:assemble, right?

Thanks for answering my questions!

index_path

What is the index_path?
Which script generates it?
Could it be generated by Pyserini?
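Not an authoritative answer, but index_path generally points at a Lucene index built with Anserini (its IndexCollection step); Pyserini can also open or download such indexes. A minimal sketch, assuming a reasonably recent Pyserini release and illustrative index names:

```python
# Minimal sketch, assuming Pyserini is installed: open a locally built
# Anserini/Lucene index, or let Pyserini download a prebuilt one.
# The index path and prebuilt name below are illustrative.
from pyserini.search.lucene import LuceneSearcher

# Option 1: a Lucene index directory built with Anserini's IndexCollection.
searcher = LuceneSearcher("indexes/lucene-index.robust04")

# Option 2: a prebuilt index that Pyserini downloads for you.
# searcher = LuceneSearcher.from_prebuilt_index("robust04")

for hit in searcher.search("black bear attacks", k=10):
    print(hit.docid, hit.score)
```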

Snippet for interactive querying of BERT model

In the README, can I have a snippet for playing with BERT interactively? I.e., fire up a Python interpreter, load the model, and issue a query and a sentence. It should be just a few lines, right?
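Something along these lines would work for a Hugging Face-style checkpoint; the checkpoint path below is hypothetical, and birch's own saved models may need the repo's loading code instead:

```python
# Generic interactive sketch with Hugging Face transformers. The checkpoint
# path is hypothetical; birch's saved models may need the repo's own loading code.
import torch
from transformers import BertForSequenceClassification, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-large-uncased")
model = BertForSequenceClassification.from_pretrained("path/to/fine-tuned-checkpoint")
model.eval()

query = "black bear attacks"
sentence = "A bear attacked a hiker on the trail yesterday."
inputs = tokenizer(query, sentence, return_tensors="pt", truncation=True, max_length=128)

with torch.no_grad():
    logits = model(**inputs).logits
print(torch.softmax(logits, dim=-1)[0, 1].item())   # probability of "relevant"
```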

Prediction score

Hi thanks for sharing birch!

I am trying to predict relevant sentences using the 'saved.msmarco_mb_1' model. One thing I am curious about is the prediction score I get from 'predictions = model(tokens_tensor, segments_tensor, mask_tensor)'. The values in each tuple of 'predictions' do not sum to 1. Is it supposed to be a binary classification score?
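A hedged guess at what is going on: BertForSequenceClassification-style models return raw logits, i.e. unnormalized class scores, so the two values per example will not sum to 1 until a softmax is applied. Continuing from the variables in the snippet above:

```python
# Continuing from the poster's variables (model, tokens_tensor, segments_tensor,
# mask_tensor): the raw outputs are logits, so apply a softmax to get class
# probabilities that sum to 1 (assuming a two-class relevant/non-relevant head).
import torch
import torch.nn.functional as F

with torch.no_grad():
    predictions = model(tokens_tensor, segments_tensor, mask_tensor)  # logits, shape [batch, 2]

probs = F.softmax(predictions, dim=-1)   # each row now sums to 1
relevance_scores = probs[:, 1]           # probability of the "relevant" class
```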

list index out of range

I followed the README exactly, but there was an error... Could you tell me how to fix it? Thanks!

`Running eval/trec_eval.9.0.4/trec_eval data/qrels/qrels.mb.txt data/predictions/predict.tmp -m map -m P.20 -m ndcg_cut.20

Traceback (most recent call last):
File "src/main.py", line 92, in <module>
main()
File "src/main.py", line 32, in main
train(args)
File "/home/castil/xueee/birch-master/src/model/train.py", line 53, in train
best_score = eval_select(args, model, tokenizer, validate_dataset, args.model_path, best_score, epoch)
File "/home/castil/xueee/birch-master/src/model/test.py", line 9, in eval_select
scores_dev = test(args, split='dev', model=model, test_dataset=validate_dataset)
File "/home/castil/xueee/birch-master/src/model/test.py", line 87, in test
qrels_file=os.path.join(args.data_path, 'qrels', 'qrels.{}.txt'.format(args.collection)))
File "/home/castil/xueee/birch-master/src/model/eval.py", line 14, in evaluate
map = float(lines[0].strip().split()[-1])
IndexError: list index out of range`
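For what it is worth, the IndexError comes from eval.py reading lines[0] of trec_eval's output; if the trec_eval binary was never built or the command fails, that output is empty and the parse blows up. A small diagnostic sketch, using the same command shown in the log above:

```python
# Diagnostic sketch: run the same trec_eval command and fail loudly if it
# produces no output, instead of letting the empty parse raise IndexError.
import subprocess

cmd = [
    "eval/trec_eval.9.0.4/trec_eval",
    "data/qrels/qrels.mb.txt",
    "data/predictions/predict.tmp",
    "-m", "map", "-m", "P.20", "-m", "ndcg_cut.20",
]
result = subprocess.run(cmd, capture_output=True, text=True)

if result.returncode != 0 or not result.stdout.strip():
    raise RuntimeError(f"trec_eval produced no output:\n{result.stderr}")

lines = result.stdout.strip().split("\n")
print(float(lines[0].strip().split()[-1]))   # MAP, as parsed in eval.py
```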

TREC Microblog Tracks Data

Hi,

Thank you for your nice work!

I took a look at the project but did not find test collections from the TREC Microblog Tracks (Lin et al., 2014) from 2011 to 2014, which were used to fine-tune BERT as described in your paper.

Could you please kindly let me know where I could find the collections?

Best,
Yumo

Is the BERT training process sentence-level or document-level?

When training BERT to produce a query-document score, do you train at the sentence level or the document level? If sentence-level, what is the label for each example, and how do you choose the BERT model on the dev set?
Looking forward to your reply!
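For context (this is not an answer from the authors), sentence-level training usually means each training example packs the query and one sentence into a single BERT input with a binary relevance label. A generic sketch of what one example looks like:

```python
# Generic sketch of one sentence-level training example (not birch's exact code):
# the query and a single sentence form one BERT input, and the label is a binary
# relevance judgment for that (query, sentence) pair.
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

query = "black bear attacks"
sentence = "A bear attacked a hiker on the trail yesterday."
label = 1   # 1 = relevant, 0 = not relevant; how labels are assigned is the question here

encoded = tokenizer(query, sentence, truncation=True, max_length=128,
                    padding="max_length", return_tensors="pt")
# encoded["input_ids"], encoded["token_type_ids"], encoded["attention_mask"],
# together with `label`, form one example for BertForSequenceClassification.
```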

License

Thanks a lot for open sourcing birch. What is the license for the code in this repository?

I can't find the file named robust04_rm3_5cv_sent_fields.txt

I put Anserini and Birch in the same directory and ran "./train.sh mb 5" in the shell. However, it returned the error "FileNotFoundError: [Errno 2] No such file or directory: 'robust04_rm3_5cv_sent_fields.txt'", and I can't find robust04_rm3_5cv_sent_fields.txt anywhere either.

How do I use birch on my own dataset?

Hi, thanks for this awesome work.

The first question is about document retrieval.
I'd like to use birch as a tool to retrieve relevant documents given a query.
After reading the README, I am not clear how to use birch to achieve this.
Can I have more instructions? Thank you!

The second is about sentence retrieval.
How do I use birch for sentence selection? Like Figure 2 in 'Applying BERT to Document Retrieval with Birch' describes, can I get the most relevant sentences in a document, given a query?
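Not an official answer, but the recipe described in the paper is roughly: retrieve candidate documents with BM25 (via Anserini/Pyserini), split each document into sentences, score every (query, sentence) pair with the fine-tuned BERT model, and rank documents by combining the BM25 score with the top few sentence scores. A rough sketch of that loop on a custom index (all names and paths are placeholders):

```python
# Rough sketch of the birch-style recipe on custom data (all names are
# placeholders): BM25 retrieval with Pyserini, sentence splitting, and
# (query, sentence) scoring with a fine-tuned BERT classifier.
import torch
from nltk.tokenize import sent_tokenize          # requires nltk.download('punkt') once
from pyserini.search.lucene import LuceneSearcher
from transformers import BertForSequenceClassification, BertTokenizer

searcher = LuceneSearcher("indexes/my-collection")            # your own index
tokenizer = BertTokenizer.from_pretrained("bert-large-uncased")
model = BertForSequenceClassification.from_pretrained("path/to/fine-tuned-checkpoint")
model.eval()

query = "black bear attacks"
for hit in searcher.search(query, k=10):
    doc_text = searcher.doc(hit.docid).raw()                  # assumes raw text was stored
    sentence_scores = []
    for sentence in sent_tokenize(doc_text):
        inputs = tokenizer(query, sentence, return_tensors="pt",
                           truncation=True, max_length=128)
        with torch.no_grad():
            logits = model(**inputs).logits
        sentence_scores.append(torch.softmax(logits, dim=-1)[0, 1].item())
    top = sorted(sentence_scores, reverse=True)[:3]
    print(hit.docid, hit.score, top)                          # combine as in the paper
```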

embedding of long text

Hi, thanks for your effort in providing this code.
I couldn't figure out how to use your code to get the embedding of a textual document (with thousands of words). Is it possible to do this with your framework?
Thanks
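Birch itself scores (query, sentence) pairs rather than producing a single document embedding, so this is outside its intended use; that said, a common workaround for long texts is to chunk the document to fit BERT's 512-token limit and pool the chunk embeddings. A generic sketch, not birch code:

```python
# Not part of birch: split a long document into chunks that fit BERT's
# 512-token limit, take each chunk's [CLS] vector, and average them.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

def embed_long_document(text: str, chunk_tokens: int = 510) -> torch.Tensor:
    token_ids = tokenizer.encode(text, add_special_tokens=False)
    cls_id, sep_id = tokenizer.cls_token_id, tokenizer.sep_token_id
    chunk_embeddings = []
    for start in range(0, len(token_ids), chunk_tokens):
        chunk = [cls_id] + token_ids[start:start + chunk_tokens] + [sep_id]
        with torch.no_grad():
            outputs = model(input_ids=torch.tensor([chunk]))
        chunk_embeddings.append(outputs.last_hidden_state[:, 0, :])   # [CLS] vector
    return torch.cat(chunk_embeddings).mean(dim=0)

# vec = embed_long_document(open("long_doc.txt").read())   # 768-dim vector
```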

Training QA model with WikiQA and TrecQA

Hi,

Thanks again for your nice work!

I am quite interested in the QA model which was trained on the data described in your ArXiv paper as follows,

the union of the TrecQA (Yao et al., 2013) and WikiQA (Yang et al., 2015) datasets.

Since I am now also trying to train a similar model and have several minor questions, I would really appreciate it if you could kindly clarify them for me:

  1. Did you use the union of the training sets from TrecQA and WikiQA as the training set, and the union of their development sets for development? What about the test sets, e.g., were they left unused? Or did you combine all the samples from TrecQA and WikiQA and then split them into train/dev manually?
  2. In terms of TrecQA, did you use TRAIN (which contains only 94 questions) or TRAIN-ALL (which contains 1229 questions)?
  3. In terms of WikiQA, did you truncate answer sentences to 40 tokens as one of the preprocessing steps (as introduced here: https://github.com/castorini/data/tree/master/WikiQA)?
  4. I found the following default configurations in args.py; were they also used for training the QA model? (A generic sketch of these settings appears after this list.)
    • epochs: 3
    • learning rate: 3e-6
    • batch_size: 32
    • warmup_proportion: 0.1
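For reference, those defaults map onto a standard transformers training setup roughly as follows; this is a generic illustration, not birch's actual training loop, and the dataset size is a placeholder:

```python
# Generic illustration (not birch's code) of how these defaults correspond to a
# standard transformers setup: AdamW at lr=3e-6, batches of 32, 3 epochs, and a
# linear schedule with 10% warmup.
import torch
from transformers import BertForSequenceClassification, get_linear_schedule_with_warmup

model = BertForSequenceClassification.from_pretrained("bert-large-uncased")

epochs, batch_size, lr, warmup_proportion = 3, 32, 3e-6, 0.1
num_training_examples = 50_000                       # placeholder dataset size
total_steps = epochs * (num_training_examples // batch_size)

optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=int(warmup_proportion * total_steps),
    num_training_steps=total_steps,
)
```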

Best,
Yumo
