
stonybrooknlp / ircot

Stars: 109 · Watchers: 21 · Forks: 14 · Size: 2.06 MB

Repository for "Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions" (ACL 2023)

Home Page: https://arxiv.org/abs/2212.10509

License: Apache License 2.0

Languages: Jsonnet 46.28%, Python 32.01%, Shell 21.61%, Dockerfile 0.10%
Topics: chain-of-thought, large-language-models, multi-step-reasoning, question-answering, multi-step-retrieval, retrieval-augmented-qa

ircot's People

Contributors: harshtrivedi, some-random


ircot's Issues

Data and trained model

Hi,

I have several questions regarding your work!

  1. It seems 2wikimultihopqa is not downloaded properly by raw_data.sh.
  2. In the code, are you saving the trained model with the best hyperparameters?
  3. What's the use of the base_configs and instantiated_configs folders?

Thank you in advance.

Ongoing Maintenance and Setup Queries for ircot Project

Hey Harsh Trivedi,

I've been trying to get my local setup aligned with the ircot project, specifically the state of the repo at this commit: 8637316e5e94ba0a2493e5df7846f2f23f46eaef.

I'm running into a few hiccups trying to replicate the environment on my end. Just wondering if there have been any updates to the requirements.txt or any particular package versions that I should use for a smooth setup?

Thanks a lot for your help, and for all the awesome work you're putting out there!

Cheers,
Hippoley

The hyperparameters

Hi Harsh,
For the 4 datasets, what are K (the number of paragraphs to retrieve at each step) and M (the number of distractor paragraphs) for IRCoT? Could you please provide the details? Thanks!

address.jsonnet file format and CUDA error

Hi,

I'm trying to reproduce the results, and I found llm_server_address.jsonnet and retriever_address.jsonnet necessary.
Can you provide example files for these?

Also, I'm getting a torch.cuda.OutOfMemoryError: CUDA out of memory error message. If you can give me some tips on preventing it (e.g., where to reduce the batch size), that would be appreciated.

Thank you in advance :)
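
P.S. For anyone else stuck here, my working assumption (not confirmed by the repo docs) is that each address file is just a tiny config pointing at a running server. Since Jsonnet is a superset of JSON, plain JSON content should evaluate fine; here is a sketch that writes both files with placeholder values:

    # My guess at the address file format; the field names and the LLM server
    # port are assumptions (8000 matches the retriever port mentioned in
    # another issue here).
    import json

    llm_server_address = {"host": "localhost", "port": 8010}  # port is a guess
    retriever_address = {"host": "localhost", "port": 8000}

    with open("llm_server_address.jsonnet", "w") as f:
        json.dump(llm_server_address, f, indent=2)
    with open("retriever_address.jsonnet", "w") as f:
        json.dump(retriever_address, f, indent=2)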

Question about the figure demonstration

Hi, thank you for the great work! I have a question about the figure in this repo (Figure 2 in the paper). The right-hand-side "Reason" step takes in the triplet (Q, yellow documents retrieved by Q, T1). However, if I understand your approach correctly, shouldn't the input actually be (Q, yellow documents retrieved by Q plus blue documents retrieved by T1, T1)?

When indexing, Elasticsearch instance fails to connect to localhost

Hi,

Thank you for your awesome work and your kindness in sharing the code. I really love the idea of querying the database with the LLM's generated response. It's very inspiring! :)

However, when I followed README.md I ran into a little trouble with Elasticsearch, since it's my first time using it and I'm kinda confused about everything.
I successfully started the Elasticsearch server on port 9200 and the retriever server on port 8000, but got stuck at indexing. When I run python retriever_server/build_index.py hotpotqa, it reaches this line

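    # note: ignore=400 only suppresses HTTP 400 ("index already exists") responses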
    es.indices.create(index=elasticsearch_index, ignore=400, body=paragraphs_index_settings)

It first shows the error

Traceback (most recent call last):                                                                                                        
  File "/home/guest/r11944026/anaconda3/envs/ircot/lib/python3.8/site-packages/urllib3/connectionpool.py", line 791, in urlopen           
    response = self._make_request(                                                                                                        
  File "/home/guest/r11944026/anaconda3/envs/ircot/lib/python3.8/site-packages/urllib3/connectionpool.py", line 537, in _make_request     
    response = conn.getresponse()                                                                                                         
  File "/home/guest/r11944026/anaconda3/envs/ircot/lib/python3.8/site-packages/urllib3/connection.py", line 461, in getresponse           
    httplib_response = super().getresponse()                                                                                              
  File "/home/guest/r11944026/anaconda3/envs/ircot/lib/python3.8/http/client.py", line 1322, in getresponse                               
    response.begin()                                                                                                                      
  File "/home/guest/r11944026/anaconda3/envs/ircot/lib/python3.8/http/client.py", line 303, in begin                                      
    version, status, reason = self._read_status()                                                                                         
  File "/home/guest/r11944026/anaconda3/envs/ircot/lib/python3.8/http/client.py", line 272, in _read_status                               
    raise RemoteDisconnected("Remote end closed connection without"                                                                       
http.client.RemoteDisconnected: Remote end closed connection without response    
During handling of the above exception, another exception occurred:                                                                       
...

followed by more errors.

At the same time, the elasticsearch server log shows

[2023-10-24T15:14:41,827][WARN ][o.e.h.n.Netty4HttpServerTransport] [cuda8] received plaintext http traffic on an https channel, closing connection Netty4HttpChannel{localAddress=/127.0.0.1:9200, remoteAddress=/127.0.0.1:50652}

It seems to be an HTTP vs. HTTPS problem, so I naively tried changing this line in build_index.py

    elastic_host = "localhost"

to

    elastic_host = "https://localhost"

but it still doesn't work.

Could you please give me a hand? I'll really appreciate it.

Note: I followed the official installation guide and I'm using Elasticsearch 8.10, which differs from the version you used. Could that possibly be the reason?
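
Update: after more digging, I suspect the version difference is exactly it: Elasticsearch 8.x turns on TLS and authentication by default, which older versions don't. Here is a sketch of the two workarounds I'm trying, assuming the elasticsearch-py 8.x client rather than the repo's pinned setup (the password and certificate path below are placeholders):

    # Sketch assuming ES 8.x security-by-default is the culprit; this is not
    # the repo's pinned setup.
    #
    # Workaround 1: turn security off in elasticsearch.yml and keep plain http:
    #   xpack.security.enabled: false
    #   xpack.security.http.ssl.enabled: false
    #
    # Workaround 2: keep security on and hand the client a full https URL plus
    # credentials (the scheme goes into the client URL, not into elastic_host):
    from elasticsearch import Elasticsearch

    es = Elasticsearch(
        "https://localhost:9200",
        basic_auth=("elastic", "<password>"),  # printed on first ES 8 startup
        ca_certs="path/to/http_ca.crt",        # or verify_certs=False for local testing
    )
    print(es.info())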

Contexts in processed_data

Hi, first of all, thank you for the great work!

I really enjoyed reading the paper, and the proposed idea with promising results was really interesting.

Now, I am trying to use this codebase for my own project and have a question about the processed_data.

In the processed .jsonl files (e.g., test_subsampled.jsonl), the contexts are already included for all datasets.

Are these contexts the result of a single BM25 retrieval? If not, how are they obtained?

If you can provide the answer to this question, it would be really useful.

Thank you so much!

Where and How is the reason-step implemented?

Hi,

I really appreciate your work and the neatly structured code!

In the paper, you mentioned that the reason-step generates the next CoT sentence based on:

  1. the question,
  2. the paragraphs retrieved so far, and
  3. the CoT sentences generated so far.

I wonder how the three components are combined. Did you simply concatenate them, i.e., something like concat(question, paragraph_1, paragraph_2, ..., CoT_sent_1, CoT_sent_2, ...)? Where is this part located in the code?
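
To make my question concrete, here is what I imagine the concatenation looks like; this is purely my guess, and the paragraph field names and prompt layout are made up:

    # My guess at the reason-step input; NOT the actual implementation.
    # The field names ("title", "text") and the prompt layout are assumptions.
    def build_reason_prompt(question, paragraphs, cot_sentences):
        # Concatenate all retrieved paragraphs into one context block...
        context = "\n\n".join(
            f"Wikipedia Title: {p['title']}\n{p['text']}" for p in paragraphs
        )
        # ...then append the question and the CoT generated so far.
        cot_so_far = " ".join(cot_sentences)
        return f"{context}\n\nQ: {question}\nA: {cot_so_far}"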

I tried to look it up, and it seems that you fetch the retrieved paragraphs in read_examples() in dataset_readers.py, where output_instance is returned as a list of dictionaries containing all the relevant information for each paragraph.
And in inference_mode in configurable_inference.py, somehow the whole reasoning and answering is finished. What happens there?

Also, I want to make sure that in this implementation, the unit of indexing/retrieval is the whole paragraph of a document, right? That means each Wikipedia article has only one entry in the database, instead of being separated into smaller chunks/passages.

Please feel free to correct me on any misunderstanding of mine. Thanks again for your effort 😊

How much does it cost to solve this problem?

For GPT-3, I wonder about the monetary cost and the time cost on the 4 datasets.
For Flan-T5, I wonder about the time cost on the 4 datasets with the different model sizes.
Can you provide the actual numbers?

`2wikimultihopqa` Raw Data

Hi Harsh, the download/raw_data.sh script does not download (or extract?) the raw data of 2wikimultihopqa correctly, as I found an empty folder in raw_data/2wikimultihopqa. Could you please update the script? Thanks!

How do I know the call flow?

How do I figure out how each function gets called? It all looks like jsonnet operations. Do these jsonnet files reflect the ircot execution process?
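
For context, my current mental model (which may well be wrong) is the generic jsonnet-as-config pattern: the .jsonnet files get evaluated down to plain JSON, and the resulting dict decides which components run with which settings. Something like this sketch, where the file name is a placeholder:

    # Generic jsonnet-as-config pattern; the file name is a placeholder and
    # this is only my guess at how ircot wires things together.
    import json
    import _jsonnet  # pip install jsonnet

    # Evaluate the jsonnet config down to a JSON string, then parse it.
    config_str = _jsonnet.evaluate_file("base_configs/example_experiment.jsonnet")
    config = json.loads(config_str)
    print(sorted(config.keys()))  # inspect which components/settings the run uses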
