Comments (6)
It seems I'm having an issue with Elasticsearch: I cannot access port 9200 on GCP. Let me try to resolve it, and I'll come back if the issue persists. Thank you again for your prompt responses.
from ircot.
Oops, it looks like those files didn't get added as they were in my .gitignore. I've added them now.
Regarding OOM: there is no batching anywhere in the inference; it's all one instance at a time. There are two things you can do to reduce memory usage, though:
- Instead of the default Flan-T5 models, use the bf16 versions. Say you want to use flan-t5-xxl-bf16 instead of flan-t5-xxl: you'll need to replace occurrences of the latter with the former in the experiment config of your choice. E.g., for the IRCoT QA Flan-T5-XXL MuSiQue run, you'll make that change in this file. Look at the file/config names in this folder and it should be clear. We ran our Flan-T5-XXL experiments on A100s (80G) and the rest on A6000s (48G) without bf16. If you use bf16, you can do all experiments on 48G GPUs. In the few experiments I tried, Huggingface's bf16 versions gave the same performance, but I haven't made an exhaustive comparison.
- If it still doesn't fit in your GPU memory, you can reduce the maximum token allowance for the model by changing model_tokens_limit in the experiment config (e.g., this one) from 6000 to a lower number. This will have some impact on performance, but it may not be large, depending on how much you have to reduce the context.
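The two changes above can be sketched as a config excerpt. This is a hypothetical fragment, not a complete ircot experiment config; the model_name key and surrounding structure are assumptions for illustration, while model_tokens_limit and the model names are the ones discussed above:

```jsonnet
// Hypothetical excerpt of an experiment config (other fields omitted).
{
  // 1) Use the bf16 checkpoint instead of the full-precision one:
  model_name: "flan-t5-xxl-bf16",  // was "flan-t5-xxl"
  // 2) If memory is still tight, lower the context budget:
  model_tokens_limit: 4000,  // was 6000; smaller values save memory but may cost accuracy
}
```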
from ircot.
Thank you for the quick & detailed explanation.
I'm new to FastAPI. For the jsonnet files you provided, can I use http://localhost when running on the server? I'm running the code on my GCP instance, but I wasn't sure if I'm running FastAPI properly.
When I run the script below:
./reproduce.sh ircot flan-t5-xxl hotpotqa
I get a message as follows:
Token indices sequence length is longer than the specified maximum sequence length for this model (558 > 512). Running this sequence through the model will result in indexing errors
Running inference on examples
0it [00:00, ?it/s]Post request didn't succeed. Will wait 20s and retry.
Post request didn't succeed. Will wait 20s and retry.
and the message repeats.
from ircot.
You can ignore "Token indices sequence length is longer than the specified maximum sequence length for this model (558 > 512)"; it's coming from HF.
"Post request didn't succeed. Will wait 20s and retry." means that your client cannot connect to the server, which can happen for many reasons. Try putting a breakpoint at that point and see what response = requests.post(url, json=params); print(response.text) gives you. Feel free to post it here again if you need help.
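If breakpoints are inconvenient, the same check can be done with a small standalone probe. This is a stdlib sketch equivalent to the requests.post call above, not ircot's actual client code; the URL and params below are placeholders, not the exact ones ircot uses:

```python
# Stdlib probe for diagnosing why "Post request didn't succeed" keeps repeating.
# It distinguishes "server reachable but rejected the request" (HTTP error such
# as 405/422) from "server unreachable" (wrong host/port, or not running).
import json
import urllib.error
import urllib.request

def probe(url, params):
    """POST params as JSON to url and report what actually happened."""
    req = urllib.request.Request(
        url,
        data=json.dumps(params).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req, timeout=5) as resp:
            return f"HTTP {resp.status}: {resp.read().decode()}"
    except urllib.error.HTTPError as e:
        # Server is up but rejected the request (e.g. 405 Method Not Allowed).
        return f"HTTP {e.code}: {e.read().decode()}"
    except urllib.error.URLError as e:
        # Could not connect at all: server down or wrong address/port.
        return f"Connection failed: {e.reason}"

# Example (placeholder URL/params):
# print(probe("http://localhost:8000/retrieve", {"query": "who wrote Hamlet?"}))
```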
from ircot.
Thank you for the answer.
The issue I'm having is with the retriever server. I'm able to access localhost:8000, which returns:
{
"message": "Hello! This is a retriever server."
}
However, when I run predict.py, the code sends a POST request to localhost:8000/retrieve, which returns:
{
"detail": "Method Not Allowed"
}
I'm running your predict.py, and the error message I'm getting (Post request didn't succeed. Will wait 20s and retry.) comes from ircot.py. I think I should not be getting "Method Not Allowed."
from ircot.
Can you confirm that the Method Not Allowed message is not obtained by visiting localhost:8000/retrieve in the browser, and is instead obtained by putting a breakpoint in this line and inspecting response.text? "Method Not Allowed" means there is no route matching the requested path (/retrieve) and method (GET, POST, etc.). If you visit it in the browser, that's a GET request, whereas the server expects a POST request. Let me know if this was already based on response.text, and I can dig into what's going on later.
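The GET-vs-POST distinction above is easy to reproduce in isolation. This is a minimal stdlib stand-in, not the actual ircot/FastAPI server: a /retrieve-style endpoint that only accepts POST, so a browser visit (which issues GET) gets back 405 "Method Not Allowed":

```python
# Minimal sketch of a POST-only endpoint (plain stdlib, not the real server).
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class RetrieverHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Browsers send GET; this endpoint is POST-only, so reject with 405.
        body = json.dumps({"detail": "Method Not Allowed"}).encode()
        self.send_response(405)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def do_POST(self):
        # A POST with a JSON body succeeds; here we just echo the params back.
        length = int(self.headers.get("Content-Length", 0))
        params = json.loads(self.rfile.read(length) or b"{}")
        body = json.dumps({"echo": params}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep request logging quiet

# To serve (blocks): HTTPServer(("localhost", 8000), RetrieverHandler).serve_forever()
```

Visiting this server in a browser shows the same {"detail": "Method Not Allowed"} body, while a JSON POST succeeds.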
Also, can you confirm that your Elasticsearch server is running fine and that you've already run the indexing scripts? You can check by running curl localhost:9200/_cat/indices. It should show the different indices and their sizes (these should roughly match what's given in the readme, though an exact size mismatch isn't a failure).
from ircot.