Comments (7)
Hi @YHWmz,
The 55.4 result is obtained by initializing from the VQAv2 checkpoint. Can you confirm you are loading this checkpoint?
If tuned directly on OKVQA from the pretrained checkpoint, ~45 is expected.
Thanks.
from lavis.
Thanks for your reply.
I found that the model is initialized from 'https://storage.googleapis.com/sfr-vision-language-research/BLIP/models/model_base_vqa_capfilt_large.pth' (LAVIS/lavis/models/base_model.py, line 101).
I am not sure whether 'model_base_vqa_capfilt_large.pth' is the VQAv2 checkpoint. By the way, I didn't change any code in LAVIS; I just ran the script LAVIS/run_scripts/blip/train/train_okvqa.sh.
from lavis.
I also tried running LAVIS/run_scripts/blip/train/train_aokvqa.sh, and the model is initialized from 'model_base_vqa_capfilt_large.pth' as well. The result is ~50, which is also lower than the 56.2 reported in the benchmark.
from lavis.
Are you using the same configuration as in the .yaml file? Note that the OKVQA and AOKVQA datasets are quite small. If you change the batch size etc., you may also need to adjust other hyperparameters accordingly to reach the best results.
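As a rough illustration of adjusting hyperparameters when the batch size changes, here is a minimal sketch of the linear scaling heuristic for the learning rate. This is an assumption for illustration, not something the LAVIS configs prescribe: the function name `scale_lr` and the reference values are hypothetical, and the real learning rate and batch size should be read from the .yaml file. The heuristic is only a starting point and may still need tuning on datasets this small.

```python
def scale_lr(base_lr: float, base_batch_size: int, new_batch_size: int) -> float:
    """Linear scaling heuristic: keep the ratio lr / batch_size constant."""
    if base_batch_size <= 0 or new_batch_size <= 0:
        raise ValueError("batch sizes must be positive")
    return base_lr * new_batch_size / base_batch_size


# Hypothetical reference values, for illustration only.
# Replace them with the learning rate and batch size from the .yaml config.
reference_lr = 2e-5
reference_batch_size = 32

# Suggested starting learning rate for a batch size of 12.
new_lr = scale_lr(reference_lr, reference_batch_size, new_batch_size=12)
print(f"suggested starting lr: {new_lr:.2e}")
```

When training on several GPUs, the batch size that matters for this heuristic is the effective one (per-GPU batch size times the number of GPUs).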
from lavis.
Oh, I changed the batch size to 12. Maybe that is the problem.
I'll let you know when I figure it out.
Thank you so much for the help.
from lavis.
Solved.
The problem was the batch size.
Thanks for your help again.
from lavis.
Glad to hear it.
from lavis.