Comments (4)
Hi,
Is there a way to access the confidence of the generated caption?
@jhwang7628 I have a similar question. Did you figure out any way to get the confidence level?
Currently this is not natively supported. I'll have to look into whether this is possible with the HF BERT implementation, so please expect some delay.
One alternative might be to forward the generated caption and the original image into the BLIP ITM model to get a matching score.
Thanks.
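For reference, a minimal sketch of that ITM workaround in LAVIS (the API follows the LAVIS image-text-matching examples; the image path and caption string are placeholders for your own inputs):

```python
import torch
from PIL import Image
from lavis.models import load_model_and_preprocess

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load the BLIP image-text matching (ITM) model from LAVIS.
model, vis_processors, text_processors = load_model_and_preprocess(
    name="blip_image_text_matching", model_type="base", is_eval=True, device=device
)

raw_image = Image.open("example.jpg").convert("RGB")  # placeholder: the original image
caption = "a photo of a dog on the beach"             # placeholder: the generated caption

img = vis_processors["eval"](raw_image).unsqueeze(0).to(device)
txt = text_processors["eval"](caption)

# The ITM head produces two-class logits (no-match / match); a softmax over
# them gives a match probability that can serve as a confidence proxy.
itm_output = model({"image": img, "text_input": txt}, match_head="itm")
match_prob = torch.nn.functional.softmax(itm_output, dim=1)[:, 1].item()
print(f"caption matches image with probability {match_prob:.3f}")
```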
Thanks @dxli94 for answering.
How can I get a confidence score for VQA using BLIP?
Also, how can I compare the results of VQA and report the accuracy of the model on my dataset? (I am only using a pre-trained model for now.)
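The accuracy question isn't answered in this thread, but if your dataset follows the VQAv2 convention of multiple human answers per question, the standard VQA accuracy metric is straightforward to compute yourself. A sketch with hypothetical predictions and annotations (if your dataset has a single ground-truth answer per question, plain exact-match accuracy is enough):

```python
def vqa_accuracy(predicted: str, human_answers: list[str]) -> float:
    """Standard VQA accuracy: an answer counts as fully correct if at
    least 3 of the human annotators gave it, i.e. min(#matches / 3, 1)."""
    matches = sum(a.strip().lower() == predicted.strip().lower() for a in human_answers)
    return min(matches / 3.0, 1.0)

# Hypothetical example: model predictions vs. per-question human answers.
predictions = ["yes", "2", "red"]
annotations = [["yes"] * 10, ["2"] * 6 + ["two"] * 4, ["blue"] * 10]

acc = sum(vqa_accuracy(p, a) for p, a in zip(predictions, annotations)) / len(predictions)
print(f"VQA accuracy: {acc:.2%}")
```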
@dxli94 How can I use BLIP ITM on text and image to predict the confidence? Can you share any resource or notebook? Is it possible using LAVIS?
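This is doable entirely in LAVIS; the ITM sketch earlier in this thread shows the match-probability head. Per the LAVIS examples, the same model also exposes an ITC head that returns a raw image-text cosine similarity, which can be useful for ranking candidate captions. Continuing from that sketch (same `model`, `img`, `txt`):

```python
# ITC head: returns an image-text cosine similarity instead of a
# classifier probability.
itc_score = model({"image": img, "text_input": txt}, match_head="itc")
print(f"image-text cosine similarity: {itc_score.item():.3f}")
```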