Comments (2)
Hi @quickgrid, thanks for your interest.
You are right: a 0/1 tensor is expected, 0 for False and 1 for True. The output is a binary logit. If you apply softmax to the logits, you get the probabilities for the False / True prediction.
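As a rough illustration of the softmax step, here is a minimal sketch in plain Python. It assumes the model returns two logits per sample, ordered [False, True] to match the 0/1 labels; the logit values are made up for the example and do not come from a real model run. With PyTorch tensors, `torch.softmax(logits, dim=-1)` would do the same job.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)  # subtract the max to avoid overflow in exp
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for one NLVR sample, assumed order: [False, True].
logits = [-1.2, 2.3]

probs = softmax(logits)
# Index of the larger probability: 0 = False, 1 = True.
predicted_label = probs.index(max(probs))

print(f"P(False)={probs[0]:.3f}, P(True)={probs[1]:.3f}, label={predicted_label}")
```

The predicted label can then be compared directly with the 0/1 ground-truth label from the dataset.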
Please see our NLVR dataset module for more details.
Thanks.
Thank you @dxli94 for the quick response and for clearing up my confusion.
I have looked at the linked PyTorch dataset code and an actual dataset sample. The input labels and the output, a binary logit for the predicted label, now make sense.
One more question: how can I get the attention map of the NLVR model on one or both images, as in this text localization example?