Awesome Hallucination Papers in MLLMs

A curated list of papers about hallucination in multi-modal large language models (MLLMs).

Survey Papers

This section collects survey papers on hallucination in MLLMs.

A Survey on Hallucination in Large Vision-Language Models [paper]

Arxiv 2024/02

Benchmark Papers

This section collects benchmark papers that evaluate hallucination in MLLMs.

Evaluating Object Hallucination in Large Vision-Language Models [paper] [code]

EMNLP 2023
HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models [paper] [code]

CVPR 2024
Aligning Large Multimodal Models with Factually Augmented RLHF [paper] [code]

Arxiv 2023/09
An LLM-free Multi-dimensional Benchmark for MLLMs Hallucination Evaluation [paper] [code]

Arxiv 2023/11
Holistic Analysis of Hallucination in GPT-4V(ision): Bias and Interference Challenges [paper] [code]

Arxiv 2023/11
Hallucination Benchmark in Medical Visual Question Answering [paper]

Arxiv 2024/01
The Instinctive Bias: Spurious Images lead to Hallucination in MLLMs [paper] [code]

Arxiv 2024/02
Unified Hallucination Detection for Multimodal Large Language Models [paper] [code]

Arxiv 2024/02
Visual Hallucinations of Multi-modal Large Language Models [paper] [code]

Arxiv 2024/02
Hal-Eval: A Universal and Fine-grained Hallucination Evaluation Framework for Large Vision Language Models [paper]

Arxiv 2024/02
PhD: A Prompted Visual Hallucination Evaluation Dataset [paper] [code]

Arxiv 2024/03
Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models [paper] [code]

Arxiv 2024/04
THRONE: An Object-based Hallucination Benchmark for the Free-form Generations of Large Vision-Language Models [paper]

Arxiv 2024/05

Hallucination Mitigation

This section collects papers on mitigating hallucination in MLLMs.

Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning [paper] [code]

ICLR 2024
Analyzing and Mitigating Object Hallucination in Large Vision-Language Models [paper] [code]

ICLR 2024
VIGC: Visual Instruction Generation and Correction [paper] [code]

AAAI 2024
OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation [paper] [code]

CVPR 2024
Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding [paper] [code]

CVPR 2024
Hallucination Augmented Contrastive Learning for Multimodal Large Language Model [paper]

CVPR 2024
RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback [paper] [code]

CVPR 2024
Detecting and Preventing Hallucinations in Large Vision Language Models [paper]

Arxiv 2023/08
Evaluation and Analysis of Hallucination in Large Vision-Language Models [paper] [code]

Arxiv 2023/08
CIEM: Contrastive Instruction Evaluation Method for Better Instruction Tuning [paper]

Arxiv 2023/09
Evaluation and Mitigation of Agnosia in Multimodal Large Language Models [paper]

Arxiv 2023/09
Aligning Large Multimodal Models with Factually Augmented RLHF [paper] [code]

Arxiv 2023/09
HallE-Switch: Rethinking and Controlling Object Existence Hallucinations in Large Vision Language Models for Detailed Caption [paper]

Arxiv 2023/10
Woodpecker: Hallucination Correction for Multimodal Large Language Models [paper] [code]

Arxiv 2023/10
HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data [paper] [code]

Arxiv 2023/11
VOLCANO: Mitigating Multimodal Hallucination through Self-Feedback Guided Revision [paper] [code]

Arxiv 2023/11
Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization [paper]

Arxiv 2023/11
Mitigating Hallucination in Visual Language Models with Visual Supervision [paper]

Arxiv 2023/11
Mitigating Fine-Grained Hallucination by Fine-Tuning Large Vision-Language Models with Caption Rewrites [paper] [code]

Arxiv 2023/12
MOCHa: Multi-Objective Reinforcement Mitigating Caption Hallucinations [paper] [code]

Arxiv 2023/12
Temporal Insight Enhancement: Mitigating Temporal Hallucination in Multimodal Large Language Models [paper]

Arxiv 2024/01
On the Audio Hallucinations in Large Audio-Video Language Models [paper]

Arxiv 2024/01
Skip \n: A simple method to reduce hallucination in Large Vision-Language Models [paper]

Arxiv 2024/02
Unified Hallucination Detection for Multimodal Large Language Models [paper] [code]

Arxiv 2024/02
Mitigating Object Hallucination in Large Vision-Language Models via Classifier-Free Guidance [paper]

Arxiv 2024/02
EFUF: Efficient Fine-grained Unlearning Framework for Mitigating Hallucinations in Multimodal Large Language Models [paper]

Arxiv 2024/02
Logical Closed Loop: Uncovering Object Hallucinations in Large Vision-Language Models [paper] [code]

Arxiv 2024/02
Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective [paper] [code]

Arxiv 2024/02
Seeing is Believing: Mitigating Hallucination in Large Vision-Language Models via CLIP-Guided Decoding [paper]

Arxiv 2024/02
IBD: Alleviating Hallucinations in Large Vision-Language Models via Image-Biased Decoding [paper]

Arxiv 2024/02
HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding [paper] [code]

Arxiv 2024/03
Evaluating and Mitigating Number Hallucinations in Large Vision-Language Models: A Consistency Perspective [paper]

Arxiv 2024/03
Debiasing Large Visual Language Models [paper]

Arxiv 2024/03
AIGCs Confuse AI Too: Investigating and Explaining Synthetic Image-induced Hallucinations in Large Vision-Language Models [paper]

Arxiv 2024/03
What if...?: Counterfactual Inception to Mitigate Hallucination Effects in Large Multimodal Models [paper]

Arxiv 2024/03
Multi-Modal Hallucination Control by Visual Information Grounding [paper]

Arxiv 2024/03
Pensieve: Retrospect-then-Compare Mitigates Visual Hallucination [paper] [code]

Arxiv 2024/03
Hallucination Detection in Foundation Models for Decision-Making: A Flexible Definition and Review of the State of the Art [paper]

Arxiv 2024/03
Cartoon Hallucinations Detection: Pose-aware In Context Visual Learning [paper]

Arxiv 2024/03
Visual Hallucination: Definition, Quantification, and Prescriptive Remediations [paper]

Arxiv 2024/03
Exploiting Semantic Reconstruction to Mitigate Hallucinations in Vision-Language Models [paper]

Arxiv 2024/03
Mitigating Hallucinations in Large Vision-Language Models with Instruction Contrastive Decoding [paper]

Arxiv 2024/03
Automated Multi-level Preference for MLLMs [paper]

Arxiv 2024/05
CrossCheckGPT: Universal Hallucination Ranking for Multimodal Foundation Models [paper]

Arxiv 2024/05
VDGD: Mitigating LVLM Hallucinations in Cognitive Prompts by Bridging the Visual Perception Gap [paper]

Arxiv 2024/05
Alleviating Hallucinations in Large Vision-Language Models through Hallucination-Induced Optimization [paper]

Arxiv 2024/05
Mitigating Dialogue Hallucination for Large Vision Language Models via Adversarial Instruction Tuning [paper]

Arxiv 2024/05
RITUAL: Random Image Transformations as a Universal Anti-hallucination Lever in LVLMs [paper]

Arxiv 2024/05
MetaToken: Detecting Hallucination in Image Descriptions by Meta Classification [paper]

Arxiv 2024/05
Mitigating Object Hallucination via Data Augmented Contrastive Tuning [paper]

Arxiv 2024/05
NoiseBoost: Alleviating Hallucination with Noise Perturbation for Multimodal Large Language Models [paper] [code]

Arxiv 2024/06
CODE: Contrasting Self-generated Description to Combat Hallucination in Large Multi-modal Models [paper] [code]

Arxiv 2024/06
Understanding Sounds, Missing the Questions: The Challenge of Object Hallucination in Large Audio-Language Models [paper]

Arxiv 2024/06

Repository: pritamqu/awesome-mllm-hallucination (GitHub)
