
awesome-explainable-ai's Introduction

Recent Publications in Explainable AI

A repository of recent explainable AI (XAI) and interpretable ML papers, organized by year

2015

Title Venue Year Code Keywords Summary
Intelligible Models for HealthCare: Predicting Pneumonia Risk and Hospital 30-day Readmission KDD 2015 N/A ``
Interpretable classifiers using rules and Bayesian analysis: Building a better stroke prediction model arXiv 2015 N/A ``

2016

Title Venue Year Code Keywords Summary
Interpretable Decision Sets: A Joint Framework for Description and Prediction KDD 2016 N/A ``
"Why Should I Trust You?": Explaining the Predictions of Any Classifier KDD 2016 N/A ``

2017

Title Venue Year Code Keywords Summary
Towards A Rigorous Science of Interpretable Machine Learning arXiv 2017 N/A Review Paper
Transparency: Motivations and Challenges arXiv 2017 N/A Review Paper
A Unified Approach to Interpreting Model Predictions NeurIPS 2017 N/A ``
SmoothGrad: removing noise by adding noise ICML (Workshop) 2017 Github ``
Axiomatic Attribution for Deep Networks ICML 2017 N/A ``
Learning Important Features Through Propagating Activation Differences ICML 2017 N/A ``
Understanding Black-box Predictions via Influence Functions ICML 2017 N/A ``
Network Dissection: Quantifying Interpretability of Deep Visual Representations CVPR 2017 N/A ``
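Several 2017 entries above (Axiomatic Attribution for Deep Networks, SmoothGrad) introduced path- and gradient-based feature attribution. A minimal sketch of Integrated Gradients on a logistic-regression model, where the gradient is available in closed form; the weights and inputs are made up for illustration:

```python
import numpy as np

# Sketch of Integrated Gradients (Sundararajan et al., 2017) on a
# logistic-regression model. All numbers below are illustrative.

def model(x, w, b):
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))  # sigmoid output

def grad(x, w, b):
    p = model(x, w, b)
    return p * (1.0 - p) * w  # d/dx of sigmoid(w.x + b)

def integrated_gradients(x, baseline, w, b, steps=100):
    # Average the gradient along the straight path from baseline to x
    # (midpoint rule), then scale by (x - baseline).
    alphas = (np.arange(steps) + 0.5) / steps
    avg_grad = np.mean(
        [grad(baseline + a * (x - baseline), w, b) for a in alphas], axis=0
    )
    return (x - baseline) * avg_grad

w = np.array([2.0, -1.0, 0.5])
b = 0.0
x = np.array([1.0, 1.0, 1.0])
baseline = np.zeros(3)

attr = integrated_gradients(x, baseline, w, b)
# Completeness axiom: attributions sum to f(x) - f(baseline).
print(attr, attr.sum(), model(x, w, b) - model(baseline, w, b))
```

The completeness check at the end is the axiom the paper's title refers to: the per-feature attributions account exactly for the change in model output.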

2018

Title Venue Year Code Keywords Summary
Explainable Prediction of Medical Codes from Clinical Text ACL 2018 N/A ``
Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV) ICML 2018 N/A ``
Counterfactual Explanations without Opening the Black Box: Automated Decisions and the GDPR HJTL 2018 N/A ``
Sanity Checks for Saliency Maps NeurIPS 2018 N/A ``
Deep Learning for Case-Based Reasoning through Prototypes: A Neural Network that Explains Its Predictions AAAI 2018 N/A ``
The Mythos of Model Interpretability arXiv 2018 N/A Review Paper
Deep Neural Networks Constrained by Decision Rules AAAI 2018 N/A ``

2019

Title Venue Year Code Keywords Summary
Human Evaluation of Models Built for Interpretability AAAI 2019 N/A Human in the loop
Data Shapley: Equitable Valuation of Data for Machine Learning ICML 2019 N/A ``
Attention is not Explanation ACL 2019 N/A ``
Actionable Recourse in Linear Classification FAccT 2019 N/A ``
Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead Nature Machine Intelligence 2019 N/A ``
Explanations can be manipulated and geometry is to blame NeurIPS 2019 N/A ``
Learning Optimized Risk Scores JMLR 2019 N/A ``
Explain Yourself! Leveraging Language Models for Commonsense Reasoning ACL 2019 N/A ``
Towards Automatic Concept-based Explanations NeurIPS 2019 Github ``
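Shapley-value attribution underlies both the 2017 SHAP paper and Data Shapley above. For a handful of features the values can be computed exactly by enumerating coalitions; the toy model and background point below are illustrative:

```python
import numpy as np
from itertools import combinations
from math import factorial

# Exact Shapley values for a tiny model by enumerating all feature
# coalitions. v(S) evaluates the model with features outside S replaced
# by a background value; model and background are illustrative.

def shapley_values(f, x, background):
    n = len(x)
    phi = np.zeros(n)

    def v(S):
        z = background.copy()
        z[list(S)] = x[list(S)]
        return f(z)

    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            for S in combinations(others, k):
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi[i] += weight * (v(S + (i,)) - v(S))
    return phi

f = lambda z: 3.0 * z[0] + 2.0 * z[1] * z[2]   # toy model with an interaction
x = np.array([1.0, 1.0, 1.0])
bg = np.zeros(3)
phi = shapley_values(f, x, bg)
# Efficiency axiom: contributions sum to f(x) - f(background).
print(phi, phi.sum(), f(x) - f(bg))
```

Note how the interaction term's credit is split evenly between features 1 and 2, which is exactly the symmetric treatment the Shapley axioms guarantee; the exponential cost of the enumeration is why SHAP and its successors (e.g. WeightedSHAP, 2022) rely on approximations.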

2020

Title Venue Year Code Keywords Summary
Interpreting the Latent Space of GANs for Semantic Face Editing CVPR 2020 N/A ``
GANSpace: Discovering Interpretable GAN Controls NeurIPS 2020 N/A ``
Explainability for fair machine learning arXiv 2020 N/A ``
An Introduction to Circuits Distill 2020 N/A Tutorial
Beyond Individualized Recourse: Interpretable and Interactive Summaries of Actionable Recourses NeurIPS 2020 N/A ``
Learning Model-Agnostic Counterfactual Explanations for Tabular Data WWW 2020 N/A ``
Fooling LIME and SHAP: Adversarial Attacks on Post hoc Explanation Methods AIES (AAAI) 2020 N/A ``
Interpreting Interpretability: Understanding Data Scientists’ Use of Interpretability Tools for Machine Learning CHI 2020 N/A Review Paper
Human Factors in Model Interpretability: Industry Practices, Challenges, and Needs arXiv 2020 N/A Review Paper
Human-Driven FOL Explanations of Deep Learning IJCAI 2020 N/A Logic Explanations
A Constraint-Based Approach to Learning and Explanation AAAI 2020 N/A Mutual Information
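Counterfactual explanation, the topic of the 2020 entry on model-agnostic counterfactuals for tabular data (and of the 2018 GDPR paper), asks for the smallest input change that flips a decision. For a linear classifier this has a closed form (a projection onto the decision boundary); the model and instance below are illustrative:

```python
import numpy as np

# Minimal-change counterfactual for a linear classifier w.x + b,
# in the spirit of Wachter et al. The minimal L2 change is a
# projection onto the decision boundary. Illustrative numbers.

def counterfactual(x, w, b, margin=1e-6):
    score = w @ x + b
    # Move along w just far enough to cross the boundary.
    delta = -(score + np.sign(score) * margin) / (w @ w) * w
    return x + delta

w = np.array([1.0, -2.0])
b = -0.5
x = np.array([2.0, 0.5])        # classified positive: score = 0.5
x_cf = counterfactual(x, w, b)
print(x_cf, w @ x_cf + b)       # score now just below zero
```

For non-linear black boxes no such closed form exists, which is where the gradient- and search-based methods in the entries above come in.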

2021

Title Venue Year Code Keywords Summary
A Learning Theoretic Perspective on Local Explainability ICLR 2021 N/A ``
Do Input Gradients Highlight Discriminative Features? NeurIPS 2021 N/A ``
Explaining by Removing: A Unified Framework for Model Explanation JMLR 2021 N/A ``
Explainable Active Learning (XAL): An Empirical Study of How Local Explanations Impact Annotator Experience PACMHCI 2021 N/A ``
Towards Robust and Reliable Algorithmic Recourse NeurIPS 2021 N/A ``
A Framework to Learn with Interpretation NeurIPS 2021 N/A ``
Algorithmic Recourse: from Counterfactual Explanations to Interventions FAccT 2021 N/A ``
Manipulating and Measuring Model Interpretability CHI 2021 N/A ``
Explainable Reinforcement Learning via Model Transforms NeurIPS 2021 N/A ``
Aligning Artificial Neural Networks and Ontologies towards Explainable AI AAAI 2021 N/A ``
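"Explaining by Removing" (2021, above) unifies many attribution methods around a single idea: remove a feature and measure the change in model output. An occlusion-style sketch of that idea on a toy linear model:

```python
import numpy as np

# Removal-based importance: the importance of feature i is the drop in
# model output when i is replaced by a background value. Toy model.

def removal_importance(f, x, background):
    base = f(x)
    imp = np.zeros_like(x)
    for i in range(len(x)):
        z = x.copy()
        z[i] = background[i]
        imp[i] = base - f(z)
    return imp

f = lambda z: 2.0 * z[0] + 0.5 * z[1] - z[2]
x = np.array([1.0, 2.0, 3.0])
bg = np.zeros(3)
print(removal_importance(f, x, bg))  # [ 2.  1. -3.]
```

The paper's framework varies exactly three choices hidden in this sketch: how features are removed (the background), what behavior is measured, and how per-subset effects are summarized.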

2022

Title Venue Year Code Keywords Summary
GlanceNets: Interpretable, Leak-proof Concept-based Models CRL 2022 N/A ``
Mechanistic Interpretability, Variables, and the Importance of Interpretable Bases Transformer Circuit Thread 2022 N/A Tutorial
Can language models learn from explanations in context? EMNLP 2022 N/A DeepMind
Interpreting Language Models with Contrastive Explanations EMNLP 2022 N/A ``
Acquisition of Chess Knowledge in AlphaZero PNAS 2022 N/A DeepMind GoogleBrain
What the DAAM: Interpreting Stable Diffusion Using Cross Attention arXiv 2022 Github ``
Exploring Counterfactual Explanations Through the Lens of Adversarial Examples: A Theoretical and Empirical Analysis AISTATS 2022 N/A ``
Use-Case-Grounded Simulations for Explanation Evaluation NeurIPS 2022 N/A ``
The Disagreement Problem in Explainable Machine Learning: A Practitioner's Perspective arXiv 2022 N/A ``
What Makes a Good Explanation?: A Harmonized View of Properties of Explanations arXiv 2022 N/A ``
NoiseGrad — Enhancing Explanations by Introducing Stochasticity to Model Weights AAAI 2022 Github ``
Fairness via Explanation Quality: Evaluating Disparities in the Quality of Post hoc Explanations AIES (AAAI) 2022 N/A ``
DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generative Models arXiv 2022 Github ``
Concept Embedding Models: Beyond the Accuracy-Explainability Trade-Off NeurIPS 2022 Github CBM, CEM
Self-explaining deep models with logic rule reasoning NeurIPS 2022 N/A ``
What You See is What You Classify: Black Box Attributions NeurIPS 2022 N/A ``
Concept Activation Regions: A Generalized Framework For Concept-Based Explanations NeurIPS 2022 N/A ``
What I Cannot Predict, I Do Not Understand: A Human-Centered Evaluation Framework for Explainability Methods NeurIPS 2022 N/A ``
Scalable Interpretability via Polynomials NeurIPS 2022 N/A ``
Learning to Scaffold: Optimizing Model Explanations for Teaching NeurIPS 2022 N/A ``
Listen to Interpret: Post-hoc Interpretability for Audio Networks with NMF NeurIPS 2022 N/A ``
WeightedSHAP: analyzing and improving Shapley based feature attribution NeurIPS 2022 N/A ``
Visual correspondence-based explanations improve AI robustness and human-AI team accuracy NeurIPS 2022 N/A ``
VICE: Variational Interpretable Concept Embeddings NeurIPS 2022 N/A ``
Robust Feature-Level Adversaries are Interpretability Tools NeurIPS 2022 N/A ``
ProtoX: Explaining a Reinforcement Learning Agent via Prototyping NeurIPS 2022 N/A ``
ProtoVAE: A Trustworthy Self-Explainable Prototypical Variational Model NeurIPS 2022 N/A ``
Where do Models go Wrong? Parameter-Space Saliency Maps for Explainability NeurIPS 2022 N/A ``
Neural Basis Models for Interpretability NeurIPS 2022 N/A ``
Implications of Model Indeterminacy for Explanations of Automated Decisions NeurIPS 2022 N/A ``
Explainability Via Causal Self-Talk NeurIPS 2022 N/A DeepMind
TalkToModel: Explaining Machine Learning Models with Interactive Natural Language Conversations NeurIPS 2022 N/A ``
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models NeurIPS 2022 N/A GoogleBrain
OpenXAI: Towards a Transparent Evaluation of Model Explanations NeurIPS 2022 N/A ``
Which Explanation Should I Choose? A Function Approximation Perspective to Characterizing Post Hoc Explanations NeurIPS 2022 N/A ``
Foundations of Symbolic Languages for Model Interpretability NeurIPS 2022 N/A ``
The Utility of Explainable AI in Ad Hoc Human-Machine Teaming NeurIPS 2022 N/A ``
Addressing Leakage in Concept Bottleneck Models NeurIPS 2022 N/A ``
Logical Reasoning with Span-Level Predictions for Interpretable and Robust NLI Models EMNLP 2022 N/A ``
Maieutic Prompting: Logically Consistent Reasoning with Recursive Explanations EMNLP 2022 N/A ``
MetaLogic: Logical Reasoning Explanations with Fine-Grained Structure EMNLP 2022 N/A ``
Towards Interactivity and Interpretability: A Rationale-based Legal Judgment Prediction Framework EMNLP 2022 N/A ``
Explainable Question Answering based on Semantic Graph by Global Differentiable Learning and Dynamic Adaptive Reasoning EMNLP 2022 N/A ``
Faithful Knowledge Graph Explanations in Commonsense Question Answering EMNLP 2022 N/A ``
Optimal Interpretable Clustering Using Oblique Decision Trees KDD 2022 N/A ``
ExMeshCNN: An Explainable Convolutional Neural Network Architecture for 3D Shape Analysis KDD 2022 N/A ``
Learning Differential Operators for Interpretable Time Series Modeling KDD 2022 N/A ``
Compute Like Humans: Interpretable Step-by-step Symbolic Computation with Deep Neural Network KDD 2022 N/A ``
Causal Attention for Interpretable and Generalizable Graph Classification KDD 2022 N/A ``
Group-wise Reinforcement Feature Generation for Optimal and Explainable Representation Space Reconstruction KDD 2022 N/A ``
Label-Free Explainability for Unsupervised Models ICML 2022 N/A ``
Rethinking Attention-Model Explainability through Faithfulness Violation Test ICML 2022 N/A ``
Hierarchical Shrinkage: Improving the Accuracy and Interpretability of Tree-Based Methods ICML 2022 N/A ``
A Functional Information Perspective on Model Interpretation ICML 2022 N/A ``
Inducing Causal Structure for Interpretable Neural Networks ICML 2022 N/A ``
ViT-NeT: Interpretable Vision Transformers with Neural Tree Decoder ICML 2022 N/A ``
Interpretable Neural Networks with Frank-Wolfe: Sparse Relevance Maps and Relevance Orderings ICML 2022 N/A ``
Interpretable and Generalizable Graph Learning via Stochastic Attention Mechanism ICML 2022 N/A ``
Unraveling Attention via Convex Duality: Analysis and Interpretations of Vision Transformers ICML 2022 N/A ``
Robust Models Are More Interpretable Because Attributions Look Normal ICML 2022 N/A ``
Latent Diffusion Energy-Based Model for Interpretable Text Modelling ICML 2022 N/A ``
Crowd, Expert & AI: A Human-AI Interactive Approach Towards Natural Language Explanation based COVID-19 Misinformation Detection IJCAI 2022 N/A ``
AttExplainer: Explain Transformer via Attention by Reinforcement Learning IJCAI 2022 N/A ``
Investigating and explaining the frequency bias in classification IJCAI 2022 N/A ``
Counterfactual Interpolation Augmentation (CIA): A Unified Approach to Enhance Fairness and Explainability of DNN IJCAI 2022 N/A ``
Axiomatic Foundations of Explainability IJCAI 2022 N/A ``
Explaining Soft-Goal Conflicts through Constraint Relaxations IJCAI 2022 N/A ``
Robust Interpretable Text Classification against Spurious Correlations Using AND-rules with Negation IJCAI 2022 N/A ``
Interpretable AMR-Based Question Decomposition for Multi-hop Question Answering IJCAI 2022 N/A ``
Toward Policy Explanations for Multi-Agent Reinforcement Learning IJCAI 2022 N/A ``
“My nose is running.” “Are you also coughing?”: Building A Medical Diagnosis Agent with Interpretable Inquiry Logics IJCAI 2022 N/A ``
Model Stealing Defense against Exploiting Information Leak Through the Interpretation of Deep Neural Nets IJCAI 2022 N/A ``
Learning by Interpreting IJCAI 2022 N/A ``
Using Constraint Programming and Graph Representation Learning for Generating Interpretable Cloud Security Policies IJCAI 2022 N/A ``
Explanations for Negative Query Answers under Inconsistency-Tolerant Semantics IJCAI 2022 N/A ``
On Preferred Abductive Explanations for Decision Trees and Random Forests IJCAI 2022 N/A ``
Adversarial Explanations for Knowledge Graph Embeddings IJCAI 2022 N/A ``
Looking Inside the Black-Box: Logic-based Explanations for Neural Networks KR 2022 N/A ``
Entropy-Based Logic Explanations of Neural Networks AAAI 2022 N/A ``
Explainable Neural Rule Learning WWW 2022 N/A ``
Explainable Deep Learning: A Field Guide for the Uninitiated JAIR 2022 N/A ``
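Concept bottleneck models recur throughout the 2022 entries (Concept Embedding Models, Addressing Leakage in Concept Bottleneck Models). The architecture predicts human-interpretable concepts first and the label from concepts only, which enables test-time interventions. A forward-pass sketch with made-up weights:

```python
import numpy as np

# Concept bottleneck model sketch: x -> concepts c -> label y.
# The label depends on x only through c, so an expert can correct
# a concept and see the prediction update. Weights are illustrative.

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cbm_forward(x, W_concept, W_label):
    c = sigmoid(x @ W_concept)   # concept predictions (the bottleneck)
    y = sigmoid(c @ W_label)     # label computed from concepts only
    return c, y

rng = np.random.default_rng(0)
W_concept = rng.normal(size=(4, 2))  # 4 inputs -> 2 concepts
W_label = rng.normal(size=(2,))      # 2 concepts -> 1 label
x = rng.normal(size=4)

c, y = cbm_forward(x, W_concept, W_label)
# Intervention: an expert sets concept 0 to 1, and the label updates.
c_fixed = c.copy()
c_fixed[0] = 1.0
y_intervened = sigmoid(c_fixed @ W_label)
print(c, y, y_intervened)
```

The "leakage" problem studied in the papers above arises when the concept activations carry extra, unintended information about the label beyond the concepts' semantics, undermining exactly this intervention story.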

2023

Title Venue Year Code Keywords Summary
On the Privacy Risks of Algorithmic Recourse AISTATS 2023 N/A ``
Towards Bridging the Gaps between the Right to Explanation and the Right to be Forgotten ICML 2023 N/A ``
Tracr: Compiled Transformers as a Laboratory for Interpretability NeurIPS 2023 Github DeepMind
Probabilistically Robust Recourse: Navigating the Trade-offs between Costs and Robustness in Algorithmic Recourse ICLR 2023 N/A ``
Concept-level Debugging of Part-Prototype Networks ICLR 2023 N/A ``
Towards Interpretable Deep Reinforcement Learning Models via Inverse Reinforcement Learning ICLR 2023 N/A ``
Re-calibrating Feature Attributions for Model Interpretation ICLR 2023 N/A ``
Post-hoc Concept Bottleneck Models ICLR 2023 N/A ``
Quantifying Memorization Across Neural Language Models ICLR 2023 N/A ``
STREET: A Multi-Task Structured Reasoning and Explanation Benchmark ICLR 2023 N/A ``
PIP-Net: Patch-Based Intuitive Prototypes for Interpretable Image Classification CVPR 2023 N/A ``
EVAL: Explainable Video Anomaly Localization CVPR 2023 N/A ``
Overlooked Factors in Concept-based Explanations: Dataset Choice, Concept Learnability, and Human Capability CVPR 2023 Github ``
Spatial-Temporal Concept Based Explanation of 3D ConvNets CVPR 2023 Github ``
Adversarial Counterfactual Visual Explanations CVPR 2023 N/A ``
Bridging the Gap Between Model Explanations in Partially Annotated Multi-Label Classification CVPR 2023 N/A ``
Explaining Image Classifiers With Multiscale Directional Image Representation CVPR 2023 N/A ``
CRAFT: Concept Recursive Activation FacTorization for Explainability CVPR 2023 N/A ``
SketchXAI: A First Look at Explainability for Human Sketches CVPR 2023 N/A ``
Don't Lie to Me! Robust and Efficient Explainability With Verified Perturbation Analysis CVPR 2023 N/A ``
Gradient-Based Uncertainty Attribution for Explainable Bayesian Deep Learning CVPR 2023 N/A ``
Learning Bottleneck Concepts in Image Classification CVPR 2023 N/A ``
Language in a Bottle: Language Model Guided Concept Bottlenecks for Interpretable Image Classification CVPR 2023 N/A ``
Interpretable Neural-Symbolic Concept Reasoning ICML 2023 Github ``
Identifying Interpretable Subspaces in Image Representations ICML 2023 N/A ``
Explainability as statistical inference ICML 2023 N/A ``
On the Impact of Knowledge Distillation for Model Interpretability ICML 2023 N/A ``
NA2Q: Neural Attention Additive Model for Interpretable Multi-Agent Q-Learning ICML 2023 N/A ``
Explaining Reinforcement Learning with Shapley Values ICML 2023 N/A ``
Explainable Data-Driven Optimization: From Context to Decision and Back Again ICML 2023 N/A ``
Causal Proxy Models for Concept-based Model Explanations ICML 2023 N/A ``
Learning Perturbations to Explain Time Series Predictions ICML 2023 N/A ``
Rethinking Explaining Graph Neural Networks via Non-parametric Subgraph Matching ICML 2023 N/A ``
Dividing and Conquering a BlackBox to a Mixture of Interpretable Models: Route, Interpret, Repeat ICML 2023 Github ``
Representer Point Selection for Explaining Regularized High-dimensional Models ICML 2023 N/A ``
Towards Explaining Distribution Shifts ICML 2023 N/A ``
Relevant Walk Search for Explaining Graph Neural Networks ICML 2023 Github ``
Concept-based Explanations for Out-of-Distribution Detectors ICML 2023 N/A ``
GLOBE-CE: A Translation Based Approach for Global Counterfactual Explanations ICML 2023 Github ``
Robust Explanation for Free or At the Cost of Faithfulness ICML 2023 N/A ``
Learn to Accumulate Evidence from All Training Samples: Theory and Practice ICML 2023 N/A ``
Towards Trustworthy Explanation: On Causal Rationalization ICML 2023 N/A ``
Theoretical Behavior of XAI Methods in the Presence of Suppressor Variables ICML 2023 N/A ``
Probabilistic Concept Bottleneck Models ICML 2023 N/A ``
What do CNNs Learn in the First Layer and Why? A Linear Systems Perspective ICML 2023 N/A ``
Towards credible visual model interpretation with path attribution ICML 2023 N/A ``
Trainability, Expressivity and Interpretability in Gated Neural ODEs ICML 2023 N/A ``
Discover and Cure: Concept-aware Mitigation of Spurious Correlation ICML 2023 N/A ``
PWSHAP: A Path-Wise Explanation Model for Targeted Variables ICML 2023 N/A ``
A Closer Look at the Intervention Procedure of Concept Bottleneck Models ICML 2023 N/A ``
Counterfactual Analysis in Dynamic Latent-State Models ICML 2023 N/A ``
Tackling Shortcut Learning in Deep Neural Networks: An Iterative Approach with Interpretable Models ICML Workshop 2023 N/A ``
Rethinking Interpretation: Input-Agnostic Saliency Mapping of Deep Visual Classifiers AAAI 2023 N/A ``
TopicFM: Robust and Interpretable Topic-Assisted Feature Matching AAAI 2023 N/A ``
Solving Explainability Queries with Quantification: The Case of Feature Relevancy AAAI 2023 N/A ``
PEN: Prediction-Explanation Network to Forecast Stock Price Movement with Better Explainability AAAI 2023 N/A ``
KerPrint: Local-Global Knowledge Graph Enhanced Diagnosis Prediction for Retrospective and Prospective Interpretations AAAI 2023 N/A ``
Beyond Graph Convolutional Network: An Interpretable Regularizer-Centered Optimization Framework AAAI 2023 N/A ``
Learning to Select Prototypical Parts for Interpretable Sequential Data Modeling AAAI 2023 N/A ``
Learning Interpretable Temporal Properties from Positive Examples Only AAAI 2023 N/A ``
Symbolic Metamodels for Interpreting Black-Boxes Using Primitive Functions AAAI 2023 N/A ``
Towards More Robust Interpretation via Local Gradient Alignment AAAI 2023 N/A ``
Towards Fine-Grained Explainability for Heterogeneous Graph Neural Network AAAI 2023 N/A ``
XClusters: Explainability-First Clustering AAAI 2023 N/A ``
Global Concept-Based Interpretability for Graph Neural Networks via Neuron Analysis AAAI 2023 N/A ``
Fairness and Explainability: Bridging the Gap towards Fair Model Explanations AAAI 2023 N/A ``
Explaining Model Confidence Using Counterfactuals AAAI 2023 N/A ``
SEAT: Stable and Explainable Attention AAAI 2023 N/A ``
Factual and Informative Review Generation for Explainable Recommendation AAAI 2023 N/A ``
Improving Interpretability via Explicit Word Interaction Graph Layer AAAI 2023 N/A ``
Unveiling the Black Box of PLMs with Semantic Anchors: Towards Interpretable Neural Semantic Parsing AAAI 2023 N/A ``
Improving Interpretability of Deep Sequential Knowledge Tracing Models with Question-centric Cognitive Representations AAAI 2023 N/A ``
Targeted Knowledge Infusion To Make Conversational AI Explainable and Safe AAAI 2023 N/A ``
eForecaster: Unifying Electricity Forecasting with Robust, Flexible, and Explainable Machine Learning Algorithms AAAI 2023 N/A ``
SolderNet: Towards Trustworthy Visual Inspection of Solder Joints in Electronics Manufacturing Using Explainable Artificial Intelligence AAAI 2023 N/A ``
Xaitk-Saliency: An Open Source Explainable AI Toolkit for Saliency AAAI 2023 N/A ``
Ripple: Concept-Based Interpretation for Raw Time Series Models in Education AAAI 2023 N/A ``
Semantics, Ontology and Explanation arXiv 2023 N/A Ontological Unpacking
Post Hoc Explanations of Language Models Can Improve Language Models arXiv 2023 N/A ``
Multi-Aspect Explainable Inductive Relation Prediction by Sentence Transformer AAAI 2023 N/A ``
Unfooling Perturbation-Based Post Hoc Explainers AAAI 2023 N/A ``
Very Fast, Approximate Counterfactual Explanations for Decision Forests AAAI 2023 N/A ``
Local Explanations for Reinforcement Learning AAAI 2023 N/A ``
ConceptX: A Framework for Latent Concept Analysis AAAI 2023 N/A ``
Explaining Random Forests Using Bipolar Argumentation and Markov Networks AAAI 2023 N/A ``
XRand: Differentially Private Defense against Explanation-Guided Attacks AAAI 2023 N/A ``
Unsupervised Explanation Generation via Correct Instantiations AAAI 2023 N/A ``
Disentangled CVAEs with Contrastive Learning for Explainable Recommendation AAAI 2023 N/A ``
Interpretable Chirality-Aware Graph Neural Network for Quantitative Structure Activity Relationship Modeling in Drug Discovery AAAI 2023 N/A ``
Monitoring Model Deterioration with Explainable Uncertainty Estimation via Non-parametric Bootstrap AAAI 2023 N/A ``
Interactive Concept Bottleneck Models AAAI 2023 N/A ``
Data-Efficient and Interpretable Tabular Anomaly Detection KDD 2023 N/A ``
Counterfactual Learning on Heterogeneous Graphs with Greedy Perturbation KDD 2023 N/A ``
Hands-on Tutorial: "Explanations in AI: Methods, Stakeholders and Pitfalls" KDD 2023 N/A ``
Feature-based Learning for Diverse and Privacy-Preserving Counterfactual Explanations KDD 2023 N/A ``
Generative AI meets Responsible AI: Practical Challenges and Opportunities KDD 2023 N/A ``
Empower Post-hoc Graph Explanations with Information Bottleneck: A Pre-training and Fine-tuning Perspective KDD 2023 N/A ``
MixupExplainer: Generalizing Explanations for Graph Neural Networks with Data Augmentation KDD 2023 N/A ``
CounterNet: End-to-End Training of Prediction Aware Counterfactual Explanations KDD 2023 N/A ``
Fire: An Optimization Approach for Fast Interpretable Rule Extraction KDD 2023 N/A ``
ESSA: Explanation Iterative Supervision via Saliency-guided Data Augmentation KDD 2023 N/A ``
A Causality Inspired Framework for Model Interpretation KDD 2023 N/A ``
Path-Specific Counterfactual Fairness for Recommender Systems KDD 2023 N/A ``
SURE: Robust, Explainable, and Fair Classification without Sensitive Attributes KDD 2023 N/A ``
Learning for Counterfactual Fairness from Observational Data KDD 2023 N/A ``
Interpretable Sparsification of Brain Graphs: Better Practices and Effective Designs for Graph Neural Networks KDD 2023 N/A ``
ExplainableFold: Understanding AlphaFold Prediction with Explainable AI KDD 2023 N/A ``
FLAMES2Graph: An Interpretable Federated Multivariate Time Series Classification Framework KDD 2023 N/A ``
Counterfactual Explanations and Model Multiplicity: a Relational Verification View KR 2023 N/A ``
Explainable Representations for Relation Prediction in Knowledge Graphs KR 2023 N/A ``
Region-based Saliency Explanations on the Recognition of Facial Genetic Syndromes PMLR 2023 N/A ``
FunnyBirds: A Synthetic Vision Dataset for a Part-Based Analysis of Explainable AI Methods arXiv 2023 N/A ``
Diffusion-based Visual Counterfactual Explanations - Towards Systematic Quantitative Evaluation arXiv 2023 N/A ``
Testing methods of neural systems understanding Cognitive Systems Research 2023 N/A ``
Understanding CNN Hidden Neuron Activations Using Structured Background Knowledge and Deductive Reasoning arXiv 2023 N/A ``
An Explainable Federated Learning and Blockchain based Secure Credit Modeling Method EJOR 2023 N/A ``
i-Align: an interpretable knowledge graph alignment model DMKD 2023 N/A ``
Goodhart’s Law Applies to NLP’s Explanation Benchmarks arXiv 2023 N/A ``
DeLELSTM: Decomposition-based Linear Explainable LSTM to Capture Instantaneous and Long-term Effects in Time Series arXiv 2023 N/A ``
Beyond Discriminative Regions: Saliency Maps as Alternatives to CAMs for Weakly Supervised Semantic Segmentation arXiv 2023 N/A ``
SEA: Shareable and Explainable Attribution for Query-based Black-box Attacks arXiv 2023 N/A ``
Sparse Linear Concept Discovery Models arXiv 2023 N/A ``
Revisiting the Performance-Explainability Trade-Off in Explainable Artificial Intelligence (XAI) arXiv 2023 N/A ``
KGTN: Knowledge Graph Transformer Network for explainable multi-category item recommendation KBS 2023 N/A ``
SAFE: Saliency-Aware Counterfactual Explanations for DNN-based Automated Driving Systems arXiv 2023 N/A ``
Explainable Multi-Agent Reinforcement Learning for Temporal Queries IJCAI 2023 N/A ``
Advancing Post-Hoc Case-Based Explanation with Feature Highlighting IJCAI 2023 N/A ``
Explanation-Guided Reward Alignment IJCAI 2023 N/A ``
FEAMOE: Fair, Explainable and Adaptive Mixture of Experts IJCAI 2023 N/A ``
Statistically Significant Concept-based Explanation of Image Classifiers via Model Knockoffs IJCAI 2023 N/A ``
Learning Prototype Classifiers for Long-Tailed Recognition IJCAI 2023 N/A ``
On Translations between ML Models for XAI Purposes IJCAI 2023 N/A ``
The Parameterized Complexity of Finding Concise Local Explanations IJCAI 2023 N/A ``
Neuro-Symbolic Class Expression Learning IJCAI 2023 N/A ``
A Logic-based Approach to Contrastive Explainability for Neurosymbolic Visual Question Answering IJCAI 2023 N/A ``
Cardinality-Minimal Explanations for Monotonic Neural Networks IJCAI 2023 N/A ``
Unveiling Concepts Learned by a World-Class Chess-Playing Agent IJCAI 2023 N/A ``
Explainable Text Classification via Attentive and Targeted Mixing Data Augmentation IJCAI 2023 N/A ``
On the Complexity of Counterfactual Reasoning IJCAI 2023 N/A ``
Interpretable Local Concept-based Explanation with Human Feedback to Predict All-cause Mortality (Extended Abstract) IJCAI 2023 N/A ``
Good-looking but Lacking Faithfulness: Understanding Local Explanation Methods through Trend-based Testing arXiv 2023 N/A ``
Counterfactual Explanations via Locally-guided Sequential Algorithmic Recourse arXiv 2023 N/A ``
Flexible and Robust Counterfactual Explanations with Minimal Satisfiable Perturbations CIKM 2023 N/A ``
A Function Interpretation Benchmark for Evaluating Interpretability Methods arXiv 2023 N/A ``
Explaining through Transformer Input Sampling arXiv 2023 N/A ``
Backtracking Counterfactuals CLeaR 2023 N/A ``
Text2Concept: Concept Activation Vectors Directly from Text CVPR Workshop 2023 N/A ``
A Holistic Approach to Unifying Automatic Concept Extraction and Concept Importance Estimation arXiv 2023 N/A ``
Evaluating the Robustness of Interpretability Methods through Explanation Invariance and Equivariance NeurIPS 2023 Github ``
CLIP-Dissect: Automatic Description of Neuron Representations in Deep Vision Networks ICLR 2023 Github ``
Label-free Concept Bottleneck Models ICLR 2023 N/A ``
Towards Interpretable Deep Reinforcement Learning with Human-Friendly Prototypes ICLR 2023 N/A ``
Information Maximization Perspective of Orthogonal Matching Pursuit with Applications to Explainable AI NeurIPS 2023 N/A ``
Explaining Predictive Uncertainty with Information Theoretic Shapley Values NeurIPS 2023 N/A ``
REASONER: An Explainable Recommendation Dataset with Comprehensive Labeling Ground Truths NeurIPS 2023 N/A ``
Explain Any Concept: Segment Anything Meets Concept-Based Explanation NeurIPS 2023 N/A ``
VeriX: Towards Verified Explainability of Deep Neural Networks NeurIPS 2023 N/A ``
Explainable and Efficient Randomized Voting Rules NeurIPS 2023 N/A ``
TempME: Towards the Explainability of Temporal Graph Neural Networks via Motif Discovery NeurIPS 2023 N/A ``
Explaining the Uncertain: Stochastic Shapley Values for Gaussian Process Models NeurIPS 2023 N/A ``
V-InFoR: A Robust Graph Neural Networks Explainer for Structurally Corrupted Graphs NeurIPS 2023 N/A ``
Explainable Brain Age Prediction using coVariance Neural Networks NeurIPS 2023 N/A ``
D4Explainer: In-distribution Explanations of Graph Neural Network via Discrete Denoising Diffusion NeurIPS 2023 N/A ``
StateMask: Explaining Deep Reinforcement Learning through State Mask NeurIPS 2023 N/A ``
LICO: Explainable Models with Language-Image COnsistency NeurIPS 2023 N/A ``
On the explainable properties of 1-Lipschitz Neural Networks: An Optimal Transport Perspective NeurIPS 2023 N/A ``
Interpretable and Explainable Logical Policies via Neurally Guided Symbolic Abstraction NeurIPS 2023 N/A ``
Discriminative Feature Attributions: Bridging Post Hoc Explainability and Inherent Interpretability NeurIPS 2023 N/A ``
Train Once and Explain Everywhere: Pre-training Interpretable Graph Neural Networks NeurIPS 2023 N/A ``
Accountability in Offline Reinforcement Learning: Explaining Decisions with a Corpus of Examples NeurIPS 2023 N/A ``
HiBug: On Human-Interpretable Model Debug NeurIPS 2023 N/A ``
Towards Self-Interpretable Graph-Level Anomaly Detection NeurIPS 2023 N/A ``
Interpretable Graph Networks Formulate Universal Algebra Conjectures NeurIPS 2023 N/A ``
Towards Automated Circuit Discovery for Mechanistic Interpretability NeurIPS 2023 N/A ``
Interpretable Reward Redistribution in Reinforcement Learning: A Causal Approach NeurIPS 2023 N/A ``
DISCOVER: Making Vision Networks Interpretable via Competition and Dissection NeurIPS 2023 N/A ``
MultiMoDN—Multimodal, Multi-Task, Interpretable Modular Networks NeurIPS 2023 N/A ``
Causal Interpretation of Self-Attention in Pre-Trained Transformers NeurIPS 2023 N/A ``
Learning Interpretable Low-dimensional Representation via Physical Symmetry NeurIPS 2023 N/A ``
Scale Alone Does not Improve Mechanistic Interpretability in Vision Models NeurIPS 2023 N/A ``
Transitivity Recovering Decompositions: Interpretable and Robust Fine-Grained Relationships NeurIPS 2023 N/A ``
GRAND-SLAMIN’ Interpretable Additive Modeling with Structural Constraints NeurIPS 2023 N/A ``
Interpreting Unsupervised Anomaly Detection in Security via Rule Extraction NeurIPS 2023 N/A ``
GPEX, A Framework For Interpreting Artificial Neural Networks NeurIPS 2023 N/A ``
Dynamic Context Pruning for Efficient and Interpretable Autoregressive Transformers NeurIPS 2023 N/A ``
ParaFuzz: An Interpretability-Driven Technique for Detecting Poisoned Samples in NLP NeurIPS 2023 N/A ``
On the Identifiability and Interpretability of Gaussian Process Models NeurIPS 2023 N/A ``
BasisFormer: Attention-based Time Series Forecasting with Learnable and Interpretable Basis NeurIPS 2023 N/A ``
Evaluating the Robustness of Interpretability Methods through Explanation Invariance and Equivariance NeurIPS 2023 N/A ``
Evaluating Neuron Interpretation Methods of NLP Models NeurIPS 2023 N/A ``
FIND: A Function Description Benchmark for Evaluating Interpretability Methods NeurIPS 2023 N/A ``
How does GPT-2 compute greater-than?: Interpreting mathematical abilities in a pre-trained language model NeurIPS 2023 N/A ``
Interpretable Prototype-based Graph Information Bottleneck NeurIPS 2023 N/A ``
Interpretability at Scale: Identifying Causal Mechanisms in Alpaca NeurIPS 2023 N/A ``
M4: A Unified XAI Benchmark for Faithfulness Evaluation of Feature Attribution Methods across Metrics, Modalities and Models NeurIPS 2023 N/A ``
InstructSafety: A Unified Framework for Building Multidimensional and Explainable Safety Detector through Instruction Tuning EMNLP 2023 N/A ``
Towards Explainable and Accessible AI EMNLP 2023 N/A ``
KEBAP: Korean Error Explainable Benchmark Dataset for ASR and Post-processing EMNLP 2023 N/A ``
INSTRUCTSCORE: Towards Explainable Text Generation Evaluation with Automatic Feedback EMNLP 2023 N/A ``
Goal-Driven Explainable Clustering via Language Descriptions EMNLP 2023 N/A ``
VECHR: A Dataset for Explainable and Robust Classification of Vulnerability Type in the European Court of Human Rights EMNLP 2023 N/A ``
COFFEE: Counterfactual Fairness for Personalized Text Generation in Explainable Recommendation EMNLP 2023 N/A ``
Hop, Union, Generate: Explainable Multi-hop Reasoning without Rationale Supervision EMNLP 2023 N/A ``
GenEx: A Commonsense-aware Unified Generative Framework for Explainable Cyberbullying Detection EMNLP 2023 N/A ``
DRGCoder: Explainable Clinical Coding for the Early Prediction of Diagnostic-Related Groups EMNLP 2023 N/A ``
LLM4Vis: Explainable Visualization Recommendation using ChatGPT EMNLP 2023 N/A ``
Harnessing LLMs for Temporal Data - A Study on Explainable Financial Time Series Forecasting EMNLP 2023 N/A ``
HARE: Explainable Hate Speech Detection with Step-by-Step Reasoning EMNLP 2023 N/A ``
Distilling ChatGPT for Explainable Automated Student Answer Assessment EMNLP 2023 N/A ``
Explainable Claim Verification via Knowledge-Grounded Reasoning with Large Language Models EMNLP 2023 N/A ``
Leveraging Structured Information for Explainable Multi-hop Question Answering and Reasoning EMNLP 2023 N/A ``
Deep Integrated Explanations CIKM 2023 N/A ``
KG4Ex: An Explainable Knowledge Graph-Based Approach for Exercise Recommendation CIKM 2023 N/A ``
Interpretable Fake News Detection with Graph Evidence CIKM 2023 N/A ``
PriSHAP: Prior-guided Shapley Value Explanations for Correlated Features CIKM 2023 N/A ``
A Model-Agnostic Method to Interpret Link Prediction Evaluation of Knowledge Graph Embeddings CIKM 2023 N/A ``
ACGAN-GNNExplainer: Auxiliary Conditional Generative Explainer for Graph Neural Networks CIKM 2023 N/A ``
Concept Evolution in Deep Learning Training: A Unified Interpretation Framework and Discoveries CIKM 2023 N/A ``
Explainable Spatio-Temporal Graph Neural Networks CIKM 2023 N/A ``
Towards Deeper, Lighter and Interpretable Cross Network for CTR Prediction CIKM 2023 N/A ``
Flexible and Robust Counterfactual Explanations with Minimal Satisfiable Perturbations CIKM 2023 N/A ``
NOVO: Learnable and Interpretable Document Identifiers for Model-Based IR CIKM 2023 N/A ``
Counterfactual Monotonic Knowledge Tracing for Assessing Students' Dynamic Mastery of Knowledge Concepts CIKM 2023 N/A ``
Contrastive Counterfactual Learning for Causality-aware Interpretable Recommender Systems CIKM 2023 N/A ``

2024

Title Venue Year Code Keywords Summary
Interpretable Long-Form Legal Question Answering with Retrieval-Augmented Large Language Models AAAI 2024 N/A ``
Evaluating Pre-trial Programs Using Interpretable Machine Learning Matching Algorithms for Causal Inference AAAI 2024 N/A ``
On the Importance of Application-Grounded Experimental Design for Evaluating Explainable ML Methods AAAI 2024 N/A ``
A Framework for Data-Driven Explainability in Mathematical Optimization AAAI 2024 N/A ``
Q-SENN: Quantized Self-Explaining Neural Networks AAAI 2024 N/A ``
LR-XFL: Logical Reasoning-Based Explainable Federated Learning AAAI 2024 N/A ``
Trade-Offs in Fine-Tuned Diffusion Models between Accuracy and Interpretability AAAI 2024 N/A ``
π-Light: Programmatic Interpretable Reinforcement Learning for Resource-Limited Traffic Signal Control AAAI 2024 N/A ``
Interpretability Benchmark for Evaluating Spatial Misalignment of Prototypical Parts Explanations AAAI 2024 N/A ``
Sparsity-Guided Holistic Explanation for LLMs with Interpretable Inference-Time Intervention AAAI 2024 N/A ``
LimeAttack: Local Explainable Method for Textual Hard-Label Adversarial Attack AAAI 2024 N/A ``
Learning Robust Rationales for Model Explainability: A Guidance-Based Approach AAAI 2024 N/A ``
Explaining Generalization Power of a DNN Using Interactive Concepts AAAI 2024 N/A ``
Federated Causality Learning with Explainable Adaptive Optimization AAAI 2024 N/A ``
Learning Performance Maximizing Ensembles with Explainability Guarantees AAAI 2024 N/A ``
Towards Modeling Uncertainties of Self-Explaining Neural Networks via Conformal Prediction AAAI 2024 N/A ``
Towards Learning and Explaining Indirect Causal Effects in Neural Networks AAAI 2024 N/A ``
GINN-LP: A Growing Interpretable Neural Network for Discovering Multivariate Laurent Polynomial Equations AAAI 2024 N/A ``
Pantypes: Diverse Representatives for Self-Explainable Models AAAI 2024 N/A ``
Factorized Explainer for Graph Neural Networks AAAI 2024 N/A ``
Self-Interpretable Graph Learning with Sufficient and Necessary Explanations AAAI 2024 N/A ``
Learning from Ambiguous Demonstrations with Self-Explanation Guided Reinforcement Learning AAAI 2024 N/A ``
A General Theoretical Framework for Learning Smallest Interpretable Models AAAI 2024 N/A ``
Knowledge-Aware Explainable Reciprocal Recommendation AAAI 2024 N/A ``
Fine-Tuning Large Language Model Based Explainable Recommendation with Explainable Quality Reward AAAI 2024 N/A ``
Finding Interpretable Class-Specific Patterns through Efficient Neural Search AAAI 2024 N/A ``
Enhance Sketch Recognition’s Explainability via Semantic Component-Level Parsing AAAI 2024 N/A ``
B-spine: Learning B-spline Curve Representation for Robust and Interpretable Spinal Curvature Estimation AAAI 2024 N/A ``
A Convolutional Neural Network Interpretable Framework for Human Ventral Visual Pathway Representation AAAI 2024 N/A ``
NeSyFOLD: A Framework for Interpretable Image Classification AAAI 2024 N/A ``
Knowledge-Aware Neuron Interpretation for Scene Classification AAAI 2024 N/A ``
MICA: Towards Explainable Skin Lesion Diagnosis via Multi-Level Image-Concept Alignment AAAI 2024 N/A ``
Interpretable3D: An Ad-Hoc Interpretable Classifier for 3D Point Clouds AAAI 2024 N/A ``
Learning Audio Concepts from Counterfactual Natural Language ICASSP 2024 N/A ``

awesome-explainable-ai's People

Contributors

gabrieleciravegna, jmd-ferreira, rushrukh, tgoprince
