
awesome-explainable-ai's Introduction

Recent Publications in Explainable AI

A repository of recent explainable AI (XAI) and interpretable ML papers, organized by year

2015

Title Venue Year Code Keywords Summary
Intelligible Models for HealthCare: Predicting Pneumonia Risk and Hospital 30-day Readmission KDD 2015 N/A ``
Interpretable classifiers using rules and Bayesian analysis: Building a better stroke prediction model arXiv 2015 N/A ``

2016

Title Venue Year Code Keywords Summary
Interpretable Decision Sets: A Joint Framework for Description and Prediction KDD 2016 N/A ``
"Why Should I Trust You?": Explaining the Predictions of Any Classifier KDD 2016 N/A ``

2017

Title Venue Year Code Keywords Summary
Towards A Rigorous Science of Interpretable Machine Learning arXiv 2017 N/A Review Paper
Transparency: Motivations and Challenges arXiv 2017 N/A Review Paper
A Unified Approach to Interpreting Model Predictions NeurIPS 2017 N/A ``
SmoothGrad: removing noise by adding noise ICML (Workshop) 2017 Github ``
Axiomatic Attribution for Deep Networks ICML 2017 N/A ``
Learning Important Features Through Propagating Activation Differences ICML 2017 N/A ``
Understanding Black-box Predictions via Influence Functions ICML 2017 N/A ``
Network Dissection: Quantifying Interpretability of Deep Visual Representations CVPR 2017 N/A ``
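Several 2017 entries above (Axiomatic Attribution for Deep Networks, SmoothGrad) introduced path- and gradient-based feature attribution. A minimal sketch of Integrated Gradients on a logistic-regression model, where the gradient is available in closed form; the weights and inputs are made up for illustration:

```python
import numpy as np

# Sketch of Integrated Gradients (Sundararajan et al., 2017) on a
# logistic-regression model. All numbers below are illustrative.

def model(x, w, b):
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))  # sigmoid output

def grad(x, w, b):
    p = model(x, w, b)
    return p * (1.0 - p) * w  # d/dx of sigmoid(w.x + b)

def integrated_gradients(x, baseline, w, b, steps=100):
    # Average the gradient along the straight path from baseline to x
    # (midpoint rule), then scale by (x - baseline).
    alphas = (np.arange(steps) + 0.5) / steps
    avg_grad = np.mean(
        [grad(baseline + a * (x - baseline), w, b) for a in alphas], axis=0
    )
    return (x - baseline) * avg_grad

w = np.array([2.0, -1.0, 0.5])
b = 0.0
x = np.array([1.0, 1.0, 1.0])
baseline = np.zeros(3)

attr = integrated_gradients(x, baseline, w, b)
# Completeness axiom: attributions sum to f(x) - f(baseline).
print(attr, attr.sum(), model(x, w, b) - model(baseline, w, b))
```

The completeness check at the end is the axiom the paper's title refers to: the per-feature attributions account exactly for the change in model output.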

2018

Title Venue Year Code Keywords Summary
Explainable Prediction of Medical Codes from Clinical Text ACL 2018 N/A ``
Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV) ICML 2018 N/A ``
Counterfactual Explanations without Opening the Black Box: Automated Decisions and the GDPR HJTL 2018 N/A ``
Sanity Checks for Saliency Maps NeurIPS 2018 N/A ``
Deep Learning for Case-Based Reasoning through Prototypes: A Neural Network that Explains Its Predictions AAAI 2018 N/A ``
The Mythos of Model Interpretability arXiv 2018 N/A Review Paper
Deep Neural Networks Constrained by Decision Rules AAAI 2018 N/A ``

2019

Title Venue Year Code Keywords Summary
Human Evaluation of Models Built for Interpretability AAAI 2019 N/A Human in the loop
Data Shapley: Equitable Valuation of Data for Machine Learning ICML 2019 N/A ``
Attention is not Explanation ACL 2019 N/A ``
Actionable Recourse in Linear Classification FAccT 2019 N/A ``
Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead Nature Machine Intelligence 2019 N/A ``
Explanations can be manipulated and geometry is to blame NeurIPS 2019 N/A ``
Learning Optimized Risk Scores JMLR 2019 N/A ``
Explain Yourself! Leveraging Language Models for Commonsense Reasoning ACL 2019 N/A ``
Towards Automatic Concept-based Explanations NeurIPS 2019 Github ``
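Shapley-value attribution underlies both the 2017 SHAP paper and Data Shapley above. For a handful of features the values can be computed exactly by enumerating coalitions; the toy model and background point below are illustrative:

```python
import numpy as np
from itertools import combinations
from math import factorial

# Exact Shapley values for a tiny model by enumerating all feature
# coalitions. v(S) evaluates the model with features outside S replaced
# by a background value; model and background are illustrative.

def shapley_values(f, x, background):
    n = len(x)
    phi = np.zeros(n)

    def v(S):
        z = background.copy()
        z[list(S)] = x[list(S)]
        return f(z)

    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            for S in combinations(others, k):
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi[i] += weight * (v(S + (i,)) - v(S))
    return phi

f = lambda z: 3.0 * z[0] + 2.0 * z[1] * z[2]   # toy model with an interaction
x = np.array([1.0, 1.0, 1.0])
bg = np.zeros(3)
phi = shapley_values(f, x, bg)
# Efficiency axiom: contributions sum to f(x) - f(background).
print(phi, phi.sum(), f(x) - f(bg))
```

Note how the interaction term's credit is split evenly between features 1 and 2, which is exactly the symmetric treatment the Shapley axioms guarantee; the exponential cost of the enumeration is why SHAP and its successors (e.g. WeightedSHAP, 2022) rely on approximations.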

2020

Title Venue Year Code Keywords Summary
Interpreting the Latent Space of GANs for Semantic Face Editing CVPR 2020 N/A ``
GANSpace: Discovering Interpretable GAN Controls NeurIPS 2020 N/A ``
Explainability for fair machine learning arXiv 2020 N/A ``
An Introduction to Circuits Distill 2020 N/A Tutorial
Beyond Individualized Recourse: Interpretable and Interactive Summaries of Actionable Recourses NeurIPS 2020 N/A ``
Learning Model-Agnostic Counterfactual Explanations for Tabular Data WWW 2020 N/A ``
Fooling LIME and SHAP: Adversarial Attacks on Post hoc Explanation Methods AIES (AAAI) 2020 N/A ``
Interpreting Interpretability: Understanding Data Scientists’ Use of Interpretability Tools for Machine Learning CHI 2020 N/A Review Paper
Human Factors in Model Interpretability: Industry Practices, Challenges, and Needs arXiv 2020 N/A Review Paper
Human-Driven FOL Explanations of Deep Learning IJCAI 2020 N/A Logic Explanations
A Constraint-Based Approach to Learning and Explanation AAAI 2020 N/A Mutual Information
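Counterfactual explanation, the topic of the 2020 entry on model-agnostic counterfactuals for tabular data (and of the 2018 GDPR paper), asks for the smallest input change that flips a decision. For a linear classifier this has a closed form (a projection onto the decision boundary); the model and instance below are illustrative:

```python
import numpy as np

# Minimal-change counterfactual for a linear classifier w.x + b,
# in the spirit of Wachter et al. The minimal L2 change is a
# projection onto the decision boundary. Illustrative numbers.

def counterfactual(x, w, b, margin=1e-6):
    score = w @ x + b
    # Move along w just far enough to cross the boundary.
    delta = -(score + np.sign(score) * margin) / (w @ w) * w
    return x + delta

w = np.array([1.0, -2.0])
b = -0.5
x = np.array([2.0, 0.5])        # classified positive: score = 0.5
x_cf = counterfactual(x, w, b)
print(x_cf, w @ x_cf + b)       # score now just below zero
```

For non-linear black boxes no such closed form exists, which is where the gradient- and search-based methods in the entries above come in.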

2021

Title Venue Year Code Keywords Summary
A Learning Theoretic Perspective on Local Explainability ICLR 2021 N/A ``
Do Input Gradients Highlight Discriminative Features? NeurIPS 2021 N/A ``
Explaining by Removing: A Unified Framework for Model Explanation JMLR 2021 N/A ``
Explainable Active Learning (XAL): An Empirical Study of How Local Explanations Impact Annotator Experience PACMHCI 2021 N/A ``
Towards Robust and Reliable Algorithmic Recourse NeurIPS 2021 N/A ``
A Framework to Learn with Interpretation NeurIPS 2021 N/A ``
Algorithmic Recourse: from Counterfactual Explanations to Interventions FAccT 2021 N/A ``
Manipulating and Measuring Model Interpretability CHI 2021 N/A ``
Explainable Reinforcement Learning via Model Transforms NeurIPS 2021 N/A ``
Aligning Artificial Neural Networks and Ontologies towards Explainable AI AAAI 2021 N/A ``
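"Explaining by Removing" (2021, above) unifies many attribution methods around a single idea: remove a feature and measure the change in model output. An occlusion-style sketch of that idea on a toy linear model:

```python
import numpy as np

# Removal-based importance: the importance of feature i is the drop in
# model output when i is replaced by a background value. Toy model.

def removal_importance(f, x, background):
    base = f(x)
    imp = np.zeros_like(x)
    for i in range(len(x)):
        z = x.copy()
        z[i] = background[i]
        imp[i] = base - f(z)
    return imp

f = lambda z: 2.0 * z[0] + 0.5 * z[1] - z[2]
x = np.array([1.0, 2.0, 3.0])
bg = np.zeros(3)
print(removal_importance(f, x, bg))  # [ 2.  1. -3.]
```

The paper's framework varies exactly three choices hidden in this sketch: how features are removed (the background), what behavior is measured, and how per-subset effects are summarized.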

2022

Title Venue Year Code Keywords Summary
GlanceNets: Interpretable, Leak-proof Concept-based Models CRL 2022 N/A ``
Mechanistic Interpretability, Variables, and the Importance of Interpretable Bases Transformer Circuit Thread 2022 N/A Tutorial
Can language models learn from explanations in context? EMNLP 2022 N/A DeepMind
Interpreting Language Models with Contrastive Explanations EMNLP 2022 N/A ``
Acquisition of Chess Knowledge in AlphaZero PNAS 2022 N/A DeepMind GoogleBrain
What the DAAM: Interpreting Stable Diffusion Using Cross Attention arXiv 2022 Github ``
Exploring Counterfactual Explanations Through the Lens of Adversarial Examples: A Theoretical and Empirical Analysis AISTATS 2022 N/A ``
Use-Case-Grounded Simulations for Explanation Evaluation NeurIPS 2022 N/A ``
The Disagreement Problem in Explainable Machine Learning: A Practitioner's Perspective arXiv 2022 N/A ``
What Makes a Good Explanation?: A Harmonized View of Properties of Explanations arXiv 2022 N/A ``
NoiseGrad — Enhancing Explanations by Introducing Stochasticity to Model Weights AAAI 2022 Github ``
Fairness via Explanation Quality: Evaluating Disparities in the Quality of Post hoc Explanations AIES (AAAI) 2022 N/A ``
DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generative Models arXiv 2022 Github ``
Concept Embedding Models: Beyond the Accuracy-Explainability Trade-Off NeurIPS 2022 Github CBM, CEM
Self-explaining deep models with logic rule reasoning NeurIPS 2022 N/A ``
What You See is What You Classify: Black Box Attributions NeurIPS 2022 N/A ``
Concept Activation Regions: A Generalized Framework For Concept-Based Explanations NeurIPS 2022 N/A ``
What I Cannot Predict, I Do Not Understand: A Human-Centered Evaluation Framework for Explainability Methods NeurIPS 2022 N/A ``
Scalable Interpretability via Polynomials NeurIPS 2022 N/A ``
Learning to Scaffold: Optimizing Model Explanations for Teaching NeurIPS 2022 N/A ``
Listen to Interpret: Post-hoc Interpretability for Audio Networks with NMF NeurIPS 2022 N/A ``
WeightedSHAP: analyzing and improving Shapley based feature attribution NeurIPS 2022 N/A ``
Visual correspondence-based explanations improve AI robustness and human-AI team accuracy NeurIPS 2022 N/A ``
VICE: Variational Interpretable Concept Embeddings NeurIPS 2022 N/A ``
Robust Feature-Level Adversaries are Interpretability Tools NeurIPS 2022 N/A ``
ProtoX: Explaining a Reinforcement Learning Agent via Prototyping NeurIPS 2022 N/A ``
ProtoVAE: A Trustworthy Self-Explainable Prototypical Variational Model NeurIPS 2022 N/A ``
Where do Models go Wrong? Parameter-Space Saliency Maps for Explainability NeurIPS 2022 N/A ``
Neural Basis Models for Interpretability NeurIPS 2022 N/A ``
Implications of Model Indeterminacy for Explanations of Automated Decisions NeurIPS 2022 N/A ``
Explainability Via Causal Self-Talk NeurIPS 2022 N/A DeepMind
TalkToModel: Explaining Machine Learning Models with Interactive Natural Language Conversations NeurIPS 2022 N/A ``
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models NeurIPS 2022 N/A GoogleBrain
OpenXAI: Towards a Transparent Evaluation of Model Explanations NeurIPS 2022 N/A ``
Which Explanation Should I Choose? A Function Approximation Perspective to Characterizing Post Hoc Explanations NeurIPS 2022 N/A ``
Foundations of Symbolic Languages for Model Interpretability NeurIPS 2022 N/A ``
The Utility of Explainable AI in Ad Hoc Human-Machine Teaming NeurIPS 2022 N/A ``
Addressing Leakage in Concept Bottleneck Models NeurIPS 2022 N/A ``
Logical Reasoning with Span-Level Predictions for Interpretable and Robust NLI Models EMNLP 2022 N/A ``
Maieutic Prompting: Logically Consistent Reasoning with Recursive Explanations EMNLP 2022 N/A ``
MetaLogic: Logical Reasoning Explanations with Fine-Grained Structure EMNLP 2022 N/A ``
Towards Interactivity and Interpretability: A Rationale-based Legal Judgment Prediction Framework EMNLP 2022 N/A ``
Explainable Question Answering based on Semantic Graph by Global Differentiable Learning and Dynamic Adaptive Reasoning EMNLP 2022 N/A ``
Faithful Knowledge Graph Explanations in Commonsense Question Answering EMNLP 2022 N/A ``
Optimal Interpretable Clustering Using Oblique Decision Trees KDD 2022 N/A ``
ExMeshCNN: An Explainable Convolutional Neural Network Architecture for 3D Shape Analysis KDD 2022 N/A ``
Learning Differential Operators for Interpretable Time Series Modeling KDD 2022 N/A ``
Compute Like Humans: Interpretable Step-by-step Symbolic Computation with Deep Neural Network KDD 2022 N/A ``
Causal Attention for Interpretable and Generalizable Graph Classification KDD 2022 N/A ``
Group-wise Reinforcement Feature Generation for Optimal and Explainable Representation Space Reconstruction KDD 2022 N/A ``
Label-Free Explainability for Unsupervised Models ICML 2022 N/A ``
Rethinking Attention-Model Explainability through Faithfulness Violation Test ICML 2022 N/A ``
Hierarchical Shrinkage: Improving the Accuracy and Interpretability of Tree-Based Methods ICML 2022 N/A ``
A Functional Information Perspective on Model Interpretation ICML 2022 N/A ``
Inducing Causal Structure for Interpretable Neural Networks ICML 2022 N/A ``
ViT-NeT: Interpretable Vision Transformers with Neural Tree Decoder ICML 2022 N/A ``
Interpretable Neural Networks with Frank-Wolfe: Sparse Relevance Maps and Relevance Orderings ICML 2022 N/A ``
Interpretable and Generalizable Graph Learning via Stochastic Attention Mechanism ICML 2022 N/A ``
Unraveling Attention via Convex Duality: Analysis and Interpretations of Vision Transformers ICML 2022 N/A ``
Robust Models Are More Interpretable Because Attributions Look Normal ICML 2022 N/A ``
Latent Diffusion Energy-Based Model for Interpretable Text Modelling ICML 2022 N/A ``
Crowd, Expert & AI: A Human-AI Interactive Approach Towards Natural Language Explanation based COVID-19 Misinformation Detection IJCAI 2022 N/A ``
AttExplainer: Explain Transformer via Attention by Reinforcement Learning IJCAI 2022 N/A ``
Investigating and explaining the frequency bias in classification IJCAI 2022 N/A ``
Counterfactual Interpolation Augmentation (CIA): A Unified Approach to Enhance Fairness and Explainability of DNN IJCAI 2022 N/A ``
Axiomatic Foundations of Explainability IJCAI 2022 N/A ``
Explaining Soft-Goal Conflicts through Constraint Relaxations IJCAI 2022 N/A ``
Robust Interpretable Text Classification against Spurious Correlations Using AND-rules with Negation IJCAI 2022 N/A ``
Interpretable AMR-Based Question Decomposition for Multi-hop Question Answering IJCAI 2022 N/A ``
Toward Policy Explanations for Multi-Agent Reinforcement Learning IJCAI 2022 N/A ``
“My nose is running.” “Are you also coughing?”: Building A Medical Diagnosis Agent with Interpretable Inquiry Logics IJCAI 2022 N/A ``
Model Stealing Defense against Exploiting Information Leak Through the Interpretation of Deep Neural Nets IJCAI 2022 N/A ``
Learning by Interpreting IJCAI 2022 N/A ``
Using Constraint Programming and Graph Representation Learning for Generating Interpretable Cloud Security Policies IJCAI 2022 N/A ``
Explanations for Negative Query Answers under Inconsistency-Tolerant Semantics IJCAI 2022 N/A ``
On Preferred Abductive Explanations for Decision Trees and Random Forests IJCAI 2022 N/A ``
Adversarial Explanations for Knowledge Graph Embeddings IJCAI 2022 N/A ``
Looking Inside the Black-Box: Logic-based Explanations for Neural Networks KR 2022 N/A ``
Entropy-Based Logic Explanations of Neural Networks AAAI 2022 N/A ``
Explainable Neural Rule Learning WWW 2022 N/A ``
Explainable Deep Learning: A Field Guide for the Uninitiated JAIR 2022 N/A ``
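Concept bottleneck models recur throughout the 2022 entries (Concept Embedding Models, Addressing Leakage in Concept Bottleneck Models). The architecture predicts human-interpretable concepts first and the label from concepts only, which enables test-time interventions. A forward-pass sketch with made-up weights:

```python
import numpy as np

# Concept bottleneck model sketch: x -> concepts c -> label y.
# The label depends on x only through c, so an expert can correct
# a concept and see the prediction update. Weights are illustrative.

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cbm_forward(x, W_concept, W_label):
    c = sigmoid(x @ W_concept)   # concept predictions (the bottleneck)
    y = sigmoid(c @ W_label)     # label computed from concepts only
    return c, y

rng = np.random.default_rng(0)
W_concept = rng.normal(size=(4, 2))  # 4 inputs -> 2 concepts
W_label = rng.normal(size=(2,))      # 2 concepts -> 1 label
x = rng.normal(size=4)

c, y = cbm_forward(x, W_concept, W_label)
# Intervention: an expert sets concept 0 to 1, and the label updates.
c_fixed = c.copy()
c_fixed[0] = 1.0
y_intervened = sigmoid(c_fixed @ W_label)
print(c, y, y_intervened)
```

The "leakage" problem studied in the papers above arises when the concept activations carry extra, unintended information about the label beyond the concepts' semantics, undermining exactly this intervention story.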

2023

Title Venue Year Code Keywords Summary
On the Privacy Risks of Algorithmic Recourse AISTATS 2023 N/A ``
Towards Bridging the Gaps between the Right to Explanation and the Right to be Forgotten ICML 2023 N/A ``
Tracr: Compiled Transformers as a Laboratory for Interpretability NeurIPS 2023 Github DeepMind
Probabilistically Robust Recourse: Navigating the Trade-offs between Costs and Robustness in Algorithmic Recourse ICLR 2023 N/A ``
Concept-level Debugging of Part-Prototype Networks ICLR 2023 N/A ``
Towards Interpretable Deep Reinforcement Learning Models via Inverse Reinforcement Learning ICLR 2023 N/A ``
Re-calibrating Feature Attributions for Model Interpretation ICLR 2023 N/A ``
Post-hoc Concept Bottleneck Models ICLR 2023 N/A ``
Quantifying Memorization Across Neural Language Models ICLR 2023 N/A ``
STREET: A Multi-Task Structured Reasoning and Explanation Benchmark ICLR 2023 N/A ``
PIP-Net: Patch-Based Intuitive Prototypes for Interpretable Image Classification CVPR 2023 N/A ``
EVAL: Explainable Video Anomaly Localization CVPR 2023 N/A ``
Overlooked Factors in Concept-based Explanations: Dataset Choice, Concept Learnability, and Human Capability CVPR 2023 Github ``
Spatial-Temporal Concept Based Explanation of 3D ConvNets CVPR 2023 Github ``
Adversarial Counterfactual Visual Explanations CVPR 2023 N/A ``
Bridging the Gap Between Model Explanations in Partially Annotated Multi-Label Classification CVPR 2023 N/A ``
Explaining Image Classifiers With Multiscale Directional Image Representation CVPR 2023 N/A ``
CRAFT: Concept Recursive Activation FacTorization for Explainability CVPR 2023 N/A ``
SketchXAI: A First Look at Explainability for Human Sketches CVPR 2023 N/A ``
Don't Lie to Me! Robust and Efficient Explainability With Verified Perturbation Analysis CVPR 2023 N/A ``
Gradient-Based Uncertainty Attribution for Explainable Bayesian Deep Learning CVPR 2023 N/A ``
Learning Bottleneck Concepts in Image Classification CVPR 2023 N/A ``
Language in a Bottle: Language Model Guided Concept Bottlenecks for Interpretable Image Classification CVPR 2023 N/A ``
Interpretable Neural-Symbolic Concept Reasoning ICML 2023 Github ``
Identifying Interpretable Subspaces in Image Representations ICML 2023 N/A ``
Explainability as statistical inference ICML 2023 N/A ``
On the Impact of Knowledge Distillation for Model Interpretability ICML 2023 N/A ``
NA2Q: Neural Attention Additive Model for Interpretable Multi-Agent Q-Learning ICML 2023 N/A ``
Explaining Reinforcement Learning with Shapley Values ICML 2023 N/A ``
Explainable Data-Driven Optimization: From Context to Decision and Back Again ICML 2023 N/A ``
Causal Proxy Models for Concept-based Model Explanations ICML 2023 N/A ``
Learning Perturbations to Explain Time Series Predictions ICML 2023 N/A ``
Rethinking Explaining Graph Neural Networks via Non-parametric Subgraph Matching ICML 2023 N/A ``
Dividing and Conquering a BlackBox to a Mixture of Interpretable Models: Route, Interpret, Repeat ICML 2023 Github ``
Representer Point Selection for Explaining Regularized High-dimensional Models ICML 2023 N/A ``
Towards Explaining Distribution Shifts ICML 2023 N/A ``
Relevant Walk Search for Explaining Graph Neural Networks ICML 2023 Github ``
Concept-based Explanations for Out-of-Distribution Detectors ICML 2023 N/A ``
GLOBE-CE: A Translation Based Approach for Global Counterfactual Explanations ICML 2023 Github ``
Robust Explanation for Free or At the Cost of Faithfulness ICML 2023 N/A ``
Learn to Accumulate Evidence from All Training Samples: Theory and Practice ICML 2023 N/A ``
Towards Trustworthy Explanation: On Causal Rationalization ICML 2023 N/A ``
Theoretical Behavior of XAI Methods in the Presence of Suppressor Variables ICML 2023 N/A ``
Probabilistic Concept Bottleneck Models ICML 2023 N/A ``
What do CNNs Learn in the First Layer and Why? A Linear Systems Perspective ICML 2023 N/A ``
Towards credible visual model interpretation with path attribution ICML 2023 N/A ``
Trainability, Expressivity and Interpretability in Gated Neural ODEs ICML 2023 N/A ``
Discover and Cure: Concept-aware Mitigation of Spurious Correlation ICML 2023 N/A ``
PWSHAP: A Path-Wise Explanation Model for Targeted Variables ICML 2023 N/A ``
A Closer Look at the Intervention Procedure of Concept Bottleneck Models ICML 2023 N/A ``
Counterfactual Analysis in Dynamic Latent-State Models ICML 2023 N/A ``
Tackling Shortcut Learning in Deep Neural Networks: An Iterative Approach with Interpretable Models ICML Workshop 2023 N/A ``
Rethinking Interpretation: Input-Agnostic Saliency Mapping of Deep Visual Classifiers AAAI 2023 N/A ``
TopicFM: Robust and Interpretable Topic-Assisted Feature Matching AAAI 2023 N/A ``
Solving Explainability Queries with Quantification: The Case of Feature Relevancy AAAI 2023 N/A ``
PEN: Prediction-Explanation Network to Forecast Stock Price Movement with Better Explainability AAAI 2023 N/A ``
KerPrint: Local-Global Knowledge Graph Enhanced Diagnosis Prediction for Retrospective and Prospective Interpretations AAAI 2023 N/A ``
Beyond Graph Convolutional Network: An Interpretable Regularizer-Centered Optimization Framework AAAI 2023 N/A ``
Learning to Select Prototypical Parts for Interpretable Sequential Data Modeling AAAI 2023 N/A ``
Learning Interpretable Temporal Properties from Positive Examples Only AAAI 2023 N/A ``
Symbolic Metamodels for Interpreting Black-Boxes Using Primitive Functions AAAI 2023 N/A ``
Towards More Robust Interpretation via Local Gradient Alignment AAAI 2023 N/A ``
Towards Fine-Grained Explainability for Heterogeneous Graph Neural Network AAAI 2023 N/A ``
XClusters: Explainability-First Clustering AAAI 2023 N/A ``
Global Concept-Based Interpretability for Graph Neural Networks via Neuron Analysis AAAI 2023 N/A ``
Fairness and Explainability: Bridging the Gap towards Fair Model Explanations AAAI 2023 N/A ``
Explaining Model Confidence Using Counterfactuals AAAI 2023 N/A ``
SEAT: Stable and Explainable Attention AAAI 2023 N/A ``
Factual and Informative Review Generation for Explainable Recommendation AAAI 2023 N/A ``
Improving Interpretability via Explicit Word Interaction Graph Layer AAAI 2023 N/A ``
Unveiling the Black Box of PLMs with Semantic Anchors: Towards Interpretable Neural Semantic Parsing AAAI 2023 N/A ``
Improving Interpretability of Deep Sequential Knowledge Tracing Models with Question-centric Cognitive Representations AAAI 2023 N/A ``
Targeted Knowledge Infusion To Make Conversational AI Explainable and Safe AAAI 2023 N/A ``
eForecaster: Unifying Electricity Forecasting with Robust, Flexible, and Explainable Machine Learning Algorithms AAAI 2023 N/A ``
SolderNet: Towards Trustworthy Visual Inspection of Solder Joints in Electronics Manufacturing Using Explainable Artificial Intelligence AAAI 2023 N/A ``
Xaitk-Saliency: An Open Source Explainable AI Toolkit for Saliency AAAI 2023 N/A ``
Ripple: Concept-Based Interpretation for Raw Time Series Models in Education AAAI 2023 N/A ``
Semantics, Ontology and Explanation arXiv 2023 N/A Ontological Unpacking
Post Hoc Explanations of Language Models Can Improve Language Models arXiv 2023 N/A ``
Multi-Aspect Explainable Inductive Relation Prediction by Sentence Transformer AAAI 2023 N/A ``
Unfooling Perturbation-Based Post Hoc Explainers AAAI 2023 N/A ``
Very Fast, Approximate Counterfactual Explanations for Decision Forests AAAI 2023 N/A ``
Local Explanations for Reinforcement Learning AAAI 2023 N/A ``
ConceptX: A Framework for Latent Concept Analysis AAAI 2023 N/A ``
Explaining Random Forests Using Bipolar Argumentation and Markov Networks AAAI 2023 N/A ``
XRand: Differentially Private Defense against Explanation-Guided Attacks AAAI 2023 N/A ``
Unsupervised Explanation Generation via Correct Instantiations AAAI 2023 N/A ``
Disentangled CVAEs with Contrastive Learning for Explainable Recommendation AAAI 2023 N/A ``
Interpretable Chirality-Aware Graph Neural Network for Quantitative Structure Activity Relationship Modeling in Drug Discovery AAAI 2023 N/A ``
Monitoring Model Deterioration with Explainable Uncertainty Estimation via Non-parametric Bootstrap AAAI 2023 N/A ``
Interactive Concept Bottleneck Models AAAI 2023 N/A ``
Data-Efficient and Interpretable Tabular Anomaly Detection KDD 2023 N/A ``
Counterfactual Learning on Heterogeneous Graphs with Greedy Perturbation KDD 2023 N/A ``
Hands-on Tutorial: "Explanations in AI: Methods, Stakeholders and Pitfalls" KDD 2023 N/A ``
Feature-based Learning for Diverse and Privacy-Preserving Counterfactual Explanations KDD 2023 N/A ``
Generative AI meets Responsible AI: Practical Challenges and Opportunities KDD 2023 N/A ``
Empower Post-hoc Graph Explanations with Information Bottleneck: A Pre-training and Fine-tuning Perspective KDD 2023 N/A ``
MixupExplainer: Generalizing Explanations for Graph Neural Networks with Data Augmentation KDD 2023 N/A ``
CounterNet: End-to-End Training of Prediction Aware Counterfactual Explanations KDD 2023 N/A ``
Fire: An Optimization Approach for Fast Interpretable Rule Extraction KDD 2023 N/A ``
ESSA: Explanation Iterative Supervision via Saliency-guided Data Augmentation KDD 2023 N/A ``
A Causality Inspired Framework for Model Interpretation KDD 2023 N/A ``
Path-Specific Counterfactual Fairness for Recommender Systems KDD 2023 N/A ``
SURE: Robust, Explainable, and Fair Classification without Sensitive Attributes KDD 2023 N/A ``
Learning for Counterfactual Fairness from Observational Data KDD 2023 N/A ``
Interpretable Sparsification of Brain Graphs: Better Practices and Effective Designs for Graph Neural Networks KDD 2023 N/A ``
ExplainableFold: Understanding AlphaFold Prediction with Explainable AI KDD 2023 N/A ``
FLAMES2Graph: An Interpretable Federated Multivariate Time Series Classification Framework KDD 2023 N/A ``
Counterfactual Explanations and Model Multiplicity: a Relational Verification View KR 2023 N/A ``
Explainable Representations for Relation Prediction in Knowledge Graphs KR 2023 N/A ``
Region-based Saliency Explanations on the Recognition of Facial Genetic Syndromes PMLR 2023 N/A ``
FunnyBirds: A Synthetic Vision Dataset for a Part-Based Analysis of Explainable AI Methods arXiv 2023 N/A ``
Diffusion-based Visual Counterfactual Explanations - Towards Systematic Quantitative Evaluation arXiv 2023 N/A ``
Testing methods of neural systems understanding Cognitive Systems Research 2023 N/A ``
Understanding CNN Hidden Neuron Activations Using Structured Background Knowledge and Deductive Reasoning arXiv 2023 N/A ``
An Explainable Federated Learning and Blockchain based Secure Credit Modeling Method EJOR 2023 N/A ``
i-Align: an interpretable knowledge graph alignment model DMKD 2023 N/A ``
Goodhart’s Law Applies to NLP’s Explanation Benchmarks arXiv 2023 N/A ``
DeLELSTM: Decomposition-based Linear Explainable LSTM to Capture Instantaneous and Long-term Effects in Time Series arXiv 2023 N/A ``
Beyond Discriminative Regions: Saliency Maps as Alternatives to CAMs for Weakly Supervised Semantic Segmentation arXiv 2023 N/A ``
SEA: Shareable and Explainable Attribution for Query-based Black-box Attacks arXiv 2023 N/A ``
Sparse Linear Concept Discovery Models arXiv 2023 N/A ``
Revisiting the Performance-Explainability Trade-Off in Explainable Artificial Intelligence (XAI) arXiv 2023 N/A ``
KGTN: Knowledge Graph Transformer Network for explainable multi-category item recommendation KBS 2023 N/A ``
SAFE: Saliency-Aware Counterfactual Explanations for DNN-based Automated Driving Systems arXiv 2023 N/A ``
Explainable Multi-Agent Reinforcement Learning for Temporal Queries IJCAI 2023 N/A ``
Advancing Post-Hoc Case-Based Explanation with Feature Highlighting IJCAI 2023 N/A ``
Explanation-Guided Reward Alignment IJCAI 2023 N/A ``
FEAMOE: Fair, Explainable and Adaptive Mixture of Experts IJCAI 2023 N/A ``
Statistically Significant Concept-based Explanation of Image Classifiers via Model Knockoffs IJCAI 2023 N/A ``
Learning Prototype Classifiers for Long-Tailed Recognition IJCAI 2023 N/A ``
On Translations between ML Models for XAI Purposes IJCAI 2023 N/A ``
The Parameterized Complexity of Finding Concise Local Explanations IJCAI 2023 N/A ``
Neuro-Symbolic Class Expression Learning IJCAI 2023 N/A ``
A Logic-based Approach to Contrastive Explainability for Neurosymbolic Visual Question Answering IJCAI 2023 N/A ``
Cardinality-Minimal Explanations for Monotonic Neural Networks IJCAI 2023 N/A ``
Unveiling Concepts Learned by a World-Class Chess-Playing Agent IJCAI 2023 N/A ``
Explainable Text Classification via Attentive and Targeted Mixing Data Augmentation IJCAI 2023 N/A ``
On the Complexity of Counterfactual Reasoning IJCAI 2023 N/A ``
Interpretable Local Concept-based Explanation with Human Feedback to Predict All-cause Mortality (Extended Abstract) IJCAI 2023 N/A ``
Good-looking but Lacking Faithfulness: Understanding Local Explanation Methods through Trend-based Testing arXiv 2023 N/A ``
Counterfactual Explanations via Locally-guided Sequential Algorithmic Recourse arXiv 2023 N/A ``
Flexible and Robust Counterfactual Explanations with Minimal Satisfiable Perturbations CIKM 2023 N/A ``
A Function Interpretation Benchmark for Evaluating Interpretability Methods arXiv 2023 N/A ``
Explaining through Transformer Input Sampling arXiv 2023 N/A ``
Backtracking Counterfactuals CLeaR 2023 N/A ``
Text2Concept: Concept Activation Vectors Directly from Text CVPR Workshop 2023 N/A ``
A Holistic Approach to Unifying Automatic Concept Extraction and Concept Importance Estimation arXiv 2023 N/A ``
Evaluating the Robustness of Interpretability Methods through Explanation Invariance and Equivariance NeurIPS 2023 Github ``
CLIP-Dissect: Automatic Description of Neuron Representations in Deep Vision Networks ICLR 2023 Github ``
Label-free Concept Bottleneck Models ICLR 2023 N/A ``
Towards Interpretable Deep Reinforcement Learning with Human-Friendly Prototypes ICLR 2023 N/A ``
Information Maximization Perspective of Orthogonal Matching Pursuit with Applications to Explainable AI NeurIPS 2023 N/A ``
Explaining Predictive Uncertainty with Information Theoretic Shapley Values NeurIPS 2023 N/A ``
REASONER: An Explainable Recommendation Dataset with Comprehensive Labeling Ground Truths NeurIPS 2023 N/A ``
Explain Any Concept: Segment Anything Meets Concept-Based Explanation NeurIPS 2023 N/A ``
VeriX: Towards Verified Explainability of Deep Neural Networks NeurIPS 2023 N/A ``
Explainable and Efficient Randomized Voting Rules NeurIPS 2023 N/A ``
TempME: Towards the Explainability of Temporal Graph Neural Networks via Motif Discovery NeurIPS 2023 N/A ``
Explaining the Uncertain: Stochastic Shapley Values for Gaussian Process Models NeurIPS 2023 N/A ``
V-InFoR: A Robust Graph Neural Networks Explainer for Structurally Corrupted Graphs NeurIPS 2023 N/A ``
Explainable Brain Age Prediction using coVariance Neural Networks NeurIPS 2023 N/A ``
D4Explainer: In-distribution Explanations of Graph Neural Network via Discrete Denoising Diffusion NeurIPS 2023 N/A ``
StateMask: Explaining Deep Reinforcement Learning through State Mask NeurIPS 2023 N/A ``
LICO: Explainable Models with Language-Image COnsistency NeurIPS 2023 N/A ``
On the explainable properties of 1-Lipschitz Neural Networks: An Optimal Transport Perspective NeurIPS 2023 N/A ``
Interpretable and Explainable Logical Policies via Neurally Guided Symbolic Abstraction NeurIPS 2023 N/A ``
Discriminative Feature Attributions: Bridging Post Hoc Explainability and Inherent Interpretability NeurIPS 2023 N/A ``
Train Once and Explain Everywhere: Pre-training Interpretable Graph Neural Networks NeurIPS 2023 N/A ``
Accountability in Offline Reinforcement Learning: Explaining Decisions with a Corpus of Examples NeurIPS 2023 N/A ``
HiBug: On Human-Interpretable Model Debug NeurIPS 2023 N/A ``
Towards Self-Interpretable Graph-Level Anomaly Detection NeurIPS 2023 N/A ``
Interpretable Graph Networks Formulate Universal Algebra Conjectures NeurIPS 2023 N/A ``
Towards Automated Circuit Discovery for Mechanistic Interpretability NeurIPS 2023 N/A ``
Interpretable Reward Redistribution in Reinforcement Learning: A Causal Approach NeurIPS 2023 N/A ``
DISCOVER: Making Vision Networks Interpretable via Competition and Dissection NeurIPS 2023 N/A ``
MultiMoDN—Multimodal, Multi-Task, Interpretable Modular Networks NeurIPS 2023 N/A ``
Causal Interpretation of Self-Attention in Pre-Trained Transformers NeurIPS 2023 N/A ``
Learning Interpretable Low-dimensional Representation via Physical Symmetry NeurIPS 2023 N/A ``
Scale Alone Does not Improve Mechanistic Interpretability in Vision Models NeurIPS 2023 N/A ``
Transitivity Recovering Decompositions: Interpretable and Robust Fine-Grained Relationships NeurIPS 2023 N/A ``
GRAND-SLAMIN’ Interpretable Additive Modeling with Structural Constraints NeurIPS 2023 N/A ``
Interpreting Unsupervised Anomaly Detection in Security via Rule Extraction NeurIPS 2023 N/A ``
GPEX, A Framework For Interpreting Artificial Neural Networks NeurIPS 2023 N/A ``
Dynamic Context Pruning for Efficient and Interpretable Autoregressive Transformers NeurIPS 2023 N/A ``
ParaFuzz: An Interpretability-Driven Technique for Detecting Poisoned Samples in NLP NeurIPS 2023 N/A ``
On the Identifiability and Interpretability of Gaussian Process Models NeurIPS 2023 N/A ``
BasisFormer: Attention-based Time Series Forecasting with Learnable and Interpretable Basis NeurIPS 2023 N/A ``
Evaluating the Robustness of Interpretability Methods through Explanation Invariance and Equivariance NeurIPS 2023 N/A ``
Evaluating Neuron Interpretation Methods of NLP Models NeurIPS 2023 N/A ``
FIND: A Function Description Benchmark for Evaluating Interpretability Methods NeurIPS 2023 N/A ``
How does GPT-2 compute greater-than?: Interpreting mathematical abilities in a pre-trained language model NeurIPS 2023 N/A ``
Interpretable Prototype-based Graph Information Bottleneck NeurIPS 2023 N/A ``
Interpretability at Scale: Identifying Causal Mechanisms in Alpaca NeurIPS 2023 N/A ``
M4: A Unified XAI Benchmark for Faithfulness Evaluation of Feature Attribution Methods across Metrics, Modalities and Models NeurIPS 2023 N/A ``
InstructSafety: A Unified Framework for Building Multidimensional and Explainable Safety Detector through Instruction Tuning EMNLP 2023 N/A ``
Towards Explainable and Accessible AI EMNLP 2023 N/A ``
KEBAP: Korean Error Explainable Benchmark Dataset for ASR and Post-processing EMNLP 2023 N/A ``
INSTRUCTSCORE: Towards Explainable Text Generation Evaluation with Automatic Feedback EMNLP 2023 N/A ``
Goal-Driven Explainable Clustering via Language Descriptions EMNLP 2023 N/A ``
VECHR: A Dataset for Explainable and Robust Classification of Vulnerability Type in the European Court of Human Rights EMNLP 2023 N/A ``
COFFEE: Counterfactual Fairness for Personalized Text Generation in Explainable Recommendation EMNLP 2023 N/A ``
Hop, Union, Generate: Explainable Multi-hop Reasoning without Rationale Supervision EMNLP 2023 N/A ``
GenEx: A Commonsense-aware Unified Generative Framework for Explainable Cyberbullying Detection EMNLP 2023 N/A ``
DRGCoder: Explainable Clinical Coding for the Early Prediction of Diagnostic-Related Groups EMNLP 2023 N/A ``
LLM4Vis: Explainable Visualization Recommendation using ChatGPT EMNLP 2023 N/A ``
Harnessing LLMs for Temporal Data - A Study on Explainable Financial Time Series Forecasting EMNLP 2023 N/A ``
HARE: Explainable Hate Speech Detection with Step-by-Step Reasoning EMNLP 2023 N/A ``
Distilling ChatGPT for Explainable Automated Student Answer Assessment EMNLP 2023 N/A ``
Explainable Claim Verification via Knowledge-Grounded Reasoning with Large Language Models EMNLP 2023 N/A ``
Leveraging Structured Information for Explainable Multi-hop Question Answering and Reasoning EMNLP 2023 N/A ``
Deep Integrated Explanations CIKM 2023 N/A ``
KG4Ex: An Explainable Knowledge Graph-Based Approach for Exercise Recommendation CIKM 2023 N/A ``
Interpretable Fake News Detection with Graph Evidence CIKM 2023 N/A ``
PriSHAP: Prior-guided Shapley Value Explanations for Correlated Features CIKM 2023 N/A ``
A Model-Agnostic Method to Interpret Link Prediction Evaluation of Knowledge Graph Embeddings CIKM 2023 N/A ``
ACGAN-GNNExplainer: Auxiliary Conditional Generative Explainer for Graph Neural Networks CIKM 2023 N/A ``
Concept Evolution in Deep Learning Training: A Unified Interpretation Framework and Discoveries CIKM 2023 N/A ``
Explainable Spatio-Temporal Graph Neural Networks CIKM 2023 N/A ``
Towards Deeper, Lighter and Interpretable Cross Network for CTR Prediction CIKM 2023 N/A ``
Flexible and Robust Counterfactual Explanations with Minimal Satisfiable Perturbations CIKM 2023 N/A ``
NOVO: Learnable and Interpretable Document Identifiers for Model-Based IR CIKM 2023 N/A ``
Counterfactual Monotonic Knowledge Tracing for Assessing Students' Dynamic Mastery of Knowledge Concepts CIKM 2023 N/A ``
Contrastive Counterfactual Learning for Causality-aware Interpretable Recommender Systems CIKM 2023 N/A ``

2024

Title Venue Year Code Keywords Summary
Interpretable Long-Form Legal Question Answering with Retrieval-Augmented Large Language Models AAAI 2024 N/A ``
Evaluating Pre-trial Programs Using Interpretable Machine Learning Matching Algorithms for Causal Inference AAAI 2024 N/A ``
On the Importance of Application-Grounded Experimental Design for Evaluating Explainable ML Methods AAAI 2024 N/A ``
A Framework for Data-Driven Explainability in Mathematical Optimization AAAI 2024 N/A ``
Q-SENN: Quantized Self-Explaining Neural Networks AAAI 2024 N/A ``
LR-XFL: Logical Reasoning-Based Explainable Federated Learning AAAI 2024 N/A ``
Trade-Offs in Fine-Tuned Diffusion Models between Accuracy and Interpretability AAAI 2024 N/A ``
π-Light: Programmatic Interpretable Reinforcement Learning for Resource-Limited Traffic Signal Control AAAI 2024 N/A ``
Interpretability Benchmark for Evaluating Spatial Misalignment of Prototypical Parts Explanations AAAI 2024 N/A ``
Sparsity-Guided Holistic Explanation for LLMs with Interpretable Inference-Time Intervention AAAI 2024 N/A ``
LimeAttack: Local Explainable Method for Textual Hard-Label Adversarial Attack AAAI 2024 N/A ``
Learning Robust Rationales for Model Explainability: A Guidance-Based Approach AAAI 2024 N/A ``
Explaining Generalization Power of a DNN Using Interactive Concepts AAAI 2024 N/A ``
Federated Causality Learning with Explainable Adaptive Optimization AAAI 2024 N/A ``
Learning Performance Maximizing Ensembles with Explainability Guarantees AAAI 2024 N/A ``
Towards Modeling Uncertainties of Self-Explaining Neural Networks via Conformal Prediction AAAI 2024 N/A ``
Towards Learning and Explaining Indirect Causal Effects in Neural Networks AAAI 2024 N/A ``
GINN-LP: A Growing Interpretable Neural Network for Discovering Multivariate Laurent Polynomial Equations AAAI 2024 N/A ``
Pantypes: Diverse Representatives for Self-Explainable Models AAAI 2024 N/A ``
Factorized Explainer for Graph Neural Networks AAAI 2024 N/A ``
Self-Interpretable Graph Learning with Sufficient and Necessary Explanations AAAI 2024 N/A ``
Learning from Ambiguous Demonstrations with Self-Explanation Guided Reinforcement Learning AAAI 2024 N/A ``
A General Theoretical Framework for Learning Smallest Interpretable Models AAAI 2024 N/A ``
Knowledge-Aware Explainable Reciprocal Recommendation AAAI 2024 N/A ``
Fine-Tuning Large Language Model Based Explainable Recommendation with Explainable Quality Reward AAAI 2024 N/A ``
Finding Interpretable Class-Specific Patterns through Efficient Neural Search AAAI 2024 N/A ``
Enhance Sketch Recognition’s Explainability via Semantic Component-Level Parsing AAAI 2024 N/A ``
B-spine: Learning B-spline Curve Representation for Robust and Interpretable Spinal Curvature Estimation AAAI 2024 N/A ``
A Convolutional Neural Network Interpretable Framework for Human Ventral Visual Pathway Representation AAAI 2024 N/A ``
NeSyFOLD: A Framework for Interpretable Image Classification AAAI 2024 N/A ``
Knowledge-Aware Neuron Interpretation for Scene Classification AAAI 2024 N/A ``
MICA: Towards Explainable Skin Lesion Diagnosis via Multi-Level Image-Concept Alignment AAAI 2024 N/A ``
Interpretable3D: An Ad-Hoc Interpretable Classifier for 3D Point Clouds AAAI 2024 N/A ``
Learning Audio Concepts from Counterfactual Natural Language ICASSP 2024 N/A ``

awesome-explainable-ai's People

Contributors

gabrieleciravegna, jmd-ferreira, rushrukh, tgoprince
