viadee / javaanchoradapters

Getting the Anchors Explainer to work in Different Settings

License: BSD 3-Clause "New" or "Revised" License

javaanchoradapters's Introduction

JavaAnchorAdapters

Adapter [/əˈdaptə/] noun, a device for connecting pieces of equipment that cannot be connected directly.

This is a collection of tools that make the Java implementation of the Anchors algorithm easier to use. The algorithm (as introduced by Marco Tulio Ribeiro, 2018) is model-agnostic, but the nature of the dataset needs to be considered.

This repository covers methodological aspects, i.e. default approaches for applying the algorithm in typical use cases with tabular data (such as bpmn.ai), images or texts, as well as technical aspects, such as running Anchors explanations on Apache Spark.

This project is to be considered research-in-progress.

JavaAnchorAlgorithm Repository

For more information on Anchors and this implementation, see the main repository.

Exemplary Use / Tutorial

Examples of using the Anchors implementation and its various adapters are provided within the XAI Examples repository. Please refer to this project for tutorials and easy-to-run applications.

Collaboration

The project is operated and further developed by the viadee Consulting AG in Münster, Westphalia. Results from theses at the WWU Münster and the FH Münster have been incorporated.

  • Further theses are planned: the contact person is Dr. Frank Köhne from viadee. Community contributions to the project are welcome: please open GitHub issues with suggestions (or a PR), which the team can then work on. For general discussions, please refer to the main repository.
  • We are looking for further partners who have interesting process data to refine our tooling, as well as partners who are simply interested in a discussion about AI in the context of business process automation and explainability.

javaanchoradapters's People

Contributors

alexchrbn, dependabot[bot], fkoehne, marvingronhorst, thllwg, tobiasgoerke, woldinius

javaanchoradapters's Issues

Anchors rules with transformed value and not discretized value

When running Anchors, the rules it provides always contain the discretized value instead of the transformed value. For understandability, we should use the transformed value instead. Example:
Instance:
Sex='male'
Survived='TRUE'
Rule:
IF Sex='male'
THEN PREDICT 1

Here, TRUE is discretized as 1, but the rule should state TRUE instead of 1.
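
A minimal sketch of the requested behaviour, assuming the mapping from discretized values back to the original labels is available as a plain map. The class `RuleLabelSketch` and the method `renderRule` are hypothetical names for illustration and are not part of the JavaAnchorAdapters API.

```java
import java.util.Map;

// Hypothetical helper for illustration; not part of the JavaAnchorAdapters API.
final class RuleLabelSketch {

    // Renders a single-condition rule and maps the discretized prediction
    // back to its original label. Falls back to the number if no label is known.
    static String renderRule(String feature, String value, int prediction,
                             Map<Integer, String> targetLabels) {
        String label = targetLabels.getOrDefault(prediction, String.valueOf(prediction));
        return "IF " + feature + "='" + value + "' THEN PREDICT " + label;
    }

    public static void main(String[] args) {
        // Assumed label mapping for the Survived column of the example.
        Map<Integer, String> survived = Map.of(0, "FALSE", 1, "TRUE");
        System.out.println(renderRule("Sex", "male", 1, survived));
    }
}
```

With the mapping from the example, the rule ends in `THEN PREDICT TRUE` rather than `THEN PREDICT 1`.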

Titanic tutorial?

The README would really benefit from a well-known example that includes at least one reasonable anchor (and the steps required to re-create it).

Implement unsupervised & non-parameterized discretizer

It would be nice to have an unsupervised discretizer without any parameters or user input, because this would be easier for the average user to understand. If such a discretizer already exists, it should be called; otherwise it should be implemented.
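
One possible shape for such a discretizer is quartile binning, where the cut points are derived from the data alone. This is a sketch under that assumption and not the project's implementation; the class name `QuartileDiscretizerSketch` is hypothetical.

```java
import java.util.Arrays;

// Hypothetical parameter-free discretizer: bins are the four quartile ranges.
final class QuartileDiscretizerSketch {

    private final double[] cuts; // Q1, median, Q3

    QuartileDiscretizerSketch(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("at least one value is required");
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        cuts = new double[] {
            quantile(sorted, 0.25), quantile(sorted, 0.50), quantile(sorted, 0.75)
        };
    }

    // Quantile by linear interpolation between the two closest ranks.
    private static double quantile(double[] sorted, double q) {
        double pos = q * (sorted.length - 1);
        int lower = (int) Math.floor(pos);
        int upper = (int) Math.ceil(pos);
        return sorted[lower] + (pos - lower) * (sorted[upper] - sorted[lower]);
    }

    // Returns the bin index 0..3; a value equal to a cut point falls into the lower bin.
    int apply(double value) {
        int bin = 0;
        while (bin < cuts.length && value > cuts[bin]) {
            bin++;
        }
        return bin;
    }
}
```

The user only passes the column itself, for example `new QuartileDiscretizerSketch(ages).apply(42.0)`, so there is nothing to tune.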

Enable Supervised Discretization

Supervised Discretization might enable better classifications. In the current implementation the discretizers only have the continuous variable as a parameter. Supervised discretization additionally needs the target column.
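
To illustrate why the target column is needed, the following sketch picks a single cut point that minimizes the number of misclassified rows when each side of the cut predicts its majority label. It assumes a binary target encoded as 0 or 1; `SupervisedSplitSketch` and `bestThreshold` are hypothetical names, and real supervised discretizers typically use more elaborate criteria and multiple cuts.

```java
import java.util.Arrays;

// Hypothetical supervised discretizer that finds one cut point using the target column.
final class SupervisedSplitSketch {

    // values: the continuous feature; labels: the binary target (0 or 1), same length.
    // Returns NaN if all values are equal and no cut is possible.
    static double bestThreshold(double[] values, int[] labels) {
        Integer[] order = new Integer[values.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Sort row indices by feature value.
        Arrays.sort(order, (a, b) -> Double.compare(values[a], values[b]));

        int totalOnes = Arrays.stream(labels).sum();
        int leftOnes = 0;
        int bestErrors = Integer.MAX_VALUE;
        double best = Double.NaN;

        for (int k = 1; k < order.length; k++) {
            leftOnes += labels[order[k - 1]];
            double lo = values[order[k - 1]];
            double hi = values[order[k]];
            if (lo == hi) {
                continue; // no cut between identical values
            }
            int leftSize = k;
            int rightSize = order.length - k;
            int rightOnes = totalOnes - leftOnes;
            // Errors if each side predicts its majority label.
            int errors = Math.min(leftOnes, leftSize - leftOnes)
                       + Math.min(rightOnes, rightSize - rightOnes);
            if (errors < bestErrors) {
                bestErrors = errors;
                best = (lo + hi) / 2.0;
            }
        }
        return best;
    }
}
```

The second argument is the point of the sketch: a discretizer interface that only receives the continuous variable cannot compute this.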

Invalid POM

de.viadee.xai.anchor:DefaultConfigsAdapter has issues in its POM that eliminate transitive dependencies.

Coverage is skewed towards local instances

Coverage is calculated based on the perturbation function.
The tabular perturbation function changes the non-fixed features randomly with a predefined probability. As soon as this probability is < 1, non-representative instances are generated, because the explained instance's values appear statistically more often.
This results in a coverage that is not representative of the whole dataset.
If, however, we set the probability to 1 when generating coverage perturbations, we might violate Anchors' theoretical description but make the rules more intuitive.
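
The size of the skew can be estimated with a simplified model. Assume each non-fixed feature keeps the instance's value with probability 1 - p and is otherwise redrawn from the dataset, where that value occurs with frequency f. The expected share of the instance's value among perturbations is then (1 - p) + p * f. The class and method names below are hypothetical.

```java
// Hypothetical calculation under the simplified model described above.
final class CoverageSkewSketch {

    // changeProbability: probability p that a non-fixed feature is redrawn.
    // datasetFrequency: frequency f of the instance's value in the dataset.
    static double expectedShare(double changeProbability, double datasetFrequency) {
        return (1.0 - changeProbability) + changeProbability * datasetFrequency;
    }

    public static void main(String[] args) {
        double f = 0.2; // assumed dataset frequency of the instance's value
        for (double p : new double[] {0.25, 0.5, 1.0}) {
            System.out.printf("p=%.2f -> share=%.2f%n", p, expectedShare(p, f));
        }
    }
}
```

Only p = 1 reproduces the dataset frequency; every smaller p inflates the share of the explained instance's own value.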

Tabular perturbation is inaccurate when using discretization

The default tabular perturbation function currently takes a random instance and replaces the non-fixed feature values of the perturbed instance with those of the other instance.
The fixed values remain unchanged.
This is inaccurate when using discretization: even the fixed values should change randomly within their discretized class.
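
One way to vary a fixed value within its class is to draw a replacement from the observed values that fall into the same bin. The sketch below assumes bins are half-open intervals [lower, upper); `WithinBinPerturbationSketch` is a hypothetical name and does not reflect the existing perturbation function.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical helper: resamples a fixed feature from values in the same discretized class.
final class WithinBinPerturbationSketch {

    // column: all observed values of the feature; the bin is [lower, upper).
    static double resampleWithinBin(double[] column, double lower, double upper, Random random) {
        double[] candidates = Arrays.stream(column)
                .filter(v -> v >= lower && v < upper)
                .toArray();
        if (candidates.length == 0) {
            throw new IllegalArgumentException("no observed value falls into the bin");
        }
        return candidates[random.nextInt(candidates.length)];
    }

    public static void main(String[] args) {
        double[] ages = {1, 5, 12, 18, 25, 40}; // made-up example column
        // The fixed feature stays in the class [10, 20) but may take any observed value in it.
        System.out.println(resampleWithinBin(ages, 10, 20, new Random()));
    }
}
```

The anchor's condition still holds for the perturbed instance, because the value never leaves its bin.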

Overview of tabular preprocessing

As more operations such as transformations and discretizations are automated, it gets harder for the user to comprehend how the dataset is actually being preprocessed.
There should be some kind of overview (e.g. R summary style) after the preprocessing steps to facilitate data understanding.
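
A rough idea of what a one-line summary per numeric column could look like, using only the Java standard library. The class `ColumnSummarySketch` and the output format are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.Locale;

// Hypothetical summary printer for a numeric column after preprocessing.
final class ColumnSummarySketch {

    static String summarize(String name, double[] values) {
        DoubleSummaryStatistics stats = Arrays.stream(values).summaryStatistics();
        // Locale.ROOT keeps the decimal separator stable across systems.
        return String.format(Locale.ROOT, "%s: n=%d min=%.2f mean=%.2f max=%.2f",
                name, stats.getCount(), stats.getMin(), stats.getAverage(), stats.getMax());
    }

    public static void main(String[] args) {
        System.out.println(summarize("Age", new double[] {20, 30, 40}));
    }
}
```

Printing such a line for each column before and after discretization would show the user what each preprocessing step changed.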

Hyperparameter spaces not immutable

The best hyperparameter space found by random search always points to the same parameter configuration rather than to the configuration of the best hyperparameter space object. The list of parameters needs to be cloned.
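
The aliasing problem and the suggested fix can be sketched as follows. `HyperparameterSpaceSketch` is a hypothetical stand-in for the real class, and parameters are simplified to integers. If the real parameter objects are themselves mutable, each element would need to be copied as well.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a hyperparameter space holding a parameter list.
final class HyperparameterSpaceSketch {

    private final List<Integer> parameters;

    HyperparameterSpaceSketch(List<Integer> parameters) {
        // Defensive copy: later changes to the caller's list no longer leak in.
        this.parameters = Collections.unmodifiableList(new ArrayList<>(parameters));
    }

    List<Integer> getParameters() {
        return parameters;
    }

    public static void main(String[] args) {
        List<Integer> working = new ArrayList<>(List.of(1, 2));
        HyperparameterSpaceSketch best = new HyperparameterSpaceSketch(working);
        working.set(0, 99); // random search moves on to the next configuration
        System.out.println(best.getParameters()); // still the configuration that was stored
    }
}
```

Without the copy in the constructor, `best` would share the list with the search loop and would always reflect the most recent configuration.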
