This project forked from ciselab/lampion

Metamorphic Transformations for ML-SE Robustness Analysis

License: MIT License

Lampion

This project aims to help you with the explainability and robustness of your code-based ML models. It is based on the idea of metamorphic transformations, which alter the syntax of the code but keep its meaning. When applied to bytecode, this is often called obfuscation; examples are changing variable names or introducing dead code.
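To make the idea concrete, here is a minimal sketch of two such transformations (variable renaming and dead-code insertion) written against Python's standard `ast` module. It is not Lampion's implementation or API: the names `RenameVariable`, `add_dead_code`, `transform`, and `run` are hypothetical helpers made up for this illustration, and the sketch assumes Python 3.9+ (for `ast.unparse`) and functions without docstrings.

```python
import ast


class RenameVariable(ast.NodeTransformer):
    """Rename every occurrence of one identifier (hypothetical helper, not Lampion's API)."""

    def __init__(self, old_name, new_name):
        self.old_name = old_name
        self.new_name = new_name

    def visit_Name(self, node):
        # Covers both reads and writes of the variable.
        if node.id == self.old_name:
            node.id = self.new_name
        return node

    def visit_arg(self, node):
        # Covers function parameters with the same name.
        if node.arg == self.old_name:
            node.arg = self.new_name
        return node


def add_dead_code(tree):
    """Insert an unused assignment at the top of every function body."""
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            dead = ast.parse("unused_marker = 0").body[0]
            node.body.insert(0, dead)
    return tree


def transform(source):
    """Apply both transformations and return the new source text."""
    tree = ast.parse(source)
    tree = RenameVariable("total", "x1").visit(tree)
    tree = add_dead_code(tree)
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


def run(source, func_name, *args):
    """Execute source in a fresh namespace and call one of its functions."""
    namespace = {}
    exec(source, namespace)
    return namespace[func_name](*args)


ORIGINAL = """
def sum_list(values):
    total = 0
    for v in values:
        total += v
    return total
"""

TRANSFORMED = transform(ORIGINAL)

if __name__ == "__main__":
    print(TRANSFORMED)
    # The syntax changed, the behaviour did not.
    print(run(ORIGINAL, "sum_list", [1, 2, 3]) == run(TRANSFORMED, "sum_list", [1, 2, 3]))
```

A robust model should give the same (or a very similar) prediction for `ORIGINAL` and `TRANSFORMED`, since both compute the same result; a prediction that changes points to a dependence on surface syntax.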

The provided Java-Transformer can be configured to apply a number of metamorphic transformations to source code. We aim to be highly configurable and extendable for further research. A Transformer for Python is currently in early development.

Overview

Getting Started

Further information as well as instructions on the components can be found in their sub-folders. Any experiment-reproduction code is placed in a separate repository, to which the items in Experiments will guide you.

An overview of the structure, the reasoning behind it, and general information on design decisions can be found in the Design-Notes.

Examples and reasoning on the metamorphic transformations can be found in Transformations.md.

Related & Similar Work

The paper Embedding Java Classes with code2vec: Improvements from Variable Obfuscation and its accompanying repository investigate the impact of changing variable names on the performance and robustness of Code2Vec-based models. With variable renaming being a subset of this work, it can be seen as related work with a different goal. Another similar work comes from the code2vec authors with their paper Adversarial Examples for Models of Code. In that paper they use common adversarial generation to provoke certain (wrong) predictions by changing variable names or introducing new, unused variables. This is another sub-part of this project, but instead of aiming for explainability they go straight for exploits on the system.

A very closely related work is by Rabin et al. and is also available on GitHub. They have the same motivation and a nearly identical approach; we only differ in experiment and evaluation. Sometimes the world is a strange place: we both made it through review in parallel. Another related work is by Cito et al., where they use a language model to produce prediction-changing differences that are still human-readable. As far as I understood, the counterexamples do not necessarily have the same functionality and are just for explainability.

A more precise differentiation and further work can be found in our publication.

Citing

If you want to cite this paper or repository, please use the following:

    @inproceedings{applis2021,
    title = "Assessing Robustness of ML-Based Program Analysis Tools using Metamorphic Program Transformations",
    keywords = "Metamorphic Testing, Machine Learning, Documentation Generation, Code-To-Text, Deep learning",
    author = "L.H. Applis and A. Panichella and {van Deursen}, A.",
    year = "2021",
    month = sep,
    booktitle = "IEEE/ACM International Conference on Automated Software Engineering - NIER Track",
    publisher = "IEEE / ACM",
    }
