ciselab / lampion

Metamorphic Transformations for ML-SE Robustness Analysis

License: MIT License

Java 57.89% Dockerfile 0.73% Shell 0.80% Python 28.27% Jupyter Notebook 10.86% R 0.67% Jinja 0.78%

lampion's People

Contributors

apanichella, dependabot[bot], lapplislazuli, wubero

Forkers

wubero, nashid

lampion's Issues

Java Transformer: Increase Manifest Writing Speed

With the bigger experiments, writing the manifest seems to be the bottleneck of the preprocessing pipeline, taking a few minutes for 100k transformations.
As no performance tuning has been done so far, this would be a nice improvement when there is some time left.

Proposed Solution

Use a batch-writing process at the places marked in the SQLite Manifest Writer.
Look for hints in the SQLite library / JDBC documentation on how to make it faster.

Possible Alternatives:

Add a flag to skip writing manifests when they are not required, or write the transformations in a reduced form.

Possible Problems:
There should be no problems, other than the change possibly not working.

Additional Context:

The JDBC Page on Batch processing

Check if this is applicable to SQLite.
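
A minimal sketch of the proposed batch writing, using only the standard JDBC API (`setAutoCommit(false)`, `PreparedStatement.addBatch`, `executeBatch`, `commit`). The class name, the `Entry` record, and the table and column names are assumptions for illustration and not the actual manifest schema. How much the SQLite JDBC driver gains from `addBatch` itself would need to be measured; grouping all inserts into one transaction is usually the larger win, because in auto-commit mode every insert is its own transaction.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

/** Hypothetical batch writer; table and column names are assumptions, not Lampion's schema. */
final class BatchManifestWriter {

    /** Hypothetical manifest row: which transformation touched which file. */
    record Entry(String transformationName, String filePath) {}

    private static final String INSERT_SQL =
            "INSERT INTO transformations (name, file_path) VALUES (?, ?)";

    /** Writes all entries in one transaction, flushing every batchSize rows. Returns the row count. */
    static int writeAll(Connection connection, List<Entry> entries, int batchSize) throws SQLException {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        boolean previousAutoCommit = connection.getAutoCommit();
        connection.setAutoCommit(false); // one transaction instead of one per insert
        int queued = 0;
        try (PreparedStatement statement = connection.prepareStatement(INSERT_SQL)) {
            for (Entry entry : entries) {
                statement.setString(1, entry.transformationName());
                statement.setString(2, entry.filePath());
                statement.addBatch();
                queued++;
                if (queued % batchSize == 0) {
                    statement.executeBatch(); // flush a full batch
                }
            }
            if (queued % batchSize != 0) {
                statement.executeBatch(); // flush the remainder
            }
            connection.commit();
        } catch (SQLException e) {
            connection.rollback(); // leave the manifest unchanged on failure
            throw e;
        } finally {
            connection.setAutoCommit(previousAutoCommit);
        }
        return queued;
    }
}
```

With a real connection, the call would look like `BatchManifestWriter.writeAll(connection, entries, 1000)`; the batch size of 1000 is an arbitrary starting point to tune.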

Transformations in abstract classes with abstract methods give an error.

Describe the bug
Abstract methods in abstract classes don't have a body, so part of the Spoon library code returns null for them. This case is not handled, so an exception is thrown.

To Reproduce
Steps to reproduce the behavior:

  1. Run transformations on a class with only abstract methods

Expected behavior
What should happen, I think, is that the code ignores abstract methods.

Python Transformer - Some Tests are flaky

Some of the python tests fail in the docker container.
When duplicating the same test, the first fails and the second runs successfully.

Affected Tests:

These don't have much in common: one is for strings, one is for integers. Maybe side effects are involved.

To Reproduce
Steps to reproduce the behavior:

  1. Go to the Python Transformer repository
  2. Run docker build .

Alternatively, running pytest ./tests produces the same error.

Expected behavior
The tests should pass regardless of their order.
The test pipeline should succeed.

Desktop (please complete the following information):
Happens in Docker.
Does not happen when the tests are run in PyCharm.

There is a bug in AppTest.java.

Describe the bug
When running the command "mvn package" in "./Transformers/Java/", the test phase reports failures and errors.

You can fix it by editing line 34 of AppTest.java:
private static String expectedJavaFile = "./lampion/test/examples/example.java"; =>
private static String expectedJavaFile = "./lampion/test/examples/Example.java";

Screenshots
Here are a screenshot and the surefire report.
2024-03-04 201002

com.github.ciselab.lapion.cli.program.AppTest.txt

Java Transformer: Log4J2.xml is not used

Currently, the log4j2.xml is not used.
Neither the patterns nor the loggers specified in it are applied; the default configuration is used instead.

To Reproduce

Just run the preprocessing docker-compose;
it does not print the timestamps that are defined in the pattern.

This can be verified by deleting the log4j file: the behaviour stays identical.

Expected behavior
The logging configuration should be used, which also creates a log file.
The pattern from the XML should be honoured, e.g. by printing timestamps.

Additional context

I first thought that this was related to shading and to where the XML is placed.
However, it seems to go a bit deeper, as I tried to set the logger programmatically:

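    import org.apache.logging.log4j.Level;
    import org.apache.logging.log4j.core.config.Configurator;
    import org.apache.logging.log4j.core.config.builder.api.AppenderComponentBuilder;
    import org.apache.logging.log4j.core.config.builder.api.ConfigurationBuilder;
    import org.apache.logging.log4j.core.config.builder.api.ConfigurationBuilderFactory;
    import org.apache.logging.log4j.core.config.builder.api.LayoutComponentBuilder;
    import org.apache.logging.log4j.core.config.builder.api.RootLoggerComponentBuilder;
    import org.apache.logging.log4j.core.config.builder.impl.BuiltConfiguration;

    // Builds a console + file configuration in code instead of reading log4j2.xml.
    private static void configureLogger(){
        System.out.println("Programmatically setting Logger");

        ConfigurationBuilder<BuiltConfiguration> builder
                = ConfigurationBuilderFactory.newConfigurationBuilder();

        // Shared pattern layout: timestamp, level, short logger name, message
        LayoutComponentBuilder standard
                = builder.newLayout("PatternLayout");
        standard.addAttribute("pattern", "%d{yyyy-MM-dd HH:mm:ss} %-1p %c{1} %m%n");

        // Attach the layout before registering each appender with the builder
        AppenderComponentBuilder console
                = builder.newAppender("stdout", "Console");
        console.add(standard);
        builder.add(console);

        AppenderComponentBuilder file
                = builder.newAppender("flog", "File");
        file.addAttribute("fileName", "target/lampion.log");
        file.add(standard);
        builder.add(file);

        // Root logger at DEBUG, writing to both appenders
        RootLoggerComponentBuilder rootLogger
                = builder.newRootLogger(Level.DEBUG);
        rootLogger.add(builder.newAppenderRef("stdout"));
        rootLogger.add(builder.newAppenderRef("flog"));
        builder.add(rootLogger);

        var context = Configurator.initialize(builder.build());

        // `logger` is the static logger field of the surrounding class
        logger = context.getRootLogger();

        //LogManager.getRootLogger();

        System.out.println("Logger set");
    }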

I called this in main before the logger is set.
However, it gave me the following error:

2021-02-24 13:13:40,859 main ERROR LogManager returned an instance of org.apache.logging.slf4j.SLF4JLoggerContextFactory which does not implement org.apache.logging.log4j.core.impl.Log4jContextFactory. Unable to initialize Log4j.

So the problem seems to go a bit deeper, into which library supports what.

The same error also appears when using a log4j2.properties file instead.

Java Transformer: Use Dictionaries for Pseudo-Random Strings

At the moment, the pseudo-random strings use animal names hardcoded in the program.
It would be nicer to provide three text files with "nouns", "verbs" and "adjectives" and pick among them randomly.

Proposed Solution
On program start, look for a set of specified files and draw the words from them.
Use the current behaviour as the default if no files are found.

Possible Problems:

Encoding of the words can cause trouble.

For uniqueness / reproducibility, generate and log a hash of the files.
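
A minimal sketch of how loading one such word file could look, assuming one word per line, UTF-8 as the fixed encoding, and SHA-256 as the logged hash. The `WordDictionary` name and the three default words are placeholders; the real default would be the animal names currently hardcoded in the transformer.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;
import java.util.List;
import java.util.Random;
import java.util.stream.Collectors;

/** Hypothetical dictionary: the words plus the SHA-256 of the source file (null for defaults). */
record WordDictionary(List<String> words, String sha256) {

    // Placeholder for the animal names that are currently hardcoded.
    static final List<String> DEFAULT_WORDS = List.of("cat", "dog", "owl");

    /** Reads one word per line as UTF-8; falls back to the defaults if the file is missing or empty. */
    static WordDictionary load(Path file) throws IOException {
        if (file == null || !Files.isRegularFile(file)) {
            return new WordDictionary(DEFAULT_WORDS, null);
        }
        byte[] raw = Files.readAllBytes(file);
        List<String> words = new String(raw, StandardCharsets.UTF_8).lines()
                .map(String::strip)
                .filter(word -> !word.isEmpty())
                .collect(Collectors.toList());
        if (words.isEmpty()) {
            return new WordDictionary(DEFAULT_WORDS, null);
        }
        return new WordDictionary(List.copyOf(words), hash(raw));
    }

    /** Hex-encoded SHA-256 of the raw file bytes, suitable for logging alongside the seed. */
    private static String hash(byte[] raw) {
        try {
            return HexFormat.of().formatHex(MessageDigest.getInstance("SHA-256").digest(raw));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is required on every Java platform", e);
        }
    }

    /** Draws a word with the caller's Random, so a fixed seed yields the same word sequence. */
    String pick(Random random) {
        return words.get(random.nextInt(words.size()));
    }
}
```

Hashing the raw bytes rather than the decoded words means that a re-encoded file produces a different hash, which makes encoding problems visible in the log.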

Additional Context:
It might be super nice to specify a pattern for the transformers to draw from, e.g. how a comment should look, in the style [noun, verb, adjective, noun] with quantifiers etc.

Splitting Java Transformer into Library + CLI

At the moment, the transformer is a single, self-contained CLI application that starts from files and ends with changed files.
For more sophisticated tasks (e.g. search-based approaches), it is better if we can alter files dynamically rather than piping things over the console. Hence we are going to split it into a library that can be used more flexibly and a CLI that keeps the current behaviour.

Proposed Solution

  • Engine: Change to take in a CST and return a CST
  • Engine: Maybe add an Engine-Result object to also bring additional information (number of failures, number of successes, transformers used, transformer-results)
  • Everywhere: Change logging to be generic library logging, not Log4j (find out how)
  • Pom: Change Engine to be a library
  • New: Move App.java into new CLI Project
  • Library: Add Rename-Variable-Transformer
  • Update Dockers
  • Update Readmes / Documentation
  • Update CI(s) & Push new Images

New Layout:

sample/
└── java
    ├── cli
    │   ├── Dockerfile
    │   ├── pom.xml
    │   └── [...]
    └── core
        ├── pom.xml
        └── [...]

Things to consider:

  • What scope of CST does the Engine need? At the moment it runs on CodeRoot, but can it also work on lower scopes?
  • Some reading on logging needs to be done
  • Maybe the engine results can be stored as attributes of the engine, e.g. keep the failures as a class attribute and query them with engine.getFailures(), engine.getTransformationResults(), etc., so that the engine takes and returns pure ASTs.
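
A rough sketch of the Engine-Result variant from the proposed solution, to make the trade-off concrete. It is generic over the tree type `T` because the scope of Spoon element the engine should accept is still open; `EngineSketch`, `EngineResult`, and the use of `UnaryOperator` as a stand-in for a transformer are all assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical result object: the transformed tree plus bookkeeping about the run. */
record EngineResult<T>(T tree, int successes, List<String> failures) {}

/** Hypothetical library-style engine: takes a tree, returns a tree plus results, touches no files. */
final class EngineSketch<T> {

    private final List<UnaryOperator<T>> transformers;

    EngineSketch(List<UnaryOperator<T>> transformers) {
        this.transformers = List.copyOf(transformers);
    }

    /** Applies every transformer in order; a failing transformer is recorded and skipped. */
    EngineResult<T> run(T tree) {
        T current = tree;
        int successes = 0;
        List<String> failures = new ArrayList<>();
        for (UnaryOperator<T> transformer : transformers) {
            try {
                current = transformer.apply(current);
                successes++;
            } catch (RuntimeException e) {
                failures.add(String.valueOf(e.getMessage()));
            }
        }
        return new EngineResult<>(current, successes, List.copyOf(failures));
    }
}
```

Returning the results together with the tree keeps the engine stateless between runs. The attribute-based alternative (engine.getFailures() and similar) would keep the return type a pure tree, at the cost of state that has to be reset between runs.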

Possible Problems:

  • Logging might be very hard
  • Spoon might have some logic regarding file and file placements

Related Issues:
#65 can be done on the fly

Additional Context:
This is done as part of Rubens @wubero's MSc.

CodeBert-Preprocessor fails on unicode references

The CodeBert-Preprocessor fails to preprocess Java files containing character sequences starting with "\",
such as "\u00".
The resulting jsonl has an unescaped "\" and fails to be parsed.

To Reproduce

  1. Move into the CodeBert Preprocessing folder
  2. Add the content of error.txt to the example java file
  3. Run the Preprocessing on the example java file using the docker-compose
  4. Inspect the altered_java.jsonl for \u00 characters

Expected behavior

The backslash should be properly escaped, as in \\u00.
In any case, the resulting JSON must be valid.
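
To illustrate the escaping rule in question: inside a JSON string, a backslash from the source code has to be written as two backslashes, otherwise a sequence like \u00 followed by non-hex characters is an invalid JSON escape. The sketch below is a hand-written escaper following the JSON string rules (quote, backslash, and control characters). `JsonStrings` is a hypothetical helper, not part of the preprocessor; in practice, letting an established JSON library serialize the line is the safer choice.

```java
/** Hypothetical helper showing the JSON string escaping rules; not the preprocessor's code. */
final class JsonStrings {

    /** Escapes quotes, backslashes and control characters so the result is a valid JSON string body. */
    static String escape(String raw) {
        StringBuilder out = new StringBuilder(raw.length() + 16);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '"':
                    out.append("\\\"");
                    break;
                case '\\':
                    // A literal backslash in the Java source becomes two in the JSON text.
                    out.append("\\\\");
                    break;
                case '\n':
                    out.append("\\n");
                    break;
                case '\r':
                    out.append("\\r");
                    break;
                case '\t':
                    out.append("\\t");
                    break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the four-digit hex form.
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }

    /** Builds a single-field JSON object line, e.g. for one jsonl record. */
    static String toJsonLine(String key, String value) {
        return "{\"" + escape(key) + "\": \"" + escape(value) + "\"}";
    }
}
```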

Additional context

This was needed for the GridExperiment and is currently worked around by removing the 3 datapoints that contain a \u from the test data.

PickRandomMethod returns exception

Describe the bug
When running a transformer on a dataset with the perClass transformation scope, the program tries to apply the transformers to every class, including enums. When the pickRandomMethod function is then called, the list of all methods is empty, so the bound for picking a random method is 0, which is not allowed.

Expected behavior
I expect this to be handled by either logging what is wrong or just ignoring the enums for this scope.
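
A minimal sketch of the "ignore" option, assuming the candidates arrive as a plain list. `Random.nextInt(bound)` throws an `IllegalArgumentException` for a bound of 0, so the empty case is checked first and reported as an empty `Optional` that the caller can log or skip. `RandomPick` and `pickRandom` are hypothetical names, not the existing function.

```java
import java.util.List;
import java.util.Optional;
import java.util.Random;

/** Hypothetical helper; the real pickRandomMethod works on Spoon elements. */
final class RandomPick {

    /** Returns a random element, or Optional.empty() when there is nothing to pick from. */
    static <T> Optional<T> pickRandom(List<T> candidates, Random random) {
        if (candidates == null || candidates.isEmpty()) {
            // e.g. an enum or interface without methods: nothing to transform
            return Optional.empty();
        }
        return Optional.of(candidates.get(random.nextInt(candidates.size())));
    }
}
```

A caller can then log the skipped class and move on, for example with `pickRandom(methods, random).ifPresentOrElse(this::transform, () -> logger.debug("no methods to pick from"))`, where `transform` and `logger` stand for whatever the surrounding transformer provides.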

Align Java and Python Transformer Console-Interaction

At the moment, there are 3 noteworthy differences that should perhaps be addressed:

  • The Python transformer takes the target as an argument; the Java transformer has it in the config
  • The Java transformer has an "undo" task that cleans the repository
  • The scopes are written slightly differently (perClass and per_class)

These differences can be handled in the Docker container, so the experience can still be homogeneous, but I wanted to note them here.
