ciselab / lampion

Metamorphic Transformations for ML-SE Robustness Analysis

License: MIT License

Java 57.89% Dockerfile 0.73% Shell 0.80% Python 28.27% Jupyter Notebook 10.86% R 0.67% Jinja 0.78%

lampion's Issues

Java Transformer: Increase Manifest Writing Speed

In the larger experiments, writing the manifest seems to be the bottleneck of the preprocessing pipeline: it takes a few minutes for 100k transformations.
No performance tuning has been done so far, so this would be a worthwhile improvement when there is some time left.

Proposed Solution

Use batch writing at the places marked in the SQLite manifest writer.
Look for hints in the SQLite library / JDBC documentation on how to make it faster.

Possible Alternatives:

Add a flag to not write manifests when not required, maybe write transformations in a reduced spectrum.

Possible Problems:
There should be no problems, except that it might not work.

Additional Context:

The JDBC Page on Batch processing

Check if this is applicable to SQLite.
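
The batch-writing idea can be sketched with plain JDBC as follows. This is only a minimal illustration, not the project's manifest writer: the table name, the column, and the class and method names are hypothetical, and whether batching actually helps with the SQLite driver still needs to be measured. The sketch disables auto-commit so that all inserts share one transaction, and flushes the batch every batchSize rows.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

final class BatchManifestSketch {
    // Hypothetical table and column; the real manifest schema may differ.
    static final String INSERT_SQL =
            "INSERT INTO transformations (transformation_name) VALUES (?)";

    /** Writes all names in batches and returns how often a batch was flushed. */
    static int writeBatched(Connection connection, List<String> names, int batchSize)
            throws SQLException {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        boolean previousAutoCommit = connection.getAutoCommit();
        // One transaction for all rows instead of one commit per insert.
        connection.setAutoCommit(false);
        int flushes = 0;
        try (PreparedStatement statement = connection.prepareStatement(INSERT_SQL)) {
            int pending = 0;
            for (String name : names) {
                statement.setString(1, name);
                statement.addBatch();
                if (++pending == batchSize) {
                    statement.executeBatch();
                    pending = 0;
                    flushes++;
                }
            }
            if (pending > 0) { // flush the last, partially filled batch
                statement.executeBatch();
                flushes++;
            }
            connection.commit();
        } catch (SQLException e) {
            connection.rollback();
            throw e;
        } finally {
            connection.setAutoCommit(previousAutoCommit);
        }
        return flushes;
    }
}
```

With 100k transformations and a batch size of, say, 1000, this would issue 100 executeBatch calls and a single commit.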

Transformations in abstract classes with abstract methods give an error.

Describe the bug
The abstract methods in abstract classes have no body, so part of the Spoon library code returns null. This case is not handled, so an exception is thrown.

To Reproduce
Steps to reproduce the behavior:

Run transformations on a class with only abstract methods

Expected behavior
I think the code should ignore abstract methods.

Python Transformer - Some Tests are flaky

Some of the Python tests fail in the Docker container.
When the same test is duplicated, the first copy fails and the second runs successfully.

Affected Tests:

These do not have much in common: one is for strings, one is for integers. Maybe there are side effects at play.

To Reproduce
Steps to reproduce the behavior:

Go to the Python Transformer repository
Run docker build .

Optionally, python pytest ./tests also produces the same error.

Expected behavior
The tests should run regardless of their order.
The test pipeline should be succeeding.

Desktop:
Happens in the Docker container.
Does not happen when the test is run in PyCharm.

There is a bug in APPTest.java.

Describe the bug
When running "mvn package" in "./Transformers/Java/", the test phase reports failures and errors.

You can fix it by editing line 34 of APPTest.java:
private static String expectedJavaFile = "./lampion/test/examples/example.java"; =>
private static String expectedJavaFile = "./lampion/test/examples/Example.java";

Screenshots
Here are the screenshots and surefire reports.

com.github.ciselab.lapion.cli.program.AppTest.txt

Java Transformer: Log4J2.xml is not used

Currently, the Log4j2.xml is not used.
Neither the specified patterns nor the specified loggers are applied; the default configuration is used instead.

To Reproduce

Just run the preprocessing docker-compose;
it does not print the timestamps that are defined in the pattern.

This can be verified: the behaviour is identical when the log4j file is deleted.

Expected behavior
The logging configuration should be used, which also creates a log file.
The pattern from the XML should be applied, e.g. printing timestamps.

Additional context

I first thought that this was related to shading and to where the XML is placed.
However, it seems to go deeper, as I tried to set the logger programmatically:

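    // Imports needed by this snippet (log4j-api and log4j-core):
    import org.apache.logging.log4j.Level;
    import org.apache.logging.log4j.core.config.Configurator;
    import org.apache.logging.log4j.core.config.builder.api.AppenderComponentBuilder;
    import org.apache.logging.log4j.core.config.builder.api.ConfigurationBuilder;
    import org.apache.logging.log4j.core.config.builder.api.ConfigurationBuilderFactory;
    import org.apache.logging.log4j.core.config.builder.api.LayoutComponentBuilder;
    import org.apache.logging.log4j.core.config.builder.api.RootLoggerComponentBuilder;
    import org.apache.logging.log4j.core.config.builder.impl.BuiltConfiguration;

    private static void configureLogger(){
        System.out.println("Programmatically setting Logger");

        ConfigurationBuilder<BuiltConfiguration> builder
                = ConfigurationBuilderFactory.newConfigurationBuilder();

        // Shared layout with a timestamp pattern
        LayoutComponentBuilder standard
                = builder.newLayout("PatternLayout");
        standard.addAttribute("pattern", "%d{yyyy-MM-dd HH:mm:ss} %-1p %c{1} %m%n");

        // Console appender; the layout is attached before the appender is registered
        AppenderComponentBuilder console
                = builder.newAppender("stdout", "Console");
        console.add(standard);
        builder.add(console);

        // File appender writing to target/lampion.log
        AppenderComponentBuilder file
                = builder.newAppender("flog", "File");
        file.addAttribute("fileName", "target/lampion.log");
        file.add(standard);
        builder.add(file);

        // Root logger at DEBUG level, wired to both appenders
        RootLoggerComponentBuilder rootLogger
                = builder.newRootLogger(Level.DEBUG);
        rootLogger.add(builder.newAppenderRef("stdout"));
        rootLogger.add(builder.newAppenderRef("flog"));
        builder.add(rootLogger);

        var context = Configurator.initialize(builder.build());

        // "logger" is the static logger field of the surrounding class
        logger = context.getRootLogger();

        //LogManager.getRootLogger();

        System.out.println("Logger set");
    }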

I called this in the main before the logger is set.
However, it gave me the following error:

2021-02-24 13:13:40,859 main ERROR LogManager returned an instance of org.apache.logging.slf4j.SLF4JLoggerContextFactory which does not implement org.apache.logging.log4j.core.impl.Log4jContextFactory. Unable to initialize Log4j.

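So the problem seems to lie deeper, in which library supports what. The message suggests that the Log4j API is bound to an SLF4J bridge (SLF4JLoggerContextFactory comes from the log4j-to-slf4j artifact) rather than to log4j-core.

This error also appears when using a log4j2.properties file instead.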

Java Transformer: Use Dictionaries for Pseudo-Random Strings

At the moment, the pseudo-random strings use animal names that are hardcoded in the program.
It would be nicer to provide three text files with "nouns", "verbs" and "adjectives" and pick among them randomly.

Proposed Solution
On Program start, look for a set of specified files and draw the words from them.
Use the current behaviour as default if no files are found.

Possible Problems:

Encoding of the words can be troublesome

For uniqueness / reproducibility, generate and log a hash of the files.
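
A minimal sketch of this proposal, assuming one word per line in UTF-8 text files. The class name, method names, and the default word list are hypothetical placeholders, not the transformer's actual code. A file that is missing or empty falls back to the defaults, a file that is not valid UTF-8 fails loudly with an IOException, and the SHA-256 hash of each file can be logged for reproducibility.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;
import java.util.Random;
import java.util.stream.Collectors;

final class WordDictionarySketch {
    // Placeholder for the currently hardcoded words (assumption for this sketch).
    static final List<String> DEFAULT_WORDS = List.of("aardvark", "badger", "cat");

    /** Reads one word per line; falls back to the defaults if the file is missing or empty. */
    static List<String> loadWords(Path file) throws IOException {
        if (file == null || !Files.isRegularFile(file)) {
            return DEFAULT_WORDS;
        }
        // readAllLines throws an IOException on malformed UTF-8 input,
        // so encoding problems surface at program start.
        List<String> words = Files.readAllLines(file, StandardCharsets.UTF_8).stream()
                .map(String::trim)
                .filter(word -> !word.isEmpty())
                .collect(Collectors.toList());
        return words.isEmpty() ? DEFAULT_WORDS : words;
    }

    /** Hex-encoded SHA-256 of the file content, suitable for logging. */
    static String sha256Hex(Path file) throws IOException {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(Files.readAllBytes(file))) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every JVM", e);
        }
    }

    /** Picks a word using the given (seeded) random source. */
    static String pick(List<String> words, Random random) {
        return words.get(random.nextInt(words.size()));
    }
}
```

Passing the transformer's seeded Random into pick keeps the generated strings reproducible for a given seed and a given set of files.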

Additional Context:
It would be very nice to be able to specify a pattern for the transformers to draw from, e.g. a pattern describing what a comment looks like, in the style [noun,verb,adjective,noun] with quantifiers etc.

Splitting Java Transformer into Library + CLI

At the moment, the transformer is a single, self-contained CLI application that starts from files and ends with changed files.
For more sophisticated tasks (e.g. search-based ones), it is better if we can alter files more dynamically than by piping things over the console. Hence, we are going to split it into a library that can be used more flexibly and a CLI that keeps the current behaviour.

Proposed Solution

Engine: Change to take in a CST and return a CST
Engine: Maybe add an Engine-Result object to also bring additional information (number of failures, number of successes, transformers used, transformer-results)
Everywhere: Change logging to be generic library logging, not Log4j (find out how)
Pom: Change Engine to be a library
New: Move App.java into new CLI Project
Library: Add Rename-Variable-Transformer
Update Dockers
Update Readmes / Documentation
Update CI(s) & Push new Images

New Layout:

sample/
└── java
    ├── cli
    │   ├── Dockerfile
    │   ├── pom.xml
    │   └── [...]
    └── core
        ├── pom.xml
        └── [...]

Things to consider:

What scope of CST does the Engine need? At the moment it runs on CodeRoot, but can it also work on lower scopes?
Some reading on logging needs to be done
Maybe the engine results can be stored as attributes in the engine, e.g. keep the failures as a class attribute and ask for them with engine.getFailures(), engine.getTransformationResults(), etc., and have the engine take and return pure ASTs.

Possible Problems:

Logging might be very hard
Spoon might have some logic regarding file and file placements

Related Issues:
#65 can be done on the fly

Additional Context:
This is done as part of Rubens @wubero 's MSc

CodeBert-Preprocessor fails on unicode references

The CodeBert-Preprocessor fails to preprocess Java files containing character sequences starting with "\",
such as "\u00".
The resulting jsonl has an unescaped "\" and fails to be parsed.

To Reproduce

Move into the CodeBert preprocessing folder
Add the content of error.txt to the example java file
Run the Preprocessing on the example java file using the docker-compose
Inspect the altered_java.jsonl for \u00 characters

Expected behavior

The character should be properly escaped as \u00.
In any case, the resulting JSON must be valid.
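
As an illustration of what correct escaping involves, here is a minimal Java sketch of JSON string escaping. It is not the preprocessor's code (the class and method names are hypothetical), and in practice a JSON library should do this work. The key point is that every backslash in the source text must be doubled before the string is written into the jsonl, otherwise an incomplete unicode reference from the Java file ends up as an invalid JSON escape.

```java
final class JsonEscapeSketch {
    /** Escapes a raw string so it can be placed between double quotes in JSON. */
    static String escape(String raw) {
        StringBuilder out = new StringBuilder(raw.length() + 16);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '\\': out.append("\\\\"); break; // a backslash becomes two backslashes
                case '"':  out.append("\\\""); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters need a four-digit hex escape.
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }
}
```

With this, a source fragment consisting of a backslash followed by u00 is written with a doubled backslash, which a JSON parser reads back as the original four characters.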

Additional context

This was needed for the GridExperiment, and has currently been worked around by removing the 3 data points that contain a \u from the test data.

PickRandomMethod returns exception

Describe the bug
When running a transformer on a dataset with the perClass transformation scope, the program tries to apply the transformers to every class. This includes enums. However, when the pickRandomMethod function runs on such a class, the list of all methods is empty. The bound for picking a random method is then 0, which is not allowed.

Expected behavior
I expect this to be handled by either logging what is wrong or just ignoring the enums for this scope.
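
The failure matches the behaviour of java.util.Random: nextInt(bound) throws an IllegalArgumentException when the bound is not positive. A minimal sketch of a guard, assuming a generic helper (the class and method names are hypothetical, not the actual pickRandomMethod signature): an empty candidate list yields an empty Optional, which the caller can log and skip.

```java
import java.util.List;
import java.util.Optional;
import java.util.Random;

final class RandomPickSketch {
    /** Returns a random element, or Optional.empty() if there is nothing to pick from. */
    static <T> Optional<T> pickRandom(List<T> candidates, Random random) {
        if (candidates == null || candidates.isEmpty()) {
            // Avoids calling random.nextInt(0), which would throw.
            return Optional.empty();
        }
        return Optional.of(candidates.get(random.nextInt(candidates.size())));
    }
}
```

A caller could then skip classes without methods (such as the enums described above) with a log message instead of aborting the whole run.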

Align Java and Python Transformer Console-Interaction

At the moment, there are 3 noteworthy differences that should probably be addressed:

The Python transformer takes the target as an argument; the Java transformer has it in the config
The Java transformer has an "undo" task that cleans the repository
The scopes are written slightly differently (perClass and per_class)

These differences can be handled in the Docker container so that the experience stays homogeneous, but I wanted to note them here.
