ciselab / lampion
Metamorphic Transformations for ML-SE Robustness Analysis
License: MIT License
In the larger experiments, writing the manifest seems to be the bottleneck of the preprocessing pipeline, taking a few minutes for 100k transformations.
No performance tuning has been done so far, so this would be a nice improvement when there is some time left.
Proposed Solution
Use a batch-writing process at the marked places in the SQLite manifest writer.
Look for hints in the SQLite library / JDBC documentation on how to make it faster.
Possible Alternatives:
Add a flag to skip writing manifests when they are not required, or write the transformations in a reduced form.
Possible Problems:
There should be no problems, except that it might not work.
Additional Context:
The JDBC page on batch processing
Check whether this is applicable to SQLite
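A minimal sketch of what batch writing could look like with plain JDBC, assuming the SQLite driver in use supports addBatch/executeBatch. The class name BatchManifestWriter, the table "transformations" and its two columns are hypothetical and not taken from the actual manifest schema. With SQLite, switching off auto-commit so that all inserts share one transaction is typically a large part of the gain, so the sketch does both.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public final class BatchManifestWriter {

    // Hypothetical table and columns; the real manifest schema may differ.
    private static final String INSERT_SQL =
            "INSERT INTO transformations (name, position) VALUES (?, ?)";

    private BatchManifestWriter() { }

    /**
     * Writes all rows inside one transaction and flushes every batchSize rows.
     * Returns the number of executeBatch() calls that were made.
     */
    public static int writeAll(Connection conn, List<String[]> rows, int batchSize)
            throws SQLException {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        boolean previousAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false); // one transaction instead of one per insert
        int flushes = 0;
        try (PreparedStatement stmt = conn.prepareStatement(INSERT_SQL)) {
            int pending = 0;
            for (String[] row : rows) {
                stmt.setString(1, row[0]);
                stmt.setString(2, row[1]);
                stmt.addBatch();
                if (++pending == batchSize) {
                    stmt.executeBatch();
                    flushes++;
                    pending = 0;
                }
            }
            if (pending > 0) { // flush the incomplete last batch
                stmt.executeBatch();
                flushes++;
            }
            conn.commit();
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        } finally {
            conn.setAutoCommit(previousAutoCommit);
        }
        return flushes;
    }
}
```

Whether this is actually faster for the manifest should be measured on one of the 100k-transformation runs before and after the change.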
Describe the bug
Abstract methods in abstract classes have no body, so part of the Spoon library code returns null for them. This case is not handled, so an exception is thrown.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
What should happen, I think, is that the code ignores abstract methods.
Some of the python tests fail in the docker container.
When duplicating the same test, the first fails and the second runs successfully.
Affected Tests:
These do not have much in common: one is for strings, one is for integers. There may be side effects at play.
To Reproduce
Steps to reproduce the behavior:
docker build .
Optionally, running the Python tests with pytest ./tests also produces the same error.
Expected behavior
The tests should run regardless of their order.
The test pipeline should succeed.
Desktop (please complete the following information):
Happens in the Docker container.
Does not happen when the tests are run in PyCharm.
Describe the bug
When running "mvn package" in "./Transformers/Java/", the test phase reports failures and errors.
You can fix this by editing line 34 of APPTest.java (the file name starts with a capital letter):
private static String expectedJavaFile = "./lampion/test/examples/example.java"; =>
private static String expectedJavaFile = "./lampion/test/examples/Example.java";
Currently, the Log4j2.xml is not used.
Neither the specified patterns nor the loggers are applied; the default configuration is used instead.
To Reproduce
Just run the preprocessing docker-compose;
it does not print the timestamps that the pattern defines.
This can be verified by deleting the log4j file: the behaviour is identical.
Expected behavior
The logging configuration should be used, which also creates a log file.
The pattern from the XML should be honoured, printing e.g. timestamps.
Additional context
I first thought that this was related to shading and to where the XML is placed.
However, the problem seems to go deeper, as I tried to set the logger programmatically:
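import org.apache.logging.log4j.Level;
import org.apache.logging.log4j.core.config.Configurator;
import org.apache.logging.log4j.core.config.builder.api.AppenderComponentBuilder;
import org.apache.logging.log4j.core.config.builder.api.ConfigurationBuilder;
import org.apache.logging.log4j.core.config.builder.api.ConfigurationBuilderFactory;
import org.apache.logging.log4j.core.config.builder.api.LayoutComponentBuilder;
import org.apache.logging.log4j.core.config.builder.api.RootLoggerComponentBuilder;
import org.apache.logging.log4j.core.config.builder.impl.BuiltConfiguration;

private static void configureLogger(){
    System.out.println("Programmatically setting Logger");
    ConfigurationBuilder<BuiltConfiguration> builder
            = ConfigurationBuilderFactory.newConfigurationBuilder();
    // Pattern layout with timestamps, shared by both appenders
    LayoutComponentBuilder standard
            = builder.newLayout("PatternLayout");
    standard.addAttribute("pattern", "%d{yyyy-MM-dd HH:mm:ss} %-1p %c{1} %m%n");
    // Attach the layout before registering each appender with the builder
    AppenderComponentBuilder console
            = builder.newAppender("stdout", "Console");
    console.add(standard);
    builder.add(console);
    AppenderComponentBuilder file
            = builder.newAppender("flog", "File");
    file.addAttribute("fileName", "target/lampion.log");
    file.add(standard);
    builder.add(file);
    // Root logger writes to console and file
    RootLoggerComponentBuilder rootLogger
            = builder.newRootLogger(Level.DEBUG);
    rootLogger.add(builder.newAppenderRef("stdout"));
    rootLogger.add(builder.newAppenderRef("flog"));
    builder.add(rootLogger);
    var context = Configurator.initialize(builder.build());
    logger = context.getRootLogger();
    //LogManager.getRootLogger();
    System.out.println("Logger set");
}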
I call this in main before the logger is set.
However, it gave me the following error:
2021-02-24 13:13:40,859 main ERROR LogManager returned an instance of org.apache.logging.slf4j.SLF4JLoggerContextFactory which does not implement org.apache.logging.log4j.core.impl.Log4jContextFactory. Unable to initialize Log4j.
So the underlying problem seems to be which library supports what.
The same error also appears when using a log4j2.properties file instead.
At the moment, the pseudo-random strings use animal names that are hardcoded in the program.
It would be nicer to provide three text files with "nouns", "verbs" and "adjectives" and pick among them randomly.
Proposed Solution
On program start, look for a set of specified files and draw the words from them.
Use the current behaviour as default if no files are found.
Possible Problems:
The encoding of the words can cause trouble.
For uniqueness / reproducibility, generate and log a hash of the files.
Additional Context:
It might be super nice to specify a pattern for the transformers to draw from, e.g. a pattern for how a comment looks in the style [noun,verb,adjective,noun] with quantifiers etc.
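A rough sketch of the proposal, assuming one UTF-8 text file per word class with one word per line. The class name WordPool, the default word list and the choice of SHA-256 are all illustrative assumptions, not existing Lampion code. Missing or empty files fall back to the defaults, and the hash can be logged next to the seed for reproducibility.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public final class WordPool {

    // Stand-in for the animal names that are currently hardcoded.
    public static final List<String> DEFAULT_WORDS = List.of("cat", "dog", "mouse");

    private final List<String> words;
    private final String sha256; // hex digest of the file, or "default"

    private WordPool(List<String> words, String sha256) {
        this.words = words;
        this.sha256 = sha256;
    }

    /** Loads one word per line; falls back to the defaults if the file is missing or empty. */
    public static WordPool load(Path file) throws IOException {
        if (!Files.isRegularFile(file)) {
            return new WordPool(DEFAULT_WORDS, "default");
        }
        byte[] raw = Files.readAllBytes(file);
        List<String> loaded = new ArrayList<>();
        // Decoding explicitly as UTF-8 avoids platform-dependent results.
        for (String line : new String(raw, StandardCharsets.UTF_8).split("\\R")) {
            String word = line.trim();
            if (!word.isEmpty()) {
                loaded.add(word);
            }
        }
        if (loaded.isEmpty()) {
            return new WordPool(DEFAULT_WORDS, "default");
        }
        return new WordPool(List.copyOf(loaded), hexSha256(raw));
    }

    /** Picks a word using the caller's (seeded) random generator. */
    public String pick(Random random) {
        return words.get(random.nextInt(words.size()));
    }

    public List<String> words() { return words; }

    public String sha256() { return sha256; }

    private static String hexSha256(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(data);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }
}
```

Hashing the raw bytes rather than the decoded words means that a change in encoding or line endings also shows up in the logged hash.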
At the moment, the transformer is a single, self-contained CLI application that starts from files and ends with changed files.
For more sophisticated tasks (e.g. search-based ones), it is better if we can alter files more dynamically than by piping things over the console. Hence we are going to split it into a library that can be used more flexibly and a CLI that keeps the current behaviour.
Proposed Solution
New Layout:
sample/
└── java
    ├── cli
    │   ├── Dockerfile
    │   ├── pom.xml
    │   └── [...]
    └── core
        ├── pom.xml
        └── [...]
Things to consider:
engine.getFailures(), engine.getTransformationResults(), etc. and have the engine return and take pure ASTs.
Possible Problems:
Related Issues:
#65 can be done on the fly
Additional Context:
This is done as part of Rubens @wubero 's MSc
The CodeBERT preprocessor fails to preprocess Java files containing character sequences starting with "\",
such as "\u00".
The resulting jsonl has an unescaped "\" and fails to be parsed.
To Reproduce
Expected behavior
The character should be properly escaped (e.g. as \\u00).
In any case, the resulting JSON must be valid.
Additional context
This was needed for the GridExperiment and has currently been worked around by removing the 3 data points that have a \u in them from the test data.
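To illustrate the expected escaping, here is a small hand-written JSON string escaper. It is sketched in Java to match the other code in these reports and is not the preprocessor's actual code; the class name JsonStrings is made up. In practice, serialising through an established JSON library should give the same result with less risk.

```java
public final class JsonStrings {

    private JsonStrings() { }

    /** Escapes a raw string so it can be placed between double quotes in JSON. */
    public static String escape(String raw) {
        StringBuilder out = new StringBuilder(raw.length() + 16);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '"':  out.append("\\\""); break;
                // A literal backslash must be doubled, otherwise the parser
                // reads it as the start of an escape sequence.
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                case '\b': out.append("\\b"); break;
                case '\f': out.append("\\f"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters need a 4-digit hex escape.
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }
}
```

With this, a source snippet containing a backslash followed by u00 ends up in the jsonl with a doubled backslash, which a JSON parser reads back as the original text.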
Describe the bug
When running a transformer on a dataset with the perClass transformation scope, the program tries to apply the transformers to every class, including enums. However, when the pickRandomMethod function runs on such a class, the list of all methods is empty. The bound for picking a random method is then 0, which is not allowed.
Expected behavior
I expect this to be handled by either logging what is wrong or just ignoring the enums for this scope.
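One way to handle this, sketched with a generic helper instead of the real pickRandomMethod (whose signature is not shown here, so RandomPicker and pickRandom are hypothetical names): return an empty Optional when there are no candidates, so the caller can log and skip the class. java.util.Random.nextInt(bound) throws an IllegalArgumentException for a bound of 0, which is what the guard avoids.

```java
import java.util.List;
import java.util.Optional;
import java.util.Random;

public final class RandomPicker {

    private RandomPicker() { }

    /**
     * Picks a random element, or returns Optional.empty() when there is
     * nothing to pick from (e.g. an enum without methods).
     */
    public static <T> Optional<T> pickRandom(List<T> candidates, Random random) {
        if (candidates == null || candidates.isEmpty()) {
            return Optional.empty(); // avoids random.nextInt(0)
        }
        return Optional.of(candidates.get(random.nextInt(candidates.size())));
    }
}
```

A caller could then write something like pickRandom(methods, random).ifPresent(...) and log a message for classes that were skipped.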
At the moment, there are 3 noteworthy differences that should perhaps be addressed.
These differences can be absorbed by the Docker container, so the experience stays homogeneous, but I wanted to note them here.