bric3 / drain-java
This is a pet project to explore log pattern extraction using DRAIN
License: Mozilla Public License 2.0
The JVM's WatchService suffers from a few drawbacks in how it integrates with the OS. On Linux in particular, events on bind mounts are not received.
Let's investigate alternatives, in particular Gradle's native integration: https://github.com/gradle/native-platform
+ implementation("net.rubygrapefruit:file-events:0.22")
+ implementation("net.rubygrapefruit:native-platform:0.22")
File-watching capabilities only just appeared in the 0.22 milestone. Unfortunately it is not completely released: the platform-specific native libraries are not published on Bintray for the published milestone.
Releases to follow: https://github.com/gradle/native-platform/releases
However, native-platform:0.21 is available, so it is possible to play with some of the API, such as the terminal or files, e.g.:
try {
    Terminals terminals = Native.get(Terminals.class);
    var isTerminal = terminals.withAnsiOutput().isTerminal(Output.Stdout);
    if (isTerminal) {
        var terminal = terminals.withAnsiOutput().getTerminal(Output.Stdout);
        terminal.write("Hello");
        SECONDS.sleep(5);
        terminal.cursorStartOfLine()
                .clearToEndOfLine()
                .bold().write("Bold hello")
                .reset();
    }
} catch (InterruptedException e) {
    Thread.currentThread().interrupt();
}
Hint: use https://jreleaser.org/
On Windows the build fails with
MappedFileLineReaderTest.find_start_position_given_last_lines()
org.opentest4j.AssertionFailedError:
expected: 42L
but was : 43L
MappedFileLineReaderTest.can_read_from_position()
org.opentest4j.AssertionFailedError:
expected: 183L
but was : 186L
MappedFileLineReaderTest.should_watch_with_channel_sink(Path)
java.io.IOException: Failed to delete temp directory C:\Users\RUNNER~1\AppData\Local\Temp\junit8439001560896197356. The following paths could not be deleted (see suppressed exceptions for details): , test4653189040998961269log
On MacOS the build fails with
org.opentest4j.AssertionFailedError:
expected: 592L
but was : 38L
Drain is a log mining algorithm; its idea is to find patterns and group similar log event messages.
Good practice is to have the miner process only the message part, i.e. to strip elements like the date, the severity, the thread, and the logger name.
Yet it might be interesting to keep some of this information. For example, the severity and the logger name are unlikely to have a high cardinality, and may be good candidates for log cluster metadata.
2021-03-29 12:55:24.172 [] DEBUG --- [ restartedMain] o.s.b.w.s.ServletContextInitializerBeans : Mapping filters: filterRegistrationBean urls=[/*] order=-2147483647, requestContextFilter urls=[/*] order=-1, contextServletRequestFilter urls=[/*] order=-2147483648, characterEncodingFilter urls=[/*] order=-2147483648, edgeRequestContextFilter urls=[/*] order=-2147483646, hideEdgeTechnicalEndpointsFilter urls=[/*] order=-2147483646, enableDebugLogsFilter urls=[/*] order=-2147483645, newrelicTransactionsFilter urls=[/*] order=-2147483645, accountingFilter urls=[/*] order=-2147483644, formContentFilter urls=[/*] order=-9900, disabledForwardedHeaderFilter urls=[/*] order=2147483647
2021-03-29 12:55:24.173 [] DEBUG --- [ restartedMain] o.s.b.w.s.ServletContextInitializerBeans : Mapping servlets: metricsService urls=[/metrics], dispatcherServlet urls=[/rest/*, /doc/*, /actuator/*, /error/*, /favicon.ico], com.blablacar.common.java.web.JerseyConfig urls=[/*]
2021-03-29 12:55:24.554 [] INFO --- [ restartedMain] o.s.b.a.e.web.EndpointLinksResolver : Exposing 3 endpoint(s) beneath base path '/actuator'
2021-03-29 12:55:24.606 [] INFO --- [ restartedMain] o.s.s.concurrent.ThreadPoolTaskExecutor : Initializing ExecutorService 'applicationTaskExecutor'
2021-03-29 12:55:24.972 [] INFO --- [ restartedMain] o.s.b.d.a.OptionalLiveReloadServer : LiveReload server is running on port 35729
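As a rough illustration of keeping the severity and logger name while handing only the message to the miner, the sketch below splits lines shaped like the sample above with a regular expression. `LogLineSplitter`, `LogEvent` and the exact line layout are assumptions made for this example; none of them are part of drain-java.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of drain-java): splits a log line into metadata and message. */
final class LogLineSplitter {
    // Assumed layout: "<date> <time> [<ctx>] <LEVEL> --- [<thread>] <logger> : <message>"
    private static final Pattern LINE = Pattern.compile(
            "(\\S+ \\S+)\\s+\\[[^\\]]*\\]\\s+(\\w+)\\s+---\\s+\\[\\s*([^\\]]*?)\\s*\\]\\s+(\\S+)\\s+:\\s(.*)");

    /** Low-cardinality metadata plus the message that would be fed to the miner. */
    static final class LogEvent {
        final String severity;
        final String logger;
        final String message;

        LogEvent(String severity, String logger, String message) {
            this.severity = severity;
            this.logger = logger;
            this.message = message;
        }
    }

    /** Returns empty when the line does not follow the assumed layout. */
    static Optional<LogEvent> split(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return Optional.empty();
        }
        // Groups 1 (timestamp) and 3 (thread) are captured but dropped here.
        return Optional.of(new LogEvent(m.group(2), m.group(4), m.group(5)));
    }
}
```

Only `message` would go through the miner; `severity` and `logger` could then be attached to the resulting cluster as metadata.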
Currently in drain mode, the discovered log clusters are only dumped (printed) once the log file has been entirely read.
This is not suitable when a log file needs to be watched; there should be some mechanism to print the clusters on demand, e.g. on a signal (see kill -l, without overriding the standard ones that are already handled by the JVM) or on ctrl+d.
Currently the code treats each line as a log event message. However, in some production systems the application may use structured logging, emitting a JSON document. The document may contain a lot of additional metadata or context data, but what Drain is interested in is the message; for this reason it has to be able to extract the message from the document given a path.
Usually log events are serialized as a single-line JSON document, see logstash-logback-encoder for example. So when parsing JSON, events will be assumed to be single-line. However, the message itself may be multiline (new lines are likely encoded as \n).
So the only work to do is to pre-process each line as JSON and extract the message field.
#5 introduced an interesting feature: looking up the cluster of a given log message.
The method findLogMessage will only look for an existing log cluster. This might be interesting to build on for a search feature.
Thanks to @TodorKrIv for the idea and initial implementation.
Gunnar Morling started a project to assert on runtime characteristics using JFR events: https://github.com/gunnarmorling/jfrunit
repositories {
    jcenter()
+   maven { setUrl("https://jitpack.io") }
}
dependencies {
+   testImplementation("com.github.gunnarmorling:jfrunit:main-SNAPSHOT")
}
Hello, did you port https://github.com/IBM/Drain3 or the original logpai implementation?
In the README you mention the IBM guys, but I see code mismatches (maybe because the IBM project is actively updated).
Currently the code is only able to process single line log messages. However it's possible to have multiline log messages.
Scope
In particular, this ticket is about handling stack traces, whose lines usually start with whitespace. I am not familiar with stack traces in other languages, so the goal of this ticket is to focus on Java stack traces that may appear in a log trail.
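One possible way to fold stack trace lines into the event that produced them is sketched below. Note that it swaps the leading-whitespace check for a timestamp-prefix check: in a Java stack trace the exception header line and the "Caused by:" lines are not indented, so a whitespace-only rule would miss them. `MultilineJoiner` and the assumed "yyyy-MM-dd HH:mm:ss" event prefix are illustrative choices, not existing drain-java code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of drain-java): folds continuation lines into their log event. */
final class MultilineJoiner {
    // Assumption: every new log event starts with a "yyyy-MM-dd HH:mm:ss" timestamp.
    private static final Pattern EVENT_START =
            Pattern.compile("\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2}");

    /** Groups raw lines into events; lines without a timestamp prefix extend the previous event. */
    static List<String> join(List<String> lines) {
        List<String> events = new ArrayList<>();
        StringBuilder current = null;
        for (String line : lines) {
            boolean startsEvent = EVENT_START.matcher(line).lookingAt();
            if (startsEvent || current == null) {
                // Flush the previous event, if any, and start a new one.
                if (current != null) {
                    events.add(current.toString());
                }
                current = new StringBuilder(line);
            } else {
                // Stack trace line ("\tat ...", "Caused by: ...", exception header).
                current.append('\n').append(line);
            }
        }
        if (current != null) {
            events.add(current.toString());
        }
        return events;
    }
}
```

A streaming variant would need to hold back the current event until the next event start (or a timeout) is seen, since the last stack trace line cannot be recognized on its own.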
Out of scope
Currently the code uses really simple tricks to mask some elements of a log event, e.g. by stripping the date component.
However, there are other dynamic log components that may be worth masking: IPs, UUIDs, etc.
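A minimal sketch of what such masking could look like, assuming regex-based rules applied to the message before it reaches the miner. `LogMasker`, the `<UUID>` and `<IP>` placeholders, and the two patterns are made up for illustration; the IPv4 pattern is deliberately loose and does not validate octet ranges.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of drain-java): replaces dynamic tokens with placeholders. */
final class LogMasker {
    // Rules are applied in insertion order; placeholders are arbitrary illustrative names.
    private static final Map<Pattern, String> RULES = new LinkedHashMap<>();

    static {
        RULES.put(Pattern.compile(
                "\\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\\b"),
                "<UUID>");
        // Loose IPv4 match: four dot-separated groups of 1 to 3 digits.
        RULES.put(Pattern.compile("\\b(?:\\d{1,3}\\.){3}\\d{1,3}\\b"), "<IP>");
    }

    /** Returns the message with every match of every rule replaced by its placeholder. */
    static String mask(String message) {
        String masked = message;
        for (Map.Entry<Pattern, String> rule : RULES.entrySet()) {
            masked = rule.getKey().matcher(masked).replaceAll(rule.getValue());
        }
        return masked;
    }
}
```

Masking before mining keeps tokens like IPs from fragmenting otherwise identical messages into separate clusters.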
This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.
This repository currently has no open or pending branches.
.github/workflows/gradle.yml
actions/checkout v4
gradle/actions v3
actions/checkout v4
actions/setup-java v4
gradle/actions v3
actions/upload-artifact v4
actions/download-artifact v4
mikepenz/action-junit-report v4
actions/checkout v4
actions/setup-java v4
gradle/actions v3
settings.gradle.kts
build.gradle.kts
drain-java-bom/build.gradle.kts
drain-java-core/build.gradle.kts
drain-java-jackson/build.gradle.kts
gradle/libs.versions.toml
com.google.code.findbugs:jsr305 3.0.2
info.picocli:picocli 4.7.6
info.picocli:picocli-codegen 4.7.6
com.fasterxml.jackson.core:jackson-core 2.17.2
com.fasterxml.jackson.core:jackson-annotations 2.17.2
com.fasterxml.jackson.core:jackson-databind 2.17.2
org.assertj:assertj-core 3.26.3
org.junit.jupiter:junit-jupiter-api 5.10.3
org.junit.jupiter:junit-jupiter-engine 5.10.3
de.undercouch.download 5.6.0
com.github.johnrengelman.shadow 8.1.1
com.github.ben-manes.versions 0.51.0
com.github.hierynomus.license 0.16.1
com.github.vlsi.gradle-extensions 1.90
nebula.release 19.0.10
tailer/build.gradle.kts
gradle/wrapper/gradle-wrapper.properties
gradle 8.9
Currently I only offered hints in the README, but I definitely need to spend some time on the documentation of