Brain: Log Parsing with Bidirectional Parallel Tree

brain's Introduction

Brain

IEEE Transactions on Services Computing

Notice: If you encounter any issues when using Brain or need technical support, please don't hesitate to contact me. Brain is efficient and effective on large datasets, and the simplified version of Brain can be even more efficient. I'll update the code when I have time, and I will try to respond to the issues you report more promptly.

LOGPAI

-Code of Brain in LOGPAI

ABSTRACT

Automated log analysis can facilitate failure diagnosis for developers and operators using a large volume of logs. Log parsing is a prerequisite step for automated log analysis, which parses semi-structured logs into structured logs. However, existing parsers are difficult to apply to software-intensive systems, due to their unstable parsing accuracy on various software. Although neural network-based approaches are stable, their inefficiency makes it challenging to keep up with the speed of log production. In this work, we found that template words of each log will have the same and highest frequency if different logging statements do not generate the identical constant and variable. Inspired by this key insight, we propose a bidirectional tree structure whose two directions are used to distinguish the identical constants and variables generated from different logging statements, respectively. The nodes of the generated final tree contain the classification of each word. Experimental results on 16 benchmark datasets show that our approach outperforms the state-of-the-art parsers on two widely-used parsing accuracy metrics, and it only takes around 46 seconds to process one million lines of logs.
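The frequency insight in the abstract can be illustrated with a minimal sketch. This is not the code from this repository: the function names `word_frequencies` and `guess_template`, the `<*>` placeholder, and the choice to key counts on (log length, position, token) are all assumptions made for illustration. It also uses the simplest possible rule (keep the tokens that share the highest frequency in a line) and does not build the bidirectional tree described in the paper.

```python
from collections import Counter


def word_frequencies(logs):
    """Count each (log length, position, token) triple across all logs.

    Keying on length and position is an assumption of this sketch, so that
    only words in the same column of same-length logs are compared.
    """
    counts = Counter()
    for line in logs:
        tokens = line.split()
        for pos, tok in enumerate(tokens):
            counts[(len(tokens), pos, tok)] += 1
    return counts


def guess_template(line, counts):
    """Keep tokens that share the highest frequency in the line; mask the rest."""
    tokens = line.split()
    freqs = [counts[(len(tokens), pos, tok)] for pos, tok in enumerate(tokens)]
    top = max(freqs)
    return " ".join(tok if f == top else "<*>" for tok, f in zip(tokens, freqs))


# Hypothetical log lines, invented for this example.
logs = [
    "Connection from 10.0.0.1 closed",
    "Connection from 10.0.0.2 closed",
    "Connection from 10.0.0.3 closed",
]
counts = word_frequencies(logs)
print(guess_template(logs[0], counts))  # Connection from <*> closed
```

In this toy input the three constant words each occur three times, while every IP address occurs once, so the constants stand out as the tokens with the same, highest frequency. The abstract also explains why this simple rule is not enough on its own: different logging statements can produce identical constants or variables, which is the case the two directions of the tree are meant to handle.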

Requirements

1. pip install -r requirements.txt

Reproduce

1. python evaluate.py

Parsing results will be saved in Parseresult/

Results

Docker images:

1. docker pull docker.io/gaiusyu/brain:v2
2. docker run -it --name brain gaiusyu/brain:v2

Experimental data is saved in ExperimentalData.docx

Citation

@article{yu2023brain,
  title={Brain: Log Parsing with Bidirectional Parallel Tree},
  author={Yu, Siyu and He, Pinjia and Chen, Ningjiang and Wu, Yifan},
  journal={IEEE Transactions on Services Computing},
  year={2023},
  publisher={IEEE}
}

brain's Issues

MemoryError

I used Brain in LOGPAI to parse the Thunderbird dataset (29.8 GB) without getting a MemoryError, but I did get a MemoryError when I parsed the split Thunderbird dataset (2.92 GB) using Brain in LOGPAI. When I parse a 1000 MB Thunderbird dataset, there is no MemoryError. Why is that? Can Brain only parse datasets around 1 GB in size?
Traceback (most recent call last):
  File "E:\logbert-main\TBird\data_process.py", line 137, in
    parse_log(data_dir, output_dir, log_file, parser_type)
  File "E:\logbert-main\TBird\data_process.py", line 77, in parse_log
    parser.parse(log_file)
  File "E:\logbert-main\TBird..\logparser\Brain.py", line 58, in parse
    group_len, tuple_vector, frequency_vector = self.get_frequecy_vector(
  File "E:\logbert-main\TBird..\logparser\Brain.py", line 261, in get_frequecy_vector
    set.setdefault(str(lenth), []).append(token)
MemoryError
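
Since the report above says that a file of roughly 1 GB parses without error, one possible workaround is to feed the parser bounded chunks of lines instead of the whole file. The sketch below only shows the chunking step. The helper name `iter_line_chunks` and the default chunk size are assumptions, and it does not call Brain. Note that parsing chunks independently may produce different templates than parsing the full file at once, because word frequencies are then computed per chunk.

```python
from itertools import islice


def iter_line_chunks(path, lines_per_chunk=1_000_000, encoding="utf-8"):
    """Yield lists of at most `lines_per_chunk` lines from a text file.

    Only one chunk is held in memory at a time; undecodable bytes are
    replaced rather than raising, which is a choice made for this sketch.
    """
    with open(path, "r", encoding=encoding, errors="replace") as handle:
        while True:
            chunk = list(islice(handle, lines_per_chunk))
            if not chunk:
                break
            yield chunk
```

Each yielded chunk could then be written to its own file and parsed separately, with the chunk size tuned to the memory available on the machine.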

Group_accuracy

When I run your code, the group accuracy is equal to the accuracy result, which does not seem right, and it does not match the results in your paper.
Could you please help with this?
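
For reference when comparing the two numbers, here is a minimal sketch of group accuracy as it is commonly defined for log parsers: a log line counts as correct only if its predicted group contains exactly the same lines as a ground-truth group. This is an assumption about the intended metric, not the evaluation code from this repository, and the function name `group_accuracy` and the sample labels are hypothetical.

```python
from collections import defaultdict


def group_accuracy(truth_ids, parsed_ids):
    """Fraction of lines whose parsed group exactly matches a ground-truth group."""
    truth_groups = defaultdict(set)
    parsed_groups = defaultdict(set)
    for index, (truth, parsed) in enumerate(zip(truth_ids, parsed_ids)):
        truth_groups[truth].add(index)
        parsed_groups[parsed].add(index)

    # A parsed group is correct only if the same set of line indices
    # also forms a group in the ground truth.
    truth_sets = {frozenset(members) for members in truth_groups.values()}
    correct = sum(
        len(members)
        for members in parsed_groups.values()
        if frozenset(members) in truth_sets
    )
    return correct / len(truth_ids) if truth_ids else 0.0


# Hypothetical labels: the parser splits ground-truth group "B" in two.
truth = ["A", "A", "B", "B", "C"]
parsed = ["x", "x", "y", "z", "w"]
print(group_accuracy(truth, parsed))  # 0.6
```

Under this definition, a single split or merged group makes all of its lines incorrect, so group accuracy is generally expected to differ from a metric that scores each line's template on its own.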

During handling of the above exception, another exception occurred: MemoryError

I'm running out of memory when I run the Thunderbird dataset (29.8 GB) with LOGPAI's Brain. Is there a way to solve this?

Traceback (most recent call last):
File "..\logparser\Brain.py", line 189, in tuple_generate
result = number.most_common()
File "C:\Users\3730.conda\envs\logbert\lib\collections_init_.py", line 610, in most_common

Invitation to contribute to LOGPAI

Hi, this is an invitation from LOGPAI, which is an open-source project for building log analytics solutions powered by AI.

Would you like to contribute your project to LOGPAI?
https://github.com/logpai#-call-for-contributions

Comparing objects of different types (int == tuple)

https://github.com/gaiusyu/Brain/blob/4b9151d16583cf5b558aa6f251eb67cfd2a3b546/Brain/Brain.py#L197C39-L197C39

type conflicts?

In the code:
https://github.com/gaiusyu/Brain/blob/main/Brain/Brain.py#L197
i.e.,
if father[0] == root_set_detail[key][i][k]:

However, father[0] is of type int, whereas root_set_detail[key][i][k] is of type tuple.
So how can these two values be compared meaningfully?
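
As a small illustration of the concern raised here: in Python, comparing an int with a tuple using == does not raise an error, it simply evaluates to False. The values below are hypothetical stand-ins and are not taken from Brain.py; the real contents of father and root_set_detail are not shown in this issue.

```python
# Hypothetical stand-ins for the two operands in the comparison.
father = (3, "open")      # father[0] is an int
candidate = (3, "open")   # plays the role of root_set_detail[key][i][k]

# int == tuple is allowed in Python and is always False.
print(father[0] == candidate)     # False

# Comparing like with like gives the result a reader would expect.
print(father[0] == candidate[0])  # True
print(father == candidate)        # True
```

If the operands really have these types, the condition at that line would never be true, so it may be worth checking whether an index was intended on the right-hand side or whether the whole tuples should be compared.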
