Brain

Brain: Log Parsing with Bidirectional Parallel Tree

IEEE Transactions on Services Computing

Notice: If you encounter any issues when using Brain or need technical support, please don't hesitate to contact me. Brain is efficient and effective on large datasets, and the simplified version of Brain can be even more efficient. I'll update the code when I have time, and I will respond to reported issues more promptly.

LOGPAI

- Code of Brain in LOGPAI

ABSTRACT

Automated log analysis can facilitate failure diagnosis for developers and operators using a large volume of logs. Log parsing is a prerequisite step for automated log analysis, which parses semi-structured logs into structured logs. However, existing parsers are difficult to apply to software-intensive systems, due to their unstable parsing accuracy on various software. Although neural network-based approaches are stable, their inefficiency makes it challenging to keep up with the speed of log production. In this work, we found that template words of each log will have the same and highest frequency if different logging statements do not generate the identical constant and variable. Inspired by this key insight, we propose a bidirectional tree structure whose two directions are used to distinguish the identical constants and variables generated from different logging statements, respectively. The nodes of the generated final tree contain the classification of each word. Experimental results on 16 benchmark datasets show that our approach outperforms the state-of-the-art parsers on two widely-used parsing accuracy metrics, and it only takes around 46 seconds to process one million lines of logs.
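The key insight in the abstract (template words of a log share the same, highest frequency) can be illustrated with a small sketch. This is only a simplified illustration under stated assumptions, not the bidirectional parallel tree itself: the function name `highest_frequency_words`, whitespace tokenization, counting by (position, word) pairs, and the `<*>` placeholder are all choices made here for illustration and are not taken from the Brain code.

```python
from collections import Counter


def highest_frequency_words(logs):
    """Keep the words of each log that share the highest corpus frequency.

    Hypothetical helper for illustration only; it is not part of Brain.
    """
    tokenized = [line.split() for line in logs]
    # Count how often each (position, word) pair occurs across all logs.
    freq = Counter(
        (pos, tok) for tokens in tokenized for pos, tok in enumerate(tokens)
    )
    result = []
    for tokens in tokenized:
        counts = [freq[(pos, tok)] for pos, tok in enumerate(tokens)]
        top = max(counts, default=0)
        # Words at the highest frequency are treated as constants;
        # everything else is replaced by a variable placeholder.
        result.append(
            [tok if c == top else "<*>" for tok, c in zip(tokens, counts)]
        )
    return result


logs = [
    "Connection from 10.0.0.1 closed",
    "Connection from 10.0.0.2 closed",
    "Connection from 10.0.0.3 closed",
]
templates = highest_frequency_words(logs)
for template in templates:
    print(" ".join(template))
```

In this toy input the three constant words each occur three times while every IP address occurs once, so the constants stand out as the highest-frequency group. The sketch does not handle the harder case the abstract mentions, where different logging statements produce identical constants or variables.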

Requirements

1. pip install -r requirements.txt

Reproduce

1. python evaluate.py

Parsing results will be saved in Parseresult/

Results

img.png

Docker images:

1. docker pull docker.io/gaiusyu/brain:v2
2. docker run -it --name brain gaiusyu/brain:v2

Experimental data is saved in ExperimentalData.docx

Citation

@article{yu2023brain,
  title={Brain: Log Parsing with Bidirectional Parallel Tree},
  author={Yu, Siyu and He, Pinjia and Chen, Ningjiang and Wu, Yifan},
  journal={IEEE Transactions on Services Computing},
  year={2023},
  publisher={IEEE}
}

brain's Issues

MemoryError

I used Drain in logpai to parse the Thunderbird dataset (29.8 GB) without getting a MemoryError, but I did get a MemoryError when I parsed the split Thunderbird dataset (2.92 GB) using Brain in logpai. When I parse a 1000 MB Thunderbird dataset there is no MemoryError. Why is that? Can Brain only parse datasets around 1 GB in size?
Traceback (most recent call last):
File "E:\logbert-main\TBird\data_process.py", line 137, in
parse_log(data_dir, output_dir, log_file, parser_type)
File "E:\logbert-main\TBird\data_process.py", line 77, in parse_log
parser.parse(log_file)
File "E:\logbert-main\TBird..\logparser\Brain.py", line 58, in parse
group_len, tuple_vector, frequency_vector = self.get_frequecy_vector(
File "E:\logbert-main\TBird..\logparser\Brain.py", line 261, in get_frequecy_vector
set.setdefault(str(lenth), []).append(token)
MemoryError
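Since the question above reports that a roughly 1000 MB portion parsed without a MemoryError, one possible workaround is to feed the parser smaller pieces of the file. The following is a minimal sketch of splitting any stream of lines into fixed-size chunks without loading the whole file. The helper name `iter_line_chunks` and the chunk size are hypothetical and not part of Brain or logparser, and parsing chunks independently changes the word frequencies each chunk sees, so the resulting templates may differ from a whole-file parse.

```python
from itertools import islice


def iter_line_chunks(lines, chunk_size):
    """Yield lists of at most chunk_size lines from any iterable of lines.

    Hypothetical helper for illustration; it only ever holds one chunk
    in memory at a time.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    it = iter(lines)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk


# Small in-memory demonstration; a file object opened with open(path)
# can be passed in the same way because it is also an iterable of lines.
chunks = list(iter_line_chunks(["a", "b", "c", "d", "e"], 2))
print(chunks)
```

Each chunk could then be written to its own file and parsed separately.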

Group_accuracy

When I run your code, the group accuracy result is equal to the accuracy result, which cannot be right, and it does not match the results in your paper.
Could you please help with this?

During handling of the above exception, another exception occurred: MemoryError

I'm running out of memory when parsing the Thunderbird dataset (29.8 GB) with logpai's Brain. Is there a way to solve this?

Traceback (most recent call last):
File "..\logparser\Brain.py", line 189, in tuple_generate
result = number.most_common()
File "C:\Users\3730.conda\envs\logbert\lib\collections\__init__.py", line 610, in most_common
