mover's Introduction

mover

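An extractive alert-summarization framework built on pre-trained models. It refines the summary by combining the predictions of a supervised model with an unsupervised log-template extraction method.

It provides convenient training, evaluation, and service deployment. The model supports incremental learning and can be trained or used for prediction on either GPU or CPU.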

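Usage

Python version >= 3.7

When installing the deep learning framework, choose the GPU build (matching your CUDA version) or the CPU build as appropriate.

  1. Go to the project root directory. The layout should look like this: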

    mover
    ├── conf.d
    ├── mover
    ├── static
    ├── README.md
    └── requirements.txt

  2. Run the following command to install the dependencies:

    pip3 install -r requirements.txt

  3. Enter the code directory (**cd mover**). Its layout is as follows:

    mover
    ├── action
    ├── config
    ├── model
    ├── tools
    ├── init.py
    └── main.py

  4. Point the program at the configuration file you need, then run the desired mode. The options are listed by **python3 main.py --help**:

    usage: main.py [-h] --config CONFIG [--check_dir CHECK_DIR]
                [--mode {service,train,eval}]
    
    Mover command-line arguments
    
    optional arguments:
    -h, --help            show this help message and exit
    
    --config CONFIG, -f CONFIG      path of the configuration file to load, in JSON format
    
    --check_dir CHECK_DIR, -c CHECK_DIR      path of the model parameter files to load
    
    --mode {service,train,eval}, -m {service,train,eval}    run mode
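
The help text above can be reproduced with a small `argparse` parser. The following is a minimal sketch that mirrors the listed options only. It is not the project's actual `main.py`; the function name `build_parser` is hypothetical, and no default is set for `--mode` because the help text does not state one.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the options shown in the help text (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="main.py", description="Mover command-line arguments"
    )
    # --config is the only option shown without brackets in the usage line, so it is required.
    parser.add_argument("--config", "-f", required=True,
                        help="path of the configuration file to load, in JSON format")
    parser.add_argument("--check_dir", "-c",
                        help="path of the model parameter files to load")
    parser.add_argument("--mode", "-m", choices=["service", "train", "eval"],
                        help="run mode")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["-f", "../conf.d/ernie.json", "-m", "train"])
print(args.config, args.mode, args.check_dir)
```

Passing a value outside `service`, `train`, or `eval` to `-m` makes `argparse` exit with a usage error, which matches the choice list printed in the help output.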

Run modes

service (web service)

  • Required parameter: check_dir

  • Reference configuration

    {
      "model_name": "ernie",
      "check_dir": "/root/bowaer/logtree/bert-train/model/mark_ernie_seg/",
      "use_gpu": true
    }
  • Example

    python3 main.py -f ../conf.d/ernie.json -m service

train (model training)

  • Required parameters: train_data_dir, eval_data_dir

  • Reference configuration

    {
      "model_name": "ernie",
      "use_gpu": true,
      "train_data_dir": "/root/bowaer/logtree/output/train-ceb-2020_mark_seg.data",
      "eval_data_dir": "/root/bowaer/logtree/output/eval-ceb-2020_mark.data"
    }
  • Example

    python3 main.py -f ../conf.d/ernie.json -m train

eval (model evaluation)

  • Required parameters: train_data_dir, eval_data_dir

  • Reference configuration

    {
      "model_name": "ernie",
      "use_gpu": true,
      "train_data_dir": "/root/bowaer/logtree/output/train-ceb-2020_mark_seg.data",
      "eval_data_dir": "/root/bowaer/logtree/output/eval-ceb-2020_mark.data"
    }
  • Example

    python3 main.py -f ../conf.d/ernie.json -m eval
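
The required parameters of the three modes can be checked before a run starts. The sketch below assumes the configuration is a flat JSON object like the reference configurations above. The names `REQUIRED_KEYS` and `missing_keys` are hypothetical, and the check looks only at the JSON file, not at values supplied through `--check_dir` on the command line.

```python
import json

# Required keys per run mode, as listed in the mode descriptions above.
REQUIRED_KEYS = {
    "service": ["check_dir"],
    "train": ["train_data_dir", "eval_data_dir"],
    "eval": ["train_data_dir", "eval_data_dir"],
}


def missing_keys(config: dict, mode: str) -> list:
    """Return the required keys that are absent or empty for the given mode."""
    return [key for key in REQUIRED_KEYS[mode] if not config.get(key)]


# A service-style configuration with a placeholder path.
config = json.loads(
    '{"model_name": "ernie", "use_gpu": true, "check_dir": "/path/to/model/"}'
)
print(missing_keys(config, "service"))  # []
print(missing_keys(config, "train"))    # ['train_data_dir', 'eval_data_dir']
```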

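Parameter configuration

Common parameters

# Model name
MODEL_NAME = 'ernie'

# Path for loading and saving the model; defaults to the original model's installation path
CHECK_DIR = ''

# Model parameter path
MODEL_DIR = ''

# Whether to use the GPU
USE_GPU = False

# Word segmentation dictionary
WORD_DIR = f'{BASE_DIR}/../static/words.txt'

# Pre-trained model vocabulary
VOCAB_DIR = f'{BASE_DIR}/../static/vocab.txt'

# Stop-word file
STOP_WORD_DIR = f'{BASE_DIR}/../static/stop_words.txt'

Training parameters

# Training data path
TRAIN_DATA_DIR = ''

# Validation data path
EVAL_DATA_DIR = ''

# Learning rate
LEARNING_RATE = 5e-5

# Number of epochs
EPOCHS = 1

# Number of samples per training batch
BATCH_SIZE = 64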
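
The JSON configurations use lowercase keys such as `model_name`, while the defaults above are uppercase constants such as `MODEL_NAME`. One plausible way to combine them is to overlay the JSON values on the defaults by upper-casing each key. This mapping is an assumption inferred from the naming alone, not a description of how mover loads its configuration, and `DEFAULTS` and `apply_config` are hypothetical names.

```python
import json

# A subset of the defaults listed above, copied into a dict for illustration.
DEFAULTS = {
    "MODEL_NAME": "ernie",
    "CHECK_DIR": "",
    "USE_GPU": False,
    "LEARNING_RATE": 5e-5,
    "EPOCHS": 1,
    "BATCH_SIZE": 64,
}


def apply_config(defaults: dict, overrides: dict) -> dict:
    """Return a copy of defaults with the JSON keys upper-cased and applied on top."""
    merged = dict(defaults)
    for key, value in overrides.items():
        name = key.upper()
        if name not in merged:
            # Reject unknown keys so that typos in the JSON file surface early.
            raise KeyError(f"unknown parameter: {key}")
        merged[name] = value
    return merged


overrides = json.loads('{"model_name": "ernie", "use_gpu": true, "epochs": 3}')
settings = apply_config(DEFAULTS, overrides)
print(settings["USE_GPU"], settings["EPOCHS"], settings["BATCH_SIZE"])  # True 3 64
```

Values that the JSON file does not mention, such as `BATCH_SIZE` here, keep their defaults.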

mover's People

Contributors

lotcher

mover's Issues

Suggest to loosen the dependency on pkuseg

Hi, your project mover pins "pkuseg==0.0.25" in its dependencies. After analyzing the source code, we found that other versions of pkuseg, namely 0.0.21 and 0.0.22, also work without affecting your project. We therefore suggest loosening the dependency from "pkuseg==0.0.25" to "pkuseg>=0.0.21,<=0.0.25" to avoid possible conflicts when importing more packages or in downstream projects that use mover.

May I open a pull request to loosen the dependency on pkuseg?

By the way, could you tell us whether this kind of dependency analysis would help make dependencies easier to maintain during your development?

For your reference, here are the details of our analysis.

Your project mover (commit id: 69fb412) directly uses 1 API from package pkuseg.

pkuseg.__init__.pkuseg.__init__

From this API, 35 functions are called indirectly, including 12 of pkuseg's internal APIs and 23 external APIs, as follows (omitting some repeated function occurrences).

[/lotcher/mover]
+--pkuseg.__init__.pkuseg.__init__

We scanned pkuseg versions 0.0.21, 0.0.22, and 0.0.25. The changed functions (diffs listed below) have no intersection with any function or API mentioned above, whether called directly or indirectly by this project.

diff: 0.0.25(original) 0.0.21
['pkuseg.__init__.pkuseg', 'pkuseg.__init__.TrieNode', 'pkuseg.__init__.pkuseg.cut', 'pkuseg.__init__.Preprocesser.__init__', 'pkuseg.__init__.Preprocesser.solve', 'pkuseg.__init__.Preprocesser', 'pkuseg.__init__.Preprocesser.insert', 'pkuseg.__init__.TrieNode.__init__']

diff: 0.0.25(original) 0.0.22
[] (no clear difference between the source code of the two versions)

As for other packages, the APIs of @outside_package_name are called by pkuseg in the call graph, and the dependencies on these packages also stay the same in our suggested versions, thus avoiding any outside conflict.

Therefore, we believe it is quite safe to loosen your dependency on pkuseg from "pkuseg==0.0.25" to "pkuseg>=0.0.21,<=0.0.25". This will improve the applicability of mover and reduce the possibility of further dependency conflicts with other projects or packages.
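
To illustrate what the proposed range "pkuseg>=0.0.21,<=0.0.25" admits, here is a minimal sketch that compares dotted version strings as integer tuples. It is a simplification for purely numeric versions and does not implement the full version-specifier rules that pip applies; `parse_version` and `in_range` are hypothetical helpers.

```python
def parse_version(text: str) -> tuple:
    """Convert a purely numeric dotted version such as '0.0.25' into (0, 0, 25)."""
    return tuple(int(part) for part in text.split("."))


def in_range(version: str, lower: str = "0.0.21", upper: str = "0.0.25") -> bool:
    """Check lower <= version <= upper, with both bounds inclusive."""
    return parse_version(lower) <= parse_version(version) <= parse_version(upper)


for candidate in ["0.0.20", "0.0.21", "0.0.22", "0.0.25", "0.0.26"]:
    print(candidate, in_range(candidate))
```

Under the exact pin "pkuseg==0.0.25" only the fourth candidate would be accepted, whereas the range also accepts 0.0.21 and 0.0.22.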
