
lishiyucn / codebert-based-webshell-detection

This project forked from lyccol/codebert-based-webshell-detection


Webshell classification using CodeBERT

Python 100.00%

codebert-based-webshell-detection's Introduction

News: once I have found an internship, I will tidy up the code so that it is easier for everyone to use.

CodeBERT-based-webshell-detection

BERT : https://arxiv.org/abs/1810.04805

CodeBERT : sxjscience/CodeBERT (github.com)

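We use CodeBERT as the pretrained model for the encoder part and TextCNN as the decoder part, and fine-tune the combination for binary classification of PHP code. Webshells are the black (malicious) samples; everything else is a white (benign) sample.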
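The encoder-plus-TextCNN arrangement described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the class names `TextCNNHead` and `WebshellClassifier`, the helper `load_codebert`, the kernel sizes, filter count, and dropout rate are all assumptions, and so is the checkpoint name `microsoft/codebert-base`. The TextCNN here plays the role the README calls the decoder, that is, a classification head over the encoder's per-token outputs.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TextCNNHead(nn.Module):
    """TextCNN classifier over per-token encoder outputs (hyperparameters are assumed)."""

    def __init__(self, hidden_size=768, num_filters=128,
                 kernel_sizes=(2, 3, 4), num_classes=2, dropout=0.1):
        super().__init__()
        # One 1-D convolution per kernel size, sliding along the token axis.
        self.convs = nn.ModuleList(
            [nn.Conv1d(hidden_size, num_filters, k) for k in kernel_sizes]
        )
        self.dropout = nn.Dropout(dropout)
        self.fc = nn.Linear(num_filters * len(kernel_sizes), num_classes)

    def forward(self, hidden_states):
        # hidden_states: (batch, seq_len, hidden_size)
        x = hidden_states.transpose(1, 2)  # Conv1d expects (batch, channels, seq_len)
        # Max-over-time pooling of each feature map, then concatenate.
        pooled = [F.relu(conv(x)).max(dim=2).values for conv in self.convs]
        features = torch.cat(pooled, dim=1)
        return self.fc(self.dropout(features))  # (batch, num_classes) logits


class WebshellClassifier(nn.Module):
    """Pretrained encoder followed by the TextCNN head (hypothetical wrapper)."""

    def __init__(self, encoder, head):
        super().__init__()
        self.encoder = encoder
        self.head = head

    def forward(self, input_ids, attention_mask):
        outputs = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        return self.head(outputs[0])  # outputs[0] is the last hidden state


def load_codebert(name="microsoft/codebert-base"):
    """Load tokenizer and encoder; the checkpoint name is an assumption."""
    from transformers import AutoModel, AutoTokenizer
    return AutoTokenizer.from_pretrained(name), AutoModel.from_pretrained(name)
```

A model would then be assembled as `WebshellClassifier(encoder, TextCNNHead())`, with the two output logits standing for white and black. Inputs need at least as many tokens as the largest kernel size.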

Code Documentation Generation

Dependency

  • pip install torch==1.7.1
  • pip install transformers==4.0.1
  • pip install filelock more_itertools

Dataset

The black samples in the dataset come from the open-source GitHub project tennc/webshell: This is a webshell open source project (github.com)

and from the dataset of the paper Webshell Detection Based on the Word Attention Mechanism: leett1/Programe (github.com)

After extracting the PHP files and removing duplicates, just over 2,000 remain.
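The README does not say how the duplicates were removed. One plausible approach, sketched here under that assumption, is to hash each file's bytes and keep one path per distinct digest. The function name `unique_php_files` is hypothetical.

```python
import hashlib
from pathlib import Path


def unique_php_files(root):
    """Return one path per distinct file content among *.php files under root."""
    seen = {}
    for path in sorted(Path(root).rglob("*.php")):
        if not path.is_file():
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        # Keep the first path seen for each digest; later byte-identical copies are skipped.
        seen.setdefault(digest, path)
    return list(seen.values())
```

Exact hashing only catches byte-identical copies. Webshell variants that differ by whitespace or a renamed variable would still count as separate samples.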

There are about 8,000 white samples.

Because the dataset quality is not high, the accuracy figures below are for reference only.

Accuracy Score = 1

F1 Score (Micro) = 1

F1 Score (Macro) = 1
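For reference, the three metrics above can be computed as in the sketch below. The function name `classification_scores` and the label encoding (0 for white, 1 for black) are assumptions made for illustration; the repository's own evaluation code is not shown in this README.

```python
def classification_scores(y_true, y_pred, labels=(0, 1)):
    """Return (accuracy, micro_f1, macro_f1) for single-label predictions."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    per_class_f1 = []
    tp_sum = fp_sum = fn_sum = 0
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        per_class_f1.append(2 * tp / denom if denom else 0.0)
        tp_sum, fp_sum, fn_sum = tp_sum + tp, fp_sum + fp, fn_sum + fn

    micro_denom = 2 * tp_sum + fp_sum + fn_sum
    micro_f1 = 2 * tp_sum / micro_denom if micro_denom else 0.0  # pooled counts
    macro_f1 = sum(per_class_f1) / len(per_class_f1)             # unweighted class mean
    return accuracy, micro_f1, macro_f1
```

With a class ratio like the one described (roughly one black sample for every four white ones), macro F1 is the more telling number. In a toy case of 8 white and 2 black samples, a model that labels everything white gets accuracy and micro F1 of 0.8 but a macro F1 of about 0.44. In single-label classification, micro F1 always equals accuracy.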

The dataset and model are too large, so they have not been uploaded. For questions, contact [email protected]

codebert-based-webshell-detection's People

Contributors

lyccol
