
research_malwarediscover's Introduction

Research_MalwareDiscover

The project has two parts: (1) pre-processing-script and (2) online-query.

Part (1) must be run before part (2) can be used for searching.

Github:
The repository contains three main items: the online-query framework (written in Java), the pre-processing-script, and batch-import (for neo4j).

(1) pre-processing-script

Notice: the folder directory structure is as follows

Step1 :

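  • Run processReal.sh

    • It contains three steps
      • Notice: check the file read/write paths inside each script
      • /home/arvin/Dropbox/Research/GTgraph-master/R-MAT /processRealDataProgram/RealData/mini_edge/'+str(self.partID)+'_part'
        • Change the part marked in red to the main directory of your execution environment
    • python3 'pg_mem&disk_loadToDB.py' mrMiniEdge.csv
      • pg_mem&disk_loadToDB.py reads mrMiniEdge.csv (the edges of the malware graph) and loads it into postgresql. When running it by hand, quote the filename, because & is a shell operator
    • python3 pg_disk_partition.py minitargetList.txt mrMiniEdge.csv 3 > order
      • pg_disk_partition.py reads minitargetList.txt (the list of all targets; in my case, all the CVEs in my_reputation), expands 3 levels outward from the targets, and cuts the result out as a subgraph. (The order file only stores some output produced along the way; not important.)
    • python3 pg_disk_remain_partition.py 3
      • pg_disk_remain_partition.py groups the nodes left over after the partitioning in the second step into multiple subgraphs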
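
The partitioning idea in Step1 can be illustrated with a small in-memory sketch. It is not the code in pg_disk_partition.py or pg_disk_remain_partition.py, which work against postgresql; it assumes each CSV row is a `source,target` pair, treats edges as undirected, and guesses that the leftover nodes are grouped by the same depth-limited expansion. The function names `load_edges`, `expand` and `group_remaining` are hypothetical.

```python
import csv
import io
from collections import defaultdict, deque


def load_edges(csv_text):
    """Parse 'source,target' rows (assumed format) into an undirected adjacency map."""
    adjacency = defaultdict(set)
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        src, dst = row[0].strip(), row[1].strip()
        adjacency[src].add(dst)
        adjacency[dst].add(src)
    return adjacency


def expand(adjacency, seeds, depth, blocked=frozenset()):
    """Breadth-first search: every node within `depth` hops of any seed,
    never stepping onto a node listed in `blocked`."""
    seen = set(seeds)
    frontier = deque((seed, 0) for seed in seeds)
    while frontier:
        node, dist = frontier.popleft()
        if dist == depth:
            continue  # do not expand past the requested number of levels
        for nxt in adjacency.get(node, ()):
            if nxt not in seen and nxt not in blocked:
                seen.add(nxt)
                frontier.append((nxt, dist + 1))
    return seen


def group_remaining(adjacency, assigned, depth):
    """Group nodes outside `assigned` by repeated depth-limited expansion
    (an assumed strategy, not taken from the original scripts)."""
    assigned = set(assigned)
    groups = []
    for node in sorted(adjacency):
        if node in assigned:
            continue
        group = expand(adjacency, [node], depth, blocked=assigned)
        assigned |= group
        groups.append(group)
    return groups
```

With the targets read from minitargetList.txt as seeds, `expand(adjacency, targets, 3)` would correspond to the target subgraph, and `group_remaining(adjacency, that_subgraph, 3)` to the groups built from the leftover nodes.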

Step2 :

  • Run cleanDupdata.py
    • python3 cleanDupdata.py "filepath" checkType
      • ex. python3 cleanDupdata.py RealData/mini_node node
      • python3 cleanDupdata.py RealData/mini_edge edge
    • Run it once for node and once for edge so that duplicate nodes and edges are removed (otherwise the import into neo4j runs into problems)
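
As a rough illustration of the de-duplication in Step2, the sketch below rewrites every file in a directory without repeated lines. It is not the code in cleanDupdata.py: it assumes one record per line, compares whole lines, works file by file, and does not model the node/edge checkType argument. The names `dedupe_lines` and `clean_directory` are hypothetical.

```python
from pathlib import Path


def dedupe_lines(lines):
    """Drop repeated lines, keeping the first occurrence and the original order."""
    seen = set()
    unique = []
    for line in lines:
        if line not in seen:
            seen.add(line)
            unique.append(line)
    return unique


def clean_directory(directory):
    """Rewrite each file in `directory` without duplicate lines.

    Returns the total number of lines removed across all files.
    """
    removed = 0
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        lines = path.read_text(encoding="utf-8").splitlines()
        unique = dedupe_lines(lines)
        removed += len(lines) - len(unique)
        path.write_text("".join(line + "\n" for line in unique), encoding="utf-8")
    return removed
```

Calling `clean_directory("RealData/mini_node")` and then `clean_directory("RealData/mini_edge")` mirrors the two invocations listed above.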

Step3 :

  • Create two folders: real-importScript & real-target
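
If you prefer to create the two folders from Python, a minimal sketch could look like this. The helper name `create_output_dirs` and the `base_dir` parameter are assumptions; the folder names come from the step above.

```python
from pathlib import Path


def create_output_dirs(base_dir="."):
    """Create real-importScript and real-target under base_dir if they are missing."""
    created = []
    for name in ("real-importScript", "real-target"):
        path = Path(base_dir) / name
        path.mkdir(parents=True, exist_ok=True)  # no error if it already exists
        created.append(path)
    return created
```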

Step4 :

  • Run real-createImportScript.py
    • This script generates, for each subgraph, the script that batch-import needs, and puts it into the real-importScript folder
    • batch-import is a tool for importing data into neo4j in batch
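
The idea of generating one import script per subgraph might be sketched as follows. This is not the code in real-createImportScript.py. It assumes that each subgraph is stored as a `<partID>_part` file (the naming seen in the Step1 path), that the generated scripts are shell scripts named `import_<partID>.sh`, and that the actual batch-import command line is supplied by the caller as `command_template`; all three are assumptions for illustration.

```python
from pathlib import Path


def write_import_scripts(edge_dir, script_dir, command_template):
    """Write one shell script per '<partID>_part' file found in edge_dir.

    command_template is a caller-supplied string containing '{part}'; it stands
    in for the real batch-import invocation, which is not specified here.
    """
    script_dir = Path(script_dir)
    script_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for part_file in sorted(Path(edge_dir).glob("*_part")):
        part_id = part_file.name[: -len("_part")]
        script = script_dir / f"import_{part_id}.sh"  # hypothetical naming
        script.write_text(
            "#!/bin/sh\n" + command_template.format(part=part_id) + "\n",
            encoding="utf-8",
        )
        written.append(script)
    return written
```

Passing the command line as a template keeps the sketch independent of the exact batch-import version in use; substitute the invocation your copy of batch-import expects.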

Step5:

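  • Check that the batch-import folder downloaded from github contains the following directories; create any that are missing

    • mini-real-target
    • ResearchRealData
    • ResearchRealData/mini_Real
  • Copy real-importScript into the batch-import project folder

  • Put importData.py into the batch-import project folder

  • Copy RealData/mini_edge and RealData/mini_node from Step1 into batch-import/ResearchRealData/mini_Real/

  • Change to the batch-import project folder

  • Run importData.py

    • python3 importData.py &
      • This step runs the import scripts in batch and imports the data into neo4j (a .db folder is created in mini-real-target). It takes a while to run.

These five steps complete the pre-processing-script: they convert the myreputation data and import it into neo4j.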
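
The directory preparation and batch run in Step5 could be sketched as below. This is not the code in importData.py. It assumes the import scripts are shell scripts that can be run with `sh` from the batch-import folder, and it needs Python 3.8+ for `dirs_exist_ok`. The function names and the injectable `runner` parameter are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path

# Directories that Step5 expects inside the batch-import folder.
REQUIRED_DIRS = ("mini-real-target", "ResearchRealData", "ResearchRealData/mini_Real")


def prepare_batch_import(batch_import_dir, script_dir, real_data_dir):
    """Create the required directories and copy scripts and data into place."""
    root = Path(batch_import_dir)
    for rel in REQUIRED_DIRS:
        (root / rel).mkdir(parents=True, exist_ok=True)
    shutil.copytree(script_dir, root / "real-importScript", dirs_exist_ok=True)
    for name in ("mini_edge", "mini_node"):
        shutil.copytree(
            Path(real_data_dir) / name,
            root / "ResearchRealData" / "mini_Real" / name,
            dirs_exist_ok=True,
        )


def run_import_scripts(batch_import_dir, runner=subprocess.run):
    """Run every script in real-importScript one after another.

    Assumes shell scripts; `runner` is injectable so the loop can be dry-run.
    """
    root = Path(batch_import_dir)
    results = []
    for script in sorted((root / "real-importScript").iterdir()):
        if script.is_file():
            results.append(runner(["sh", str(script)], cwd=root, check=True))
    return results
```

Running the scripts sequentially with `check=True` stops at the first failing import, which makes it easier to see which subgraph caused a problem.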

(2) Online-query

research_malwarediscover's People

Contributors

arvinh
