lchang1977 / knowledge-enhanced-attack-graph

This project forked from li-zhenyuan/knowledge-enhanced-attack-graph

AttacKG: Constructing Knowledge-enhanced Attack Graphs from Cyber Threat Intelligence Reports

License: MIT License

Python 2.63% HTML 46.66% Jupyter Notebook 50.71%

knowledge-enhanced-attack-graph's Introduction

Knowledge-enhanced-Attack-Graph

Instructions

Setup:

python 3.8

pip install -r requirements.txt

Running:

# Generating attack graph for CTI report
python main.py -M attackGraphGeneration -R "./Dataset/Evaluation/Frankenstein Campaign.txt" -O ./output.pdf
# Identifying techniques in CTI report
python main.py -M techniqueIdentification -T ./templates -R "./Dataset/Evaluation/Frankenstein Campaign.txt" -O ./output.pdf
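
To process several reports in one go, the commands above can be wrapped in a small script. The following is a minimal sketch, not part of this repository: it uses only the `-M`, `-T`, `-R`, and `-O` flags shown above, and the helper names `build_command` and `run_on_reports` are hypothetical. It assumes it is run from the repository root, where `main.py` lives.

```python
import subprocess
import sys
from pathlib import Path
from typing import List, Optional


def build_command(mode: str, report: Path, output: Path,
                  templates: Optional[Path] = None) -> List[str]:
    """Assemble a main.py invocation using only the flags shown above."""
    cmd = [sys.executable, "main.py", "-M", mode]
    if templates is not None:
        # -T is only passed in the techniqueIdentification example.
        cmd += ["-T", str(templates)]
    cmd += ["-R", str(report), "-O", str(output)]
    return cmd


def run_on_reports(report_dir: Path, out_dir: Path) -> None:
    """Generate one attack graph PDF per .txt report (hypothetical helper)."""
    out_dir.mkdir(parents=True, exist_ok=True)
    for report in sorted(report_dir.glob("*.txt")):
        output = out_dir / (report.stem + ".pdf")
        # check=True stops the batch as soon as one run fails.
        subprocess.run(
            build_command("attackGraphGeneration", report, output),
            check=True,
        )


# Example (assumed paths):
# run_on_reports(Path("./Dataset/Evaluation"), Path("./out"))
```

Passing the arguments as a list avoids shell quoting problems with report names that contain spaces, such as "Frankenstein Campaign.txt".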

Running - Archive-v0.1 (Archive-v0.1 is the experimental version without clear code structure and comments):

# Generating attack graph for CTI report
python attackGraph.py
# Identifying techniques in CTI report
python techniqueIdentifier.py
  • Sample input and output can be found here.
  • The required data model can be found here.

Paper

AttacKG: Constructing Knowledge-enhanced Attack Graphs from Cyber Threat Intelligence Reports

Cyber attacks are becoming more sophisticated and diverse, making detection increasingly challenging. To combat these attacks, security practitioners actively summarize and exchange their knowledge about attacks across organizations in the form of cyber threat intelligence (CTI) reports. However, as CTI reports written in natural language texts are not structured for automatic analysis, the report usage requires tedious efforts of manual cyber intelligence recovery. Additionally, individual reports typically cover only a limited aspect of attack patterns (techniques) and thus are insufficient to provide a comprehensive view of attacks with multiple variants.

To take advantage of threat intelligence delivered by CTI reports, we propose AttacKG to automatically extract structured attack behavior graphs from CTI reports and identify the adopted attack techniques. We then aggregate cyber intelligence across reports to collect different aspects of techniques and enhance attack behavior graphs as technique knowledge graphs (TKGs). Such TKGs with technique-level intelligence directly benefit downstream security tasks that rely on technique specifications, e.g., Advanced Persistent Threat (APT) detection.
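
The aggregation step described above can be pictured with a minimal sketch. It is an illustration only and does not reflect the data structures used in this repository: the `Observation` tuple, the function name `aggregate_techniques`, and the entity type labels are all assumptions, and the sample values other than `T1203` and `CVE-2017-11882` are placeholders.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# (technique_id, entity_type, entity_value) as extracted from one report.
# This flat representation is an assumption made for the sketch.
Observation = Tuple[str, str, str]


def aggregate_techniques(
    reports: Iterable[Iterable[Observation]],
) -> Dict[str, Dict[str, Set[str]]]:
    """Merge per-report observations into one entry per technique."""
    tkg = defaultdict(lambda: defaultdict(set))
    for observations in reports:
        for technique_id, entity_type, value in observations:
            # Sets deduplicate entities seen in more than one report.
            tkg[technique_id][entity_type].add(value)
    # Convert to plain dicts so the result is easy to inspect or serialize.
    return {tech: dict(entities) for tech, entities in tkg.items()}


report_a = [("T1203", "vulnerability", "CVE-2017-11882")]
report_b = [
    ("T1203", "vulnerability", "CVE-EXAMPLE-0001"),  # placeholder value
    ("T1203", "file", "example.doc"),                # placeholder value
]
knowledge = aggregate_techniques([report_a, report_b])
print(sorted(knowledge["T1203"]["vulnerability"]))
```

Each report contributes only a partial view of a technique; merging by technique ID is what lets the aggregated entry cover more variants than any single report.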

In our evaluation against 1,515 real-world CTI reports from diverse intelligence sources, AttacKG effectively identifies 28,262 attack techniques with 8,393 unique Indicators of Compromise (IoCs). Further, to verify AttacKG's accuracy in extracting threat intelligence, we run AttacKG on eight manually labeled CTI reports. Empirical results show that AttacKG accurately identifies attack-relevant entities, dependencies, and techniques with F1-scores of 0.895, 0.911, and 0.819, respectively, significantly outperforming state-of-the-art approaches such as EXTRACTOR (Satvat et al., 2021) and TTPDrill (Husari et al., 2017).
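
For readers unfamiliar with the metric, the F1-score is the harmonic mean of precision and recall. The sketch below shows the standard computation; the counts in the example are made up for illustration and are not taken from the paper's evaluation.

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Return the harmonic mean of precision and recall."""
    if true_pos == 0:
        # No correct predictions: define F1 as 0 to avoid dividing by zero.
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Illustrative counts only: 8 correct, 2 spurious, 2 missed.
print(round(f1_score(8, 2, 2), 3))  # 0.8
```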

System Architecture

Framework

Figure 1. Overview of AttacKG architecture.

AttacKG takes two inputs: (1) technique procedure examples from MITRE describing individual attack techniques; (2) real-world CTI reports describing end-to-end attack workflow, and provides two outputs: (1) technique templates that aggregate technique-level threat intelligence across reports; (2) attack graphs enhanced with aggregated technique intelligence.

Motivating Example

Example

Figure 2. Motivating example.

The example is a typical multi-stage attack campaign, which consists of multiple atomic techniques. To evade detection, such attacks can be morphed easily by replacing any technique with an alternative one. Therefore, it is recommended to detect and investigate cyber attacks at the technique level, which is more robust and semantically richer.

Research progress has been made to automatically extract knowledge about attacks from CTI reports. Subfigures (B) to (D) show the information retrieved from the report sample by TTPDrill, ChainSmith, and EXTRACTOR, respectively, while Subfigure (A) represents the manually generated ground truth. Subfigure (D) presents the attack graph generated by EXTRACTOR with IoCs and dependencies. Note that EXTRACTOR aggregates all non-IoC entities of the same type and thus loses the structural information of attack behaviors, making it impossible to identify techniques accurately. Subfigure (B) shows the attack techniques identified by TTPDrill with a manually defined threat ontology. As shown, TTPDrill can only extract isolated techniques from CTI reports without the whole picture. Besides, the ontologies provided by TTPDrill contain only action-object pairs for technique identification, which are too vague and lead to many false positives. As the example shows, sending a document is recognized as exfiltration by TTPDrill. However, in this scenario, the "trojanized" document is sent by an attacker for exploitation. Detailed comparisons are given in the evaluation section of the paper. As shown in Subfigure (C), ChainSmith provides a semantic layer on top of IoCs that captures the role of IoCs in a malicious campaign, but it only gives a coarse-grained four-stage classification with limited information.
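
The false positive described above can be reproduced with a toy example. This sketch is not TTPDrill's ontology or code: the single-entry `PAIR_ONTOLOGY` table, both function names, and the output labels are assumptions, chosen only to show why an action-object pair alone cannot separate the two cases.

```python
from typing import Optional

# Toy ontology with a single action-object pair (an assumption for this sketch).
PAIR_ONTOLOGY = {("send", "document"): "Exfiltration"}


def classify_pair(action: str, obj: str) -> Optional[str]:
    """Label a behavior from its action-object pair only."""
    return PAIR_ONTOLOGY.get((action, obj))


def classify_with_subject(subject: str, action: str, obj: str) -> Optional[str]:
    """Same lookup, but use the acting entity to resolve the ambiguous case."""
    label = classify_pair(action, obj)
    if label == "Exfiltration" and subject == "attacker":
        # An attacker sending a document to the victim is delivery, not data theft.
        return "Delivery of malicious document"
    return label


print(classify_pair("send", "document"))                      # Exfiltration
print(classify_with_subject("attacker", "send", "document"))  # Delivery of malicious document
```

Keeping who acts on what, rather than only the action and the object, is the kind of structural information that graph-based extraction preserves.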

Subfigure (E) illustrates the ideal result we want to extract in this paper. As long as we can identify attack techniques in attack graphs extracted from CTI reports, we are able to enrich the attack graphs with more comprehensive knowledge about the corresponding techniques. For example, we can find more possible vulnerabilities that can be used in T1203 - Exploitation for Client Execution as replacements for CVE-2017-11882, which appears in this report. Moreover, distinct threat intelligence can be collected and aggregated at the technique level across a large number of reports.
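
As a rough illustration of this enrichment, the sketch below looks up, for each technique identified in one report, the indicators known for that technique that the report itself does not mention. The function name `suggest_alternatives` and the dictionary layout are hypothetical, and the `CVE-EXAMPLE-*` entries are placeholders rather than real vulnerability identifiers.

```python
from typing import Dict, List, Set


def suggest_alternatives(
    identified: Dict[str, Set[str]],
    knowledge: Dict[str, Set[str]],
) -> Dict[str, List[str]]:
    """For each identified technique, list known indicators not seen in the report."""
    return {
        technique: sorted(knowledge.get(technique, set()) - seen)
        for technique, seen in identified.items()
    }


# Indicators found in a single report, keyed by technique ID.
identified = {"T1203": {"CVE-2017-11882"}}

# Knowledge aggregated across reports (placeholder values).
knowledge = {
    "T1203": {"CVE-2017-11882", "CVE-EXAMPLE-0001", "CVE-EXAMPLE-0002"},
}

print(suggest_alternatives(identified, knowledge))
# {'T1203': ['CVE-EXAMPLE-0001', 'CVE-EXAMPLE-0002']}
```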

knowledge-enhanced-attack-graph's People

Contributors

li-zhenyuan, stwater20
