oxaliscu / ustc-tk2016

This project forked from yungshenglu/ustc-tk2016

Toolkit for processing PCAP file and transform into image of MNIST dataset

License: Mozilla Public License 2.0

Python 52.29% PowerShell 47.71%

USTC-TK2016

This repository contains the toolkit "USTC-TK2016", which parses network traffic (.pcap files). The accompanying dataset is "USTC-TFC2016".

  • The master branch runs only on Windows.
  • The ubuntu branch runs on Ubuntu Linux 16.04 LTS.

NOTICE: Credit for this repository goes to echowei/DeepTraffic


Installation

  1. Clone this repository to your machine
    # Clone the repository on "master" branch
    $ git clone -b master https://github.com/yungshenglu/USTC-TK2016
  2. Install the required packages via the following command
    # Run the command at the root of the repository
    $ pip3 install -r requirements.txt

Execution

NOTICE: You are on the master branch now!

  1. Download the traffic dataset USTC-TFC2016 and put it into the directory 1_Pcap\
    • You can download the traffic dataset USTC-TFC2016 from another of my repositories.
  2. Open PowerShell and run 1_Pcap2Session.ps1 (takes a few minutes)
    • To split the PCAP files by session, make sure lines 10 and 14 in 1_Pcap2Session.ps1 are uncommented and lines 11 and 15 are commented out.
    • To split the PCAP files by flow, make sure lines 11 and 15 in 1_Pcap2Session.ps1 are uncommented and lines 10 and 14 are commented out.
    • Run 1_Pcap2Session.ps1
      # Make sure your current directory is correct
      PS> .\1_Pcap2Session.ps1
    • If it succeeds, you will see the following folders in 2_Session\
      • AllLayers\
      • L7\
  3. Run 2_ProcessSession.ps1 (takes a few minutes)
    # Make sure your current directory is correct
    PS> .\2_ProcessSession.ps1
    • If it succeeds, you will see the following folders in 3_ProcessedSession\
      • FilteredSession\ - The 60000 largest PCAP files
      • TrimedSession\ - The filtered PCAP files trimmed to 784 bytes (28 x 28), with 0x00 appended if a file is shorter than 784 bytes
      • The files in the subdirectories Test\ and Train\ are picked at random from the dataset.
  4. Run 3_Session2Png.py (takes a few minutes)
    # Make sure your current directory is correct
    PS> python3 3_Session2png.py
    • If it succeeds, you will see the following folders in 4_Png\
      • Test\ - For testing
      • Train\ - For training
  5. Run 4_Png2Mnist.py (takes a few minutes)
    # Make sure your current directory is correct
    PS> python3 4_Png2Mnist.py
    • If it succeeds, you will see the training dataset files in folder 5_Mnist\
      • train-labels-idx1-ubyte
      • train-images-idx3-ubyte
      • train-labels-idx1-ubyte.gz
      • train-images-idx3-ubyte.gz
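
The trimming rule from step 3 (cut to 784 bytes, or append 0x00 when shorter) and the 28 x 28 layout it targets can be illustrated with a minimal Python sketch. This is not the code in 2_ProcessSession.ps1 or 3_Session2Png.py; the helper names `trim_or_pad` and `to_rows` are hypothetical, and treating each byte as one grayscale pixel is an assumption based on the MNIST-style output.

```python
IMAGE_SIDE = 28
TARGET_LEN = IMAGE_SIDE * IMAGE_SIDE  # 784 bytes, one byte per pixel (assumed)


def trim_or_pad(data: bytes, length: int = TARGET_LEN) -> bytes:
    """Cut data to `length` bytes, or right-pad it with 0x00 if it is shorter."""
    return data[:length].ljust(length, b"\x00")


def to_rows(data: bytes, side: int = IMAGE_SIDE) -> list:
    """Split side*side bytes into `side` rows of `side` pixel values (0-255)."""
    if len(data) != side * side:
        raise ValueError("expected exactly %d bytes" % (side * side))
    return [list(data[i:i + side]) for i in range(0, side * side, side)]


# A 3-byte input is padded; a 1024-byte input is trimmed.
short = trim_or_pad(b"\x16\x03\x01")
long_ = trim_or_pad(bytes(range(256)) * 4)
rows = to_rows(short)
print(len(short), len(long_), rows[0][:4])  # 784 784 [22, 3, 1, 0]
```

Each row list could then be handed to an imaging library to write a 28 x 28 grayscale PNG; that part is omitted here to keep the sketch dependency-free.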

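The idx3 and idx1 files produced in step 5 follow the MNIST IDX layout: a big-endian header (magic number, item count, and for images the row and column counts) followed by raw unsigned bytes. The sketch below shows that layout only; it is not taken from 4_Png2Mnist.py, and the function names `pack_idx3_images` and `pack_idx1_labels` are hypothetical.

```python
import struct

IDX3_IMAGE_MAGIC = 0x00000803  # unsigned byte data, 3 dimensions
IDX1_LABEL_MAGIC = 0x00000801  # unsigned byte data, 1 dimension


def pack_idx3_images(images, rows=28, cols=28):
    """Serialize equally sized grayscale images (bytes objects) as IDX3 bytes."""
    for img in images:
        if len(img) != rows * cols:
            raise ValueError("every image must hold rows * cols bytes")
    header = struct.pack(">IIII", IDX3_IMAGE_MAGIC, len(images), rows, cols)
    return header + b"".join(images)


def pack_idx1_labels(labels):
    """Serialize integer class labels (0-255) as IDX1 bytes."""
    header = struct.pack(">II", IDX1_LABEL_MAGIC, len(labels))
    return header + bytes(labels)


# Two dummy 28 x 28 images (all black, all white) with made-up labels.
images = [bytes(784), bytes([255]) * 784]
labels = [0, 7]
image_blob = pack_idx3_images(images)
label_blob = pack_idx1_labels(labels)
print(len(image_blob), len(label_blob))  # 1584 10
```

The .gz variants would presumably be the same bytes passed through gzip, for example with Python's `gzip.compress`.
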
Contributor

NOTICE: To contribute, follow the process described in CONTRIBUTING.md. Issues are very welcome!


License

Mozilla Public License Version 2.0

Contributors

yungshenglu, oxaliscu
