3ntr0py's Introduction

Utilizing CUDA + Numba to calculate entropy.

Entropy is normally calculated with the solution below. The Numba + CUDA solution is around 10% faster than this for a single file and up to 300 times faster for multiple files (on my equipment, an NVIDIA 3060).

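from scipy.stats import entropy as scipy_entropy
import numpy as np

# Named differently from the SciPy import so the function does not call itself.
def entropy(labels, base=2):
  # Interpret the raw bytes as unsigned 8-bit symbols.
  labels = np.frombuffer(labels, dtype=np.uint8)

  # Count each byte value; SciPy normalizes the counts to probabilities.
  _, counts = np.unique(labels, return_counts=True)
  return scipy_entropy(counts, base=base)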
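
For reference, the quantity computed above is the Shannon entropy H = -sum(p_i * log2(p_i)) over the byte frequencies p_i. Below is a minimal sketch of the same calculation with NumPy only. The name byte_entropy is hypothetical and not part of this repository.

```python
import numpy as np

def byte_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string in bits per byte (between 0 and 8)."""
    symbols = np.frombuffer(data, dtype=np.uint8)
    if symbols.size == 0:
        return 0.0  # assumption: treat empty input as zero entropy
    # Histogram over all 256 possible byte values.
    counts = np.bincount(symbols, minlength=256)
    # Drop unused byte values so log2 is never evaluated at 0.
    probs = counts[counts > 0] / symbols.size
    return float(-np.sum(probs * np.log2(probs)))

# A buffer containing every byte value exactly once is maximally random.
print(byte_entropy(bytes(range(256))))  # 8.0
```

Packed or encrypted samples tend to score close to 8 bits per byte, while plain text and sparse binaries score noticeably lower.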

Still in development

Goal

Quickly calculate the entropy of over 200k malware samples (110 GB) without using any CPU multiprocessing. Processing all 200k samples this way took about 10,522 seconds (roughly 2.9 hours). The malware was stored on network-attached storage, which greatly reduced I/O performance.

By applying CPU multiprocessing, I was able to maximize the use of my computer's resources and process the 110 GB in around 3600 seconds (1 hour). The data was stored on the network-attached storage.
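
As a rough illustration of the CPU multiprocessing approach, the sketch below spreads per-file entropy calculation over worker processes using the standard library. It is not the code used for the measurements above. The names file_entropy and entropy_of_files, and the chunk size of 16, are assumptions made for this example.

```python
from multiprocessing import Pool

import numpy as np

def file_entropy(path):
    """Return (path, Shannon entropy in bits per byte) for one file."""
    # Reads the whole file into memory; fine for a sketch, not for huge files.
    symbols = np.fromfile(path, dtype=np.uint8)
    if symbols.size == 0:
        return str(path), 0.0
    counts = np.bincount(symbols, minlength=256)
    probs = counts[counts > 0] / symbols.size
    return str(path), float(-np.sum(probs * np.log2(probs)))

def entropy_of_files(paths, workers=None):
    """Map file_entropy over paths in parallel (workers=None uses every core)."""
    with Pool(processes=workers) as pool:
        # imap_unordered hands back each result as soon as a worker finishes.
        return dict(pool.imap_unordered(file_entropy, paths, chunksize=16))
```

Call entropy_of_files(list_of_paths) from under an `if __name__ == "__main__":` guard, which is required on platforms that start worker processes with spawn. With samples on network storage, I/O may remain the bottleneck no matter how many workers are used.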

Testing

Currently, tests cannot be run on GitHub Actions because no NVIDIA GPU is available there. If possible, I will set up a self-hosted runner in the future.

Remarks

The code has not been optimized or cleaned up yet.

3ntr0py's People

Contributors

3nthusia5t

3ntr0py's Issues

Upload samples, benchmark

  • Create samples directory
  • Create a benchmark that compares against common solutions across various numbers of files.

Create tests

  • Create tests to ensure that entropy is estimated properly.

Write proper sum_array function instead of using np.sum()

The idea for sum_array is to keep the number of sequential operations to a minimum.

Example illustrating the idea:

[ 1, 1, 1, 1, 0, 0]

will be split into pairs, with each pair summed by its own thread:

sum [1, 1] - thread 0
sum [1, 1] - thread 1
sum [0, 0] - thread 2

Those calculations result in [2, 2, 0], which is then reduced again:

sum [2, 2] - thread 0
sum [0] - thread 1

Result: [4, 0]

sum [4, 0] - thread 0 which results in 4

Conclusion:

In the best-case scenario, this would double the performance of NumPy's sum() function, which sums the array sequentially. The CUDA solution can be slower for very small arrays (for example, 3 elements) because of the overhead, such as copying memory between host and device. Despite that, I think it may benefit performance, since the goal is to process file samples, which are at least 97 bytes (http://www.phreedom.org/research/tinype/).
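
The pairwise scheme from the example can be sketched on the CPU as follows. This is only a simulation of what each GPU thread would do in one pass, not the planned CUDA kernel. The helper name pairwise_pass is hypothetical, and padding an odd-length array with a zero is an assumption made for this sketch. For n elements the loop needs about log2(n) passes instead of n - 1 sequential additions.

```python
import numpy as np

def pairwise_pass(values):
    """One reduction pass: simulated thread i sums values[2*i] and values[2*i + 1]."""
    values = np.asarray(values, dtype=np.int64)
    if values.size % 2:
        # Odd length: pad with 0 so the last thread still has a pair to add.
        values = np.append(values, 0)
    return values[0::2] + values[1::2]

def sum_array(values):
    """Repeat pairwise passes until a single value remains."""
    values = np.asarray(values, dtype=np.int64)
    if values.size == 0:
        return 0
    while values.size > 1:
        values = pairwise_pass(values)
    return int(values[0])

data = [1, 1, 1, 1, 0, 0]
print(pairwise_pass(data))        # [2 2 0]
print(pairwise_pass([2, 2, 0]))   # [4 0]
print(sum_array(data))            # 4
```

On a GPU, every addition within one pass is independent of the others, so each pass can run in parallel across threads; only the passes themselves are sequential.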
