lhd1029 / dlop-bench

This project forked from deeplink-org/dlop-bench

A benchmark suited especially for deep learning operators

License: Apache License 2.0

Shell 0.06% Python 99.93% Makefile 0.01%

Introduction

DLOP-Bench is an open-source benchmark suite for deep learning operators. It has the following three major features:

  • Operators at the deep learning framework level

We focus on operators at the deep learning framework level (such as torch.convolution) and do not dive into the implementation details of each operator (such as implicit GEMM or Winograd implementations and the related algorithm selection). One can benchmark the operators on a given AI accelerator as soon as that accelerator has been adapted to a deep learning framework.

  • Basic operators and domain-specific long-tail operators

Besides basic operators like convolution, pooling, and normalization, we also collect many representative domain-specific operators, mainly from object detection, instance segmentation, and other computer vision directions in OpenMMLab. These operators have no dedicated implementation on deep learning accelerators and have to resort to the Python interpreter. As a result, they are broken down into large numbers of basic operators, which incurs many function calls as well as data transfer and context switching costs. We call them long-tail operators.

  • Benchmarking deep learning accelerators, frameworks, and compilers

At the operator level, this benchmark suite provides a fine-grained assessment from multiple angles, including accelerator hardware specifications, deep learning frameworks, and deep learning compilers.
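To make the notion of a long-tail operator concrete, here is a minimal sketch of a box-regression encoding in the spirit of bbox2delta, written for a single pair of boxes in plain Python. It follows the commonly used center/size encoding and is an assumption for illustration only; the function name bbox2delta_single is hypothetical, and the sample shipped in the suite may differ (for example in normalization or in how widths are computed).

```python
import math


def bbox2delta_single(proposal, gt):
    """Encode one ground-truth box relative to one proposal.

    Both boxes are (x1, y1, x2, y2) tuples. This is an illustrative
    scalar version, not the implementation used by the benchmark.
    """
    # Centers and sizes of the proposal box.
    px = (proposal[0] + proposal[2]) * 0.5
    py = (proposal[1] + proposal[3]) * 0.5
    pw = proposal[2] - proposal[0]
    ph = proposal[3] - proposal[1]

    # Centers and sizes of the ground-truth box.
    gx = (gt[0] + gt[2]) * 0.5
    gy = (gt[1] + gt[3]) * 0.5
    gw = gt[2] - gt[0]
    gh = gt[3] - gt[1]

    # Center offsets scaled by the proposal size, and log size ratios.
    dx = (gx - px) / pw
    dy = (gy - py) / ph
    dw = math.log(gw / pw)
    dh = math.log(gh / ph)
    return dx, dy, dw, dh


if __name__ == "__main__":
    print(bbox2delta_single((0, 0, 10, 10), (5, 5, 15, 15)))
```

When the same computation is written over tensors in eager mode, each arithmetic line typically dispatches as a separate basic operator, which is where the function-call and data-transfer overhead described above comes from.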

Highlights

  • Execution framework. The main body is an execution engine, compatible with different deep learning frameworks (PyTorch, TensorFlow, JAX, and so on) with different execution modes, such as eager and graph mode.
  • 200+ basic operators. We collected the operators from models in OpenMMLab. The input information consists of two parts: input tensor shapes and attribute information. We run the models, record the input configurations of each operator, and save them in CSV format for evaluation.
  • 100+ long-tail samples. We collected 100+ long-tail samples with representative syntax features from different deep learning models, mainly from OpenMMLab; see samples for more detail.

Getting Started

First, download the latest source code:

git clone https://github.com/OpenComputeLab/DLOP-Bench.git

To show the structure of the source code, use the following commands:

cd DLOP-Bench
tree -d -L 1 ./bench

The implementations of the basic and long-tail operators are located in ./bench/samples/.

Dependencies

The code is tested under Python 3, with different deep learning frameworks (PyTorch, TensorFlow, JAX, and so on). You can select a specific version of the framework according to the version of CUDA/cuDNN. For more details please refer to their official websites.

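Some samples depend on OpenCV (the cv2 Python module).

# install only one of the two packages; the headless variant suits servers without a GUI
pip install opencv-python
pip install opencv-python-headless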

Basic Operators

Here is a command demo that illustrates how you can use DLOP-Bench to test basic operators.

# config bench PYTHONPATH
cd DLOP-Bench
export PYTHONPATH=./bench:$PYTHONPATH
# to test sample performance using the torch backend, follow the demo below:
# prepare the PyTorch environment (Python 3; torch 1.10 or 1.12 recommended)
...
# run the operator abs using the torch backend; for more profiling results, refer to profiler_reulsts, reulsts, and time_reulsts
FRAMEWORK=torch python ./bench/api/api.py -c abs -st 1 
# run the operator abs and absBP using torch backend
FRAMEWORK=torch python ./bench/api/api.py -c abs,absBP -st 1
# get more usage information
FRAMEWORK=torch python ./bench/api/api.py --help

Long-tail Operators

For long-tail operators, this benchmark suite provides the following stages to test their performance:

  • stage 1 : eager mode.
  • stage 2 : graph mode with jit.

This benchmark suite supports the execution of all long-tail operators in stage 1, while some operators fail to run in stage 2 because they are unsupported by the given deep learning compiler. Here is a command demo to test long-tail operators.

# run the operator bbox2delta using torch backend in eager mode
FRAMEWORK=torch python ./bench/api/api.py -c bbox2delta -st 1
# run the operator bbox2delta using torch backend in both eager mode and graph mode
FRAMEWORK=torch python ./bench/api/api.py -c bbox2delta -st 1,2
# run the operator bbox2delta and l2_loss using torch backend in both eager mode and graph mode
FRAMEWORK=torch python ./bench/api/api.py -c bbox2delta,l2_loss -st 1,2

These commands also work with the torch, tensorflow, or xla backends; just set the FRAMEWORK environment variable accordingly. While all the operators can be tested using the torch backend, some operators may raise an AssertionError in other backends if their corresponding implementations have not been added yet. You can wait for our update or add the code yourself.

To test sample performance using the TensorFlow or XLA backend, follow the demo below:

# prepare tensorflow environment
...
# run the operator bbox2offset using tf backend in eager mode
FRAMEWORK=tf TF_XLA_FLAGS=--tf_xla_auto_jit=2 XLA_FLAGS=--xla_gpu_cuda_data_dir=.../cuda-10.1 python ./bench/api/api.py -c bbox2offset -st 1
# run the operator bbox2offset using tf backend in both eager mode and graph mode
FRAMEWORK=tf TF_XLA_FLAGS=--tf_xla_auto_jit=2 XLA_FLAGS=--xla_gpu_cuda_data_dir=.../cuda-10.1 python ./bench/api/api.py -c bbox2offset -st 1,2
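The invocations above differ only in the FRAMEWORK variable and the -c and -st arguments, so they are easy to script. The following is a minimal sketch under that assumption; the helper names bench_command, bench_env, and run_bench are hypothetical and not part of DLOP-Bench, and only the flags and variables shown in the demos above are used.

```python
import os
import subprocess
import sys


def bench_command(operators, stages):
    """Build the argument list for ./bench/api/api.py.

    Uses only the -c (operators) and -st (stages) flags shown in the demos.
    """
    return [
        sys.executable,
        "./bench/api/api.py",
        "-c", ",".join(operators),
        "-st", ",".join(str(s) for s in stages),
    ]


def bench_env(framework, base_env=None):
    """Return a copy of the environment with FRAMEWORK and PYTHONPATH set."""
    env = dict(os.environ if base_env is None else base_env)
    env["FRAMEWORK"] = framework
    # Mirrors `export PYTHONPATH=./bench:$PYTHONPATH` from the demo.
    existing = env.get("PYTHONPATH")
    env["PYTHONPATH"] = "./bench" + (os.pathsep + existing if existing else "")
    return env


def run_bench(framework, operators, stages, repo_dir="."):
    """Run the benchmark from the repository root and return the exit code."""
    result = subprocess.run(
        bench_command(operators, stages),
        env=bench_env(framework),
        cwd=repo_dir,
        check=False,
    )
    return result.returncode


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works anywhere.
    cmd = bench_command(["bbox2delta", "l2_loss"], [1, 2])
    print("FRAMEWORK=torch", " ".join(cmd[1:]))
```

Calling run_bench("torch", ["bbox2delta"], [1, 2], repo_dir="DLOP-Bench") would correspond to the second torch demo above. Backend-specific variables such as TF_XLA_FLAGS are left to the caller's environment.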

How to add a new operator

  • Create a folder named after the operator in the ./bench/samples/basic directory
  • Copy the JSON file containing the operator's parameter information table, generated by the operator acquisition module, into the folder
  • Create __init__.py and torch_impl.py in the folder. To test the operator under other frameworks, use torch_impl.py as a reference.
  • In __init__.py, implement the two functions get_sample_config and gen_np_args, then register them using register_sample.
  • In torch_impl.py, implement the function args_adaptor, which performs data preparation, and the definition of the operator you are adding. Then use the executor_creator function to register these two into the benchmark.
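The file layout described in these steps can be scaffolded with a short script. This is a minimal sketch, not a tool shipped with DLOP-Bench: the function scaffold_operator and the stub contents are hypothetical. Because the signatures of get_sample_config, gen_np_args, register_sample, args_adaptor, and executor_creator are not given here, the stubs only leave TODO notes; copy the real signatures from an existing sample.

```python
import shutil
import tempfile
from pathlib import Path

INIT_STUB = '''"""Sample registration for the {name} operator (stub)."""
# TODO: implement get_sample_config and gen_np_args,
# then register both with register_sample.
'''

TORCH_STUB = '''"""PyTorch implementation of the {name} operator (stub)."""
# TODO: implement args_adaptor (data preparation) and the operator itself,
# then register both through executor_creator.
'''


def scaffold_operator(repo_root, name, param_json=None):
    """Create bench/samples/basic/<name>/ with stub files.

    param_json, if given, is the path to the operator's parameter
    information table, which is copied into the new folder.
    """
    op_dir = Path(repo_root) / "bench" / "samples" / "basic" / name
    # Refuse to overwrite an operator folder that already exists.
    op_dir.mkdir(parents=True, exist_ok=False)
    (op_dir / "__init__.py").write_text(INIT_STUB.format(name=name))
    (op_dir / "torch_impl.py").write_text(TORCH_STUB.format(name=name))
    if param_json is not None:
        shutil.copy(param_json, op_dir)
    return op_dir


if __name__ == "__main__":
    # Demonstrate on a throwaway directory rather than a real checkout.
    with tempfile.TemporaryDirectory() as tmp:
        created = scaffold_operator(tmp, "my_op")
        print(sorted(p.name for p in created.iterdir()))
```

Run against a real checkout, scaffold_operator("DLOP-Bench", "my_op", "my_op.json") would create the folder and copy the JSON file in one step; the two stub files then need to be filled in by hand.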

Contributors

xup16, lixiuhong, lhd1029, frshiyi, reinerzhou, cycle1024, xudianhong, adamantboy, jimyma
