keyxuliang / gemm_optimization

This project forked from mz24cn/gemm_optimization

This repository targets performance optimization of the OpenCL gemm function. It compares several libraries (clBLAS, clBLAST, MIOpenGemm, Intel MKL on CPU, and cuBLAS on CUDA) across matrix sizes, hardware vendors, and operating systems. Prebuilt x86_64 binaries for MSVC, MinGW, and Linux (CentOS) are provided, so it works out of the box.

License: MIT License

C++ 46.99% C 52.12% Makefile 0.08% HTML 0.81%

gemm_optimization's Introduction

gemm (matrix multiplication) optimization

This repository targets performance optimization of the gemm function. It compares the sgemm performance of several BLAS libraries (clBLAS, clBLAST, MIOpenGemm, Intel MKL on CPU, and cuBLAS on CUDA) across matrix sizes, hardware vendors, and operating systems. Prebuilt x86_64 binaries for MSVC, MinGW, and Linux (CentOS) are provided, so it works out of the box.
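
As a point of reference for what the compared libraries compute, here is a minimal sketch of a naive single-precision gemm and the usual throughput formula (2 * M * N * K floating-point operations per call). The function names `sgemm_reference` and `gemm_gflops` and the row-major layout are assumptions made for illustration; they are not taken from this repository, and the benchmarked libraries use far more optimized kernels.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference single-precision GEMM: C = alpha * A * B + beta * C.
// Assumption of this sketch: row-major storage, A is M x K, B is K x N, C is M x N.
void sgemm_reference(std::size_t M, std::size_t N, std::size_t K,
                     float alpha, const std::vector<float>& A,
                     const std::vector<float>& B,
                     float beta, std::vector<float>& C) {
    assert(A.size() == M * K && B.size() == K * N && C.size() == M * N);
    for (std::size_t i = 0; i < M; ++i) {
        for (std::size_t j = 0; j < N; ++j) {
            float acc = 0.0f;
            for (std::size_t k = 0; k < K; ++k) {
                acc += A[i * K + k] * B[k * N + j];
            }
            C[i * N + j] = alpha * acc + beta * C[i * N + j];
        }
    }
}

// Throughput in GFLOPS, counting one multiply and one add per inner iteration.
double gemm_gflops(std::size_t M, std::size_t N, std::size_t K, double seconds) {
    assert(seconds > 0.0);
    return 2.0 * static_cast<double>(M) * static_cast<double>(N) *
           static_cast<double>(K) / seconds / 1e9;
}
```

A naive loop like this is mainly useful as a correctness baseline; it says nothing about how any of the listed libraries are implemented.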

Some results

GPU device GTX1080, (4096~32) * (4096~32) * (4096~32), on Windows
GPU device GTX1050Ti, (2048~32) * (2048~32) * (2048~32), on Windows
GPU device R9 290X, (2048~32) * (2048~32) * (2048~32), on Windows

How to Build

The repository contains an Eclipse CDT project, a Microsoft Visual Studio VC project, and a Linux Makefile. Some package include files and binary library files are included, but the set may be incomplete (for example, some Intel MKL runtime libraries for certain CPU types). Such issues should not be difficult to solve for anyone who cares about gemm optimization.

How to Run

.\gemm_optimization.exe /1 :clblast 1 :clblas 1 :cublas 1 :mkl 1 :verify 1 :json D:\GTX1050Ti_Windows.json :M 2048 :N 2048 :K 2048 :step 2
This command line runs gemm on OpenCL device no. 1 with clblast, clblas, NVIDIA cublas, and Intel MKL enabled, turns on result correctness verification, and writes the output data to the JSON file 'D:\GTX1050Ti_Windows.json'. The matrix multiplication starts at size A[2048][2048] * B[2048][2048], and each dimension steps down by a factor of 2 (2048, 1024, 512, and so on).
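
The size sweep and the verification step can be sketched roughly as follows. This is an illustration under stated assumptions, not the tool's actual code: the helper names `sweep_sizes` and `results_match` are hypothetical, the lower bound of 32 is inferred from the result captions above, and the tolerance-based comparison is just one common way to check floating-point results. Whether the tool runs every combination of M, N, and K or only a subset is not stated in the README.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Sizes visited for one dimension: start, start/step, start/step^2, ...
// stopping once the size drops below min_size (assumed lower bound).
std::vector<std::size_t> sweep_sizes(std::size_t start, std::size_t step,
                                     std::size_t min_size) {
    assert(step >= 2 && min_size >= 1);
    std::vector<std::size_t> sizes;
    for (std::size_t s = start; s >= min_size; s /= step) {
        sizes.push_back(s);
    }
    return sizes;
}

// Element-wise comparison of a library result against a reference result.
// The tolerance is relative for large values and absolute for values below 1.
bool results_match(const std::vector<float>& expected,
                   const std::vector<float>& actual, float rel_tol) {
    if (expected.size() != actual.size()) return false;
    for (std::size_t i = 0; i < expected.size(); ++i) {
        float scale = std::max(1.0f, std::fabs(expected[i]));
        float diff = std::fabs(expected[i] - actual[i]);
        // Written as !(diff <= ...) so that NaN results are rejected.
        if (!(diff <= rel_tol * scale)) return false;
    }
    return true;
}
```

With a start of 2048, a step of 2, and a minimum of 32, `sweep_sizes` yields 2048, 1024, 512, 256, 128, 64, 32 for a single dimension.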
