This project is meant to be forked and used as a base by anyone who needs to benchmark several CUDA functions quickly.
- CUDA Toolkit
- C++ compiler (g++ for Linux, MSVC for Windows)
- GPU supported by CUDA
- CMake
- Conan
The third-party libraries the project depends on are declared in the Conan file. Do not install them yourself; Conan fetches them for you.
- To build, execute the following commands:
mkdir build && cd build
conan install ..
cmake ..
make
- To run, execute the following commands:
cd build
./bin/Bench
- Additionally, you can use --build=missing to build missing libraries:
conan install .. --build missing
- By default, the program is built in release mode when it is built inside a build or build_release folder. To build in debug mode, build the project inside a build_debug folder.
- You can pass the "--no-check" option when running the bench binary to disable result checking:
./bin/Bench --no-check
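How the binary recognizes "--no-check" is not shown here. As a minimal sketch, assuming the flag is simply looked up among the command-line arguments, a helper like the hypothetical has_flag below would be enough. Its name and signature are illustrative and are not taken from the project.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper (not part of the project): returns true when `flag`
// appears among the command-line arguments. argv[0] (the program name) is skipped.
bool has_flag(int argc, const char* const* argv, const char* flag)
{
    assert(argv != nullptr && flag != nullptr);
    for (int i = 1; i < argc; ++i)
    {
        if (std::strcmp(argv[i], flag) == 0)
            return true;
    }
    return false;
}
```

In main, result checking could then be toggled with something like: const bool check_results = !has_flag(argc, argv, "--no-check");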
There are two ways to get your code compiled:
- Directly copy the function into the file src/to_bench.cu
- Create a new CMake target
In CMakeLists.txt:
- Create a new library as follows:
add_library(LIB_NAME
SOURCE_FILE1
SOURCE_FILE2
...
)
- Link this library to the Bench target (add it alongside the other libraries) as follows:
target_link_libraries(Bench LIB_NAME async_memcpy GTest::GTest benchmark::benchmark TestHelpers)
- Note: exported functions must have unique names across libraries; two libraries exporting the same function name cause a multiple-definition error at build time
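One common way to avoid such clashes is to wrap each library's functions in its own namespace. The sketch below assumes two hypothetical libraries, lib_a and lib_b, that both want to export a function called increment; none of these names come from the project, and plain host code stands in for the CUDA functions.

```cpp
#include <cassert>
#include <vector>

// Hypothetical library A: its increment adds 1 to every element.
namespace lib_a
{
void increment(std::vector<int>& buffer)
{
    for (int& value : buffer)
        value += 1;
}
} // namespace lib_a

// Hypothetical library B: same function name, different behavior (adds 2).
// The namespace makes the symbol distinct, so both libraries can be linked
// into the same binary without a multiple-definition error.
namespace lib_b
{
void increment(std::vector<int>& buffer)
{
    for (int& value : buffer)
        value += 2;
}
} // namespace lib_b
```

Each variant is then referred to by its qualified name, for example lib_a::increment or lib_b::increment, when it is passed to a bench.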
In bench/main.cc:
- Include the header file containing the function(s) to bench, if it is not already among the included files:
#include "header_file.cuh"
- Define a new bench as follows:
BENCHMARK_DEFINE_F(Fixture, BENCH_NAME)
(benchmark::State &st)
{
this->bench(st, NAME_OF_THE_FUNCTION_TO_BENCH, BUFFER_SIZE, FUNCTION_ARGS...);
}
- Register the new bench as follows:
BENCHMARK_REGISTER_F(Fixture, BENCH_NAME)
->UseRealTime()
->Unit(benchmark::kMillisecond);
- Note: a template of these steps can be found directly in src/main.cc
- Note: the first argument of the function must be a cuda_tools::host_shared_ptr<int>
- You can use the premade host_shared_ptr to allocate data
- You can use the premade test_helper to check your results
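To illustrate the calling convention described above (buffer first, remaining arguments forwarded), here is a rough host-only sketch. It makes several assumptions: buffer_t is a stand-in for cuda_tools::host_shared_ptr<int>, whose real definition is not reproduced here, and add_constant and time_call_ms are hypothetical names. This is not the fixture's actual bench implementation.

```cpp
#include <chrono>
#include <memory>
#include <utility>
#include <vector>

// Stand-in for cuda_tools::host_shared_ptr<int> (assumption: the real type
// lives in the project and is not shown here).
using buffer_t = std::shared_ptr<std::vector<int>>;

// Hypothetical function to bench: the buffer is the first argument,
// any extra arguments come after it.
void add_constant(buffer_t buffer, int value)
{
    for (int& element : *buffer)
        element += value;
}

// Hypothetical helper: calls `function` with the buffer followed by the
// forwarded extra arguments, and returns the elapsed wall-clock time in ms.
template <typename Function, typename... Args>
double time_call_ms(Function function, buffer_t buffer, Args&&... args)
{
    const auto start = std::chrono::steady_clock::now();
    function(buffer, std::forward<Args>(args)...);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

With these stand-ins, time_call_ms(add_constant, buffer, 5) mirrors the shape of this->bench(st, NAME_OF_THE_FUNCTION_TO_BENCH, BUFFER_SIZE, FUNCTION_ARGS...), except that the buffer is allocated by the caller here.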