
cutlass-gpgpu-sim's People

Contributors

aamirraihan, aamodt


cutlass-gpgpu-sim's Issues

Cutlass-test goes into deadlock

Hello,
I have built the executable 'cutlass-test', but the program deadlocks when I run it.
The output looks like this:

GPGPU-Sim PTX: cudaLaunch for 0x0x40febc (mode=performance simulation) on stream 0
GPGPU-Sim PTX: pushing kernel '_ZN7cutlass4gemm11gemm_kernelINS0_4GemmINS0_14WmmaGemmTraitsILNS_12MatrixLayout4KindE1ELS5_0ENS_5ShapeILi32ELi16ELi16ELi1EEEfNS0_13LinearScalingIfNS0_19FragmentMultiplyAddIfEEEEfS7_NS6_ILi16ELi16ELi16ELi1EEELi8ELi8EiNS0_20WmmaGemmTraitsHelperILS5_1ELS5_0ES7_ffSB_S7_SC_Li8ELi8EiEEEEEEEEvNT_6ParamsE' to stream 0, gridDim= (1,1,1) blockDim = (32,1,1)
GPGPU-Sim uArch: Shader 0 bind to kernel 1 '_ZN7cutlass4gemm11gemm_kernelINS0_4GemmINS0_14WmmaGemmTraitsILNS_12MatrixLayout4KindE1ELS5_0ENS_5ShapeILi32ELi16ELi16ELi1EEEfNS0_13LinearScalingIfNS0_19FragmentMultiplyAddIfEEEEfS7_NS6_ILi16ELi16ELi16ELi1EEELi8ELi8EiNS0_20WmmaGemmTraitsHelperILS5_1ELS5_0ES7_ffSB_S7_SC_Li8ELi8EiEEEEEEEEvNT_6ParamsE'

GPGPU-Sim uArch: CTA/core = 25, limited by: regs
GPGPU-Sim: Reconfigure L1 cache in Volta Archi to 120KB

GPGPU-Sim uArch: ERROR ** deadlock detected: last writeback core 2064 @ gpu_sim_cycle 1568 (+ gpu_tot_sim_cycle 4294917296) (48432 cycles ago)

GPGPU-Sim uArch: DEADLOCK  shader cores no longer committing instructions [core(# threads)]:
GPGPU-Sim uArch: DEADLOCK  0(32) 1(0)> Re-run the simulator in gdb and use debug routines in .gdbinit to debug this
Aborted (core dumped)
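
A quick arithmetic check on the numbers in the deadlock line, as a sketch: the two cycle figures sum to 50000, and the large gpu_tot_sim_cycle value equals 2^32 - 50000, which is what a small negative number looks like when printed as an unsigned 32-bit integer. The interpretation (that the report was printed at simulation cycle 50000 and that the printed total is a wrapped offset) is an assumption drawn from the numbers alone, not from the simulator source. The regular expression is a hypothetical helper written for this one line.

```python
import re

# The deadlock line exactly as quoted in the log above.
line = ("GPGPU-Sim uArch: ERROR ** deadlock detected: last writeback core 2064 "
        "@ gpu_sim_cycle 1568 (+ gpu_tot_sim_cycle 4294917296) (48432 cycles ago)")

# Hypothetical pattern for this one line; it is not part of any GPGPU-Sim API.
pattern = re.compile(
    r"gpu_sim_cycle (\d+) \(\+ gpu_tot_sim_cycle (\d+)\) \((\d+) cycles ago\)"
)
last_writeback, tot_field, cycles_ago = map(int, pattern.search(line).groups())

# Cycle at which the report was printed, assuming "cycles ago" counts
# from the last writeback cycle.
detected_at = last_writeback + cycles_ago

# Reinterpret the printed total as a signed 32-bit value.
signed_tot = tot_field - 2**32 if tot_field >= 2**31 else tot_field

print(detected_at)  # 50000
print(signed_tot)   # -50000
```

Under that reading, no instruction was committed after cycle 1568, and the simulator reported the deadlock once it reached cycle 50000.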

My CUDA version is 9.1:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Nov__3_21:07:56_CDT_2017
Cuda compilation tools, release 9.1, V9.1.85

GPGPU-Sim 4.0 installed successfully:

GPGPU-Sim version 4.0.0 (build gpgpu-sim_git-commit-d26501be3e3e8a6fe52409dd7cbaa7fa33e34d5e-modified_0.0) configured with GPUWattch.

----------------------------------------------------------------------------
INFO - If you only care about PTX execution, ignore this message. GPGPU-Sim supports PTX execution in modern CUDA.
If you want to run PTXPLUS (sm_1x SASS) with a modern card configuration, the apps and simulator must be compiled with CUDA 4.2.
You can still run a PASCAL configuration when compiling with 4.2 by setting the $PTXAS_CUDA_INSTALL_PATH directory environment variable.
The following text describes why:
If you are using PTXPLUS, only sm_1x is supported and it requires that the app and simulator binaries are compiled in CUDA 4.2 or less.
The simulator requires it since CUDA headers desribe struct sizes in the exec which change from gen to gen.
The apps require 4.2 because new versions of CUDA tools have dropped parsing support for generating sm_1x
When running using modern config (i.e. pascal) and PTXPLUS with CUDA 4.2, the $PTXAS_CUDA_INSTALL_PATH env variable is required to get proper register usage
(and hence occupancy) using a version of CUDA that knows the register usage on the real card.

----------------------------------------------------------------------------
setup_environment succeeded

I also copied the configs into the path.

I have seen this kind of deadlock in GPGPU-Sim 3.0 when my own program had bugs. I also notice that GPGPU-Sim has been updated frequently over the past month, so I wonder whether the problem is related to the GPGPU-Sim version.

Could you give me more details about the configuration used for this program, such as the GPGPU-Sim commit ID it is based on?
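
For comparing builds, the version banner quoted above already carries a commit hash. The following is a minimal sketch for extracting it, assuming the banner always follows the `git-commit-<hash>-modified_<n>` shape seen in that single line; `build_commit` is a hypothetical helper, not part of GPGPU-Sim.

```python
import re

# Version banner exactly as quoted in the setup output above.
banner = ("GPGPU-Sim version 4.0.0 (build gpgpu-sim_git-commit-"
          "d26501be3e3e8a6fe52409dd7cbaa7fa33e34d5e-modified_0.0) "
          "configured with GPUWattch.")

def build_commit(text):
    """Return (commit_hash, modified_suffix), or None if no match.

    The pattern is inferred from one banner line and may not cover
    other GPGPU-Sim versions.
    """
    m = re.search(r"git-commit-([0-9a-f]{7,40})-modified_([\d.]+)", text)
    return m.groups() if m else None

commit, modified = build_commit(banner)
print(commit)    # d26501be3e3e8a6fe52409dd7cbaa7fa33e34d5e
print(modified)  # 0.0
```

Quoting this hash alongside the hash of a known-working build makes it easier to tell whether two setups run the same simulator code.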

Thanks!

Setup issue

Hi there,

I have successfully run GPGPU-Sim and followed the steps to run cutlass-test with it, but I get output like this:

GPGPU-Sim PTX: __cudaRegisterFatBinary, fat_cubin_handle = 2, filename=default
GPGPU-Sim PTX: __cudaRegisterFunction _nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37__Z16memcpy_3d_deviceImLi1ELi1ELi1EEvPKhPhT_S3_S3_S3_S3_S3_S3_jjjjjjjjS3_S1_S2 : hostFun 0x0x4017f0, fat_cubin_handle = 2
GPGPU-Sim PTX: Parsing cutlass-test.1.sm_70.ptx
GPGPU-Sim PTX: finished parsing EMBEDDED .ptx file cutlass-test.1.sm_70.ptx
GPGPU-Sim PTX: loading globals with explicit initializers...
GPGPU-Sim PTX: finished loading globals (0 bytes total).
GPGPU-Sim PTX: loading constants with explicit initializers... done.
GPGPU-Sim PTX: Loading PTXInfo from cutlass-test.1.sm_70.ptx
Warning: cannot find deviceFun _nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37__Z16memcpy_3d_deviceImLi1ELi1ELi1EEvPKhPhT_S3_S3_S3_S3_S3_S3_jjjjjjjjS3_S1_S2
GPGPU-Sim PTX: __cudaRegisterFunction _nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37__Z16memset_3d_deviceIjLi0ELi0ELi1EEvPhhjT_S1_S1_S1_S1_jjjjjjjS1_S0 : hostFun 0x0x405dc0, fat_cubin_handle = 2
Warning: cannot find deviceFun _nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37__Z16memset_3d_deviceIjLi0ELi0ELi1EEvPhhjT_S1_S1_S1_S1_jjjjjjjS1_S0
GPGPU-Sim PTX: __cudaRegisterFunction _nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37__Z16memset_3d_deviceIjLi0ELi0ELi0EEvPhhjT_S1_S1_S1_S1_jjjjjjjS1_S0 : hostFun 0x0x405fe0, fat_cubin_handle = 2
Warning: cannot find deviceFun _nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37__Z16memset_3d_deviceIjLi0ELi0ELi0EEvPhhjT_S1_S1_S1_S1_jjjjjjjS1_S0
GPGPU-Sim PTX: __cudaRegisterVar: hostVar = 0x693160; deviceAddress = __nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37_set_kernel32; deviceName = __nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37_set_kernel32
GPGPU-Sim PTX: __cudaRegisterVar: Registering const memory space of 64 bytes
GPGPU-Sim PTX registering constant __nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37_set_kernel32 (64 bytes) to name mapping
GPGPU-Sim PTX: __cudaRegisterVar: hostVar = 0x6931a0; deviceAddress = __nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37_set_kernel64; deviceName = __nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37_set_kernel64
GPGPU-Sim PTX: __cudaRegisterVar: Registering const memory space of 64 bytes
GPGPU-Sim PTX registering constant __nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37_set_kernel64 (64 bytes) to name mapping
GPGPU-Sim PTX: __cudaRegisterVar: hostVar = 0x6931e0; deviceAddress = __nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37_cpy_kernel32; deviceName = __nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37_cpy_kernel32
GPGPU-Sim PTX: __cudaRegisterVar: Registering const memory space of 64 bytes
GPGPU-Sim PTX registering constant __nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37_cpy_kernel32 (64 bytes) to name mapping
GPGPU-Sim PTX: __cudaRegisterVar: hostVar = 0x693220; deviceAddress = __nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37_cpy_kernel64; deviceName = __nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37_cpy_kernel64
GPGPU-Sim PTX: __cudaRegisterVar: Registering const memory space of 64 bytes
GPGPU-Sim PTX registering constant __nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37_cpy_kernel64 (64 bytes) to name mapping
GPGPU-Sim PTX: __cudaRegisterVar: hostVar = 0x692640; deviceAddress = __nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37_cudartErrorTableArr; deviceName = __nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37_cudartErrorTableArr
GPGPU-Sim PTX: __cudaRegisterVar: Registering const memory space of 1992 bytes
GPGPU-Sim PTX registering global __nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37_cudartErrorTableArr hostVar to name mapping
GPGPU-Sim PTX: __cudaRegisterVar: hostVar = 0x693140; deviceAddress = cudartErrorTable; deviceName = cudartErrorTable
GPGPU-Sim PTX: __cudaRegisterVar: Registering const memory space of 8 bytes
GPGPU-Sim PTX registering global cudartErrorTable hostVar to name mapping
GPGPU-Sim PTX: __cudaRegisterVar: hostVar = 0x409000; deviceAddress = cudartErrorTableEntryCount; deviceName = cudartErrorTableEntryCount
GPGPU-Sim PTX: __cudaRegisterVar: Registering const memory space of 4 bytes
GPGPU-Sim PTX registering global cudartErrorTableEntryCount hostVar to name mapping
GPGPU-Sim PTX: __cudaRegisterVar: hostVar = 0x409020; deviceAddress = __nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37_cudartErrorCnpMapArr; deviceName = __nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37_cudartErrorCnpMapArr
GPGPU-Sim PTX: __cudaRegisterVar: Registering const memory space of 104 bytes
GPGPU-Sim PTX registering global __nv_static_79__66_tmpxft_00002dac_00000000_12_cuda_device_runtime_compute_70_cpp1_ii_8b1a5d37_cudartErrorCnpMapArr hostVar to name mapping
GPGPU-Sim PTX: __cudaRegisterVar: hostVar = 0x693148; deviceAddress = cudartErrorCnpMap; deviceName = cudartErrorCnpMap
GPGPU-Sim PTX: __cudaRegisterVar: Registering const memory space of 8 bytes
GPGPU-Sim PTX registering global cudartErrorCnpMap hostVar to name mapping
GPGPU-Sim PTX: __cudaRegisterVar: hostVar = 0x409004; deviceAddress = cudartErrorCnpMapEntryCount; deviceName = cudartErrorCnpMapEntryCount
GPGPU-Sim PTX: __cudaRegisterVar: Registering const memory space of 4 bytes
GPGPU-Sim PTX registering global cudartErrorCnpMapEntryCount hostVar to name mapping
GPGPU-Sim PTX: __cudaRegisterVar: hostVar = 0x693150; deviceAddress = CNPRT_VERSION_NUMBER; deviceName = CNPRT_VERSION_NUMBER
GPGPU-Sim PTX: __cudaRegisterVar: Registering const memory space of 4 bytes
GPGPU-Sim PTX registering global CNPRT_VERSION_NUMBER hostVar to name mapping
GPGPU-Sim: *** exit detected ***

Could you tell me how to solve it? Thank you.
