nvidia / nsight-training

Training material for Nsight developer tools

License: Other

Makefile 0.77% Cuda 9.64% C 42.71% C++ 1.93% Python 1.69% Jupyter Notebook 10.57% Shell 0.29% Dockerfile 0.35% JavaScript 0.03% HTML 0.03% CSS 0.10% CMake 0.64% SuperCollider 31.25%

nsight-training's People

Contributors

fs-nv, gedoensmax, jmarusarz-nvidia, plabus

nsight-training's Issues

Performance regression with TensorRT: can anybody help me understand Nsight?

Hello everyone

I am working on a PyTorch object tracking model and converting it to TensorRT for faster inference.

With a batch size of 1, the TensorRT engine is about 2x faster than PyTorch, but with larger batches it becomes SLOWER.

Batch of 1 inference time:
PyTorch - 40 ms
TensorRT - 20 ms

Batch of 8 inference time:
PyTorch - 50 ms
TensorRT - 85 ms

When the batch size grows, the PyTorch inference time barely increases, but the inference time of the TensorRT engine increases significantly!
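To make the comparison easier to read, the timings above can be converted to per-sample latency and throughput. This is only a small arithmetic sketch over the four numbers quoted in the post; the names `timings_ms`, `per_sample_ms` and `throughput_per_s` are made up for illustration.

```python
# Timings copied from the post, in milliseconds per batch.
timings_ms = {
    "pytorch": {1: 40.0, 8: 50.0},
    "tensorrt": {1: 20.0, 8: 85.0},
}


def per_sample_ms(batch_ms, batch_size):
    """Average time spent on one sample of the batch, in milliseconds."""
    return batch_ms / batch_size


def throughput_per_s(batch_ms, batch_size):
    """Samples processed per second at this batch size."""
    return batch_size * 1000.0 / batch_ms


# (framework, batch size) -> (ms per sample, samples per second)
summary = {}
for framework, runs in timings_ms.items():
    for batch_size, batch_ms in runs.items():
        summary[(framework, batch_size)] = (
            per_sample_ms(batch_ms, batch_size),
            throughput_per_s(batch_ms, batch_size),
        )
        ms, rate = summary[(framework, batch_size)]
        print(f"{framework:9s} batch={batch_size}: {ms:.3f} ms/sample, {rate:.1f} samples/s")
```

Read this way, going from batch 1 to batch 8 costs PyTorch 1.25x the time for 8x the samples, while the TensorRT engine takes 4.25x the time. TensorRT still improves per sample (20 ms to about 10.6 ms), but much less than PyTorch does (40 ms to 6.25 ms).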

Can anybody help me understand why this performance regression is happening?
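The post does not say how the times were measured. GPU work is launched asynchronously, so one thing that may be worth ruling out is a measurement that omits warm-up iterations or a device synchronization before the clock is read. Below is a minimal timing sketch under that assumption; `time_inference`, `run_once` and `synchronize` are hypothetical names, and the stand-in workload is plain CPU code so the block runs anywhere.

```python
import statistics
import time


def time_inference(run_once, synchronize=lambda: None, warmup=10, iters=50):
    """Return the median wall-clock time of run_once() in milliseconds.

    run_once:    callable that performs one inference (hypothetical placeholder).
    synchronize: callable that blocks until queued device work has finished;
                 the default does nothing, which is only correct for CPU code.
    """
    # Warm-up runs are excluded from the measurement.
    for _ in range(warmup):
        run_once()
    synchronize()

    samples_ms = []
    for _ in range(iters):
        start = time.perf_counter()
        run_once()
        synchronize()  # wait for the work to finish before stopping the clock
        samples_ms.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples_ms)


# Stand-in workload so the sketch is runnable without a GPU.
median_ms = time_inference(lambda: sum(range(10_000)), warmup=2, iters=5)
print(f"median: {median_ms:.3f} ms")
```

With PyTorch, `torch.cuda.synchronize` could be passed as `synchronize`; for a TensorRT engine, the equivalent would be synchronizing the CUDA stream the engine runs on. Measuring both frameworks with the same harness keeps the batch 1 and batch 8 numbers comparable.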

I have captured Nsight Systems profiles of the model with batch size 1 and batch size 4:

1 batch inference test: https://drive.google.com/file/d/1achvISpSc1pvlV2RLfSNLxCLlRsZHcnT/view?usp=sharing
4 batch inference test: https://drive.google.com/file/d/1ZuHsO28LIlETNIcWk6lh7miv2Lovco9D/view?usp=sharing

Can anybody with experience in NVIDIA Nsight help me understand these timelines and compare them, so I can find out where and why the performance regression is happening?

Thank you
