About GPU utilization and speed (byteps, open)

bytedance commented on August 22, 2024
About GPU utilization and speed

from byteps.

Comments (5)

ymjiang commented on August 22, 2024

Here are some performance tuning tips: #68 (comment).

For your case, you need to use more servers. For example, you can use 2 physical machines and put 4 server instances on each physical machine (8 server instances in total). Since you have many identical nodes, I think the resources are quite sufficient for you. Note that the scheduler does not affect the training performance, so you can put it anywhere you like (e.g., colocate it with servers).
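
A minimal sketch of the layout described above (2 physical machines, 4 server instances each), assuming the `DMLC_*` environment variables that BytePS reads at launch. The host names, scheduler address, port, and worker count are hypothetical placeholders, and the helper `server_envs` is illustrative, not part of BytePS.

```python
def server_envs(server_hosts, servers_per_host, num_workers,
                scheduler_ip, scheduler_port):
    """Return (host, env) pairs, one per server instance.

    Every instance sees the same totals; only the host differs.
    """
    total_servers = len(server_hosts) * servers_per_host
    env = {
        "DMLC_ROLE": "server",
        "DMLC_NUM_WORKER": str(num_workers),
        "DMLC_NUM_SERVER": str(total_servers),
        # The scheduler can live anywhere, e.g. on one of the server hosts.
        "DMLC_PS_ROOT_URI": scheduler_ip,
        "DMLC_PS_ROOT_PORT": str(scheduler_port),
    }
    return [(host, dict(env))
            for host in server_hosts
            for _ in range(servers_per_host)]


# Hypothetical cluster: two server machines, four worker machines.
plan = server_envs(
    server_hosts=["10.0.0.1", "10.0.0.2"],
    servers_per_host=4,
    num_workers=4,
    scheduler_ip="10.0.0.1",   # colocated with the first server host
    scheduler_port=1234,
)
for host, env in plan:
    print(host, env["DMLC_ROLE"], env["DMLC_NUM_SERVER"])
```

Each environment would be exported on the matching host before starting one server process there. Workers need the same `DMLC_NUM_SERVER` value, so it has to be updated everywhere when servers are added.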

CIDWLY commented on August 22, 2024

No matter how I change the configuration of each node, the computational efficiency does not improve. I checked, and the network only supports 1GBytes. Is communication the bottleneck that limits efficiency? Thank you.
[image: "Edit"]
[image: "Edit 2"]

ymjiang commented on August 22, 2024

It seems that your network bandwidth is too low. In our tests, the bandwidth is about 20Gbps.

Is it possible for you to try a network with higher bandwidth?

bobzhuyb commented on August 22, 2024

@CIDWLY It seems that you only have a 1Gbps network interface card. In that case, BytePS obviously can't help you get more than 1Gbps of network bandwidth.

However, BytePS can help you make better use of the bandwidth you have than other distributed training implementations, such as those based on MPI. If you compare BytePS with, e.g., Horovod, you should see the advantage.
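
To see why the link speed matters so much, here is a rough back-of-envelope estimate, not a BytePS measurement. It assumes each worker has to send one full copy of its gradients per iteration, stored as 32-bit floats, and it ignores protocol overhead and any overlap with computation. The function name and the example parameter count (about 25.6 million, roughly ResNet-50) are illustrative choices, not figures from this thread.

```python
def min_transfer_seconds(num_params, link_gbps, bytes_per_param=4):
    """Lower bound on the time to move one copy of the gradients over the link."""
    bits = num_params * bytes_per_param * 8
    return bits / (link_gbps * 1e9)


# Illustrative model size: about 25.6 million parameters.
params = 25_600_000
slow = min_transfer_seconds(params, link_gbps=1)    # 1Gbps NIC
fast = min_transfer_seconds(params, link_gbps=20)   # 20Gbps, as in the tests mentioned above
print(f"1Gbps: {slow:.3f} s per iteration, 20Gbps: {fast:.3f} s per iteration")
```

Under these assumptions the 1Gbps link needs roughly 0.8 s per iteration just for one direction of traffic, against about 0.04 s at 20Gbps. If the GPU finishes its forward and backward pass faster than that, it will sit idle waiting for the network.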

CIDWLY commented on August 22, 2024

Upgrading our bandwidth is difficult. As far as you know, which models have few enough parameters to make full use of the GPUs in our current experimental environment? Or is there a parameter compression method available in BytePS?

Thank you
