I don't see any parameter allowing the user to specify a remote DeepSpeed server to ta


Question: How to query a remote DeepSpeed server? (deepspeed-mii, open issue)

microsoft commented on July 21, 2024

Question: How to query a remote DeepSpeed server?

from deepspeed-mii.

Comments (4)

rusenask commented on July 21, 2024

Hi @mrwyattii, I think the problem with the grpc client is this

DeepSpeed-MII/mii/server_client.py

Line 70 in 79b56af

assert self.num_gpus > 0, "GPU count must be greater than 0"

This checks whether a GPU is available, but the code runs on the client side, so it shouldn't care :) Commenting that line out makes it work with the remote GRPC server.
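To illustrate the point, here is a minimal sketch of keeping the GPU check on the server side only. This is not MII's actual code: the `Endpoint`, `Server`, and `Client` classes, the address, and the port number are all hypothetical names chosen for the example. Only the assertion message is taken from the line quoted above.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    """Hypothetical host/port pair a client would connect to."""
    host: str = "localhost"
    port: int = 50050  # arbitrary example port


class Server:
    """Hypothetical server side: it runs the model, so it validates GPUs."""

    def __init__(self, endpoint: Endpoint, num_gpus: int) -> None:
        assert num_gpus > 0, "GPU count must be greater than 0"
        self.endpoint = endpoint
        self.num_gpus = num_gpus


class Client:
    """Hypothetical client side: it only needs to know where to connect."""

    def __init__(self, endpoint: Endpoint) -> None:
        # No GPU check here: the client machine may have no GPU at all.
        self.target = f"{endpoint.host}:{endpoint.port}"


client = Client(Endpoint(host="10.0.0.5", port=50050))
print(client.target)
```

With this split, a machine without a GPU can still build a client for a remote server, while starting a server with `num_gpus=0` still fails early.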


mrwyattii commented on July 21, 2024

If I understand, you are trying to stand up a DeepSpeed-MII GRPC server and then send queries to that server remotely. Is that correct?

As for the second question, we do not currently support loading/unloading models at query time. You might be able to achieve this by using mii.terminate(old_deployment_name) and then mii.deploy(new_deployment_name, ...) when you detect that the query you are running does not match the current deployment.
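The terminate-then-deploy workaround described above could be sketched roughly as follows. This is an assumption-laden illustration, not an MII feature: `DeploymentSwitcher` is a hypothetical helper, and the two callables are stand-ins so the sketch runs without MII or a GPU. In a real setup they would wrap `mii.terminate(...)` and `mii.deploy(...)`, whose exact arguments depend on the MII version in use.

```python
from typing import Callable, List, Optional, Tuple


class DeploymentSwitcher:
    """Hypothetical helper that keeps at most one deployment alive."""

    def __init__(
        self,
        deploy: Callable[[str], None],
        terminate: Callable[[str], None],
    ) -> None:
        self._deploy = deploy
        self._terminate = terminate
        self.current: Optional[str] = None

    def ensure(self, deployment_name: str) -> bool:
        """Make deployment_name active. Returns True if a swap happened."""
        if self.current == deployment_name:
            return False  # already running, nothing to do
        if self.current is not None:
            self._terminate(self.current)  # shut down the old deployment
        self._deploy(deployment_name)  # start the requested one
        self.current = deployment_name
        return True


# Stand-in callables that only record what would have been called.
calls: List[Tuple[str, str]] = []
switcher = DeploymentSwitcher(
    deploy=lambda name: calls.append(("deploy", name)),
    terminate=lambda name: calls.append(("terminate", name)),
)

switcher.ensure("model-a")
switcher.ensure("model-a")  # same deployment: no terminate/deploy
switcher.ensure("model-b")  # different deployment: terminate, then deploy
print(calls)
```

Each swap pays the full cost of tearing down one deployment and loading another, so this only makes sense when queries for different models are infrequent.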


Thytu commented on July 21, 2024

> If I understand, you are trying to stand up a DeepSpeed-MII GRPC server and then send queries to that server remotely. Is that correct?

In my understanding DeepSpeed is both an optimisation solution and an inference engine (cf).

Is there a way, using DeepSpeed-MII, to run the DeepSpeed engine on a remote server and tell the client which host/port to target?

> As for the second question, we do not currently support loading/unloading models at query time. You might be able to achieve this by using mii.terminate(old_deployment_name) and then mii.deploy(new_deployment_name, ...) when you detect that the query you are running does not match the current deployment.

Do you plan to handle this feature? I would be happy to help to implement this feature if needed 😉
(Using mii.terminate(old_deployment_name) and then mii.deploy(new_deployment_name, ...) would be really slow)


mrwyattii commented on July 21, 2024

Sorry for the late reply on this. There is a way to run MII on a remote server and have a client send queries and receive responses, but this functionality is currently limited to AML deployments. You could probably adapt what we are doing in the AML deployment Docker image to achieve this yourself:

DeepSpeed-MII/mii/aml_related/templates.py

Line 105 in 747072b

dockerfile = \

I realize you have asked about providing a Docker image for MII previously (#83). I've been working on automatically generating up-to-date images that we can share on DockerHub and on Azure Marketplace to make deploying MII easier. I think we could also bundle some things from our AML docker builds to enable this remote server capability on non-AML deployments.

> Do you plan to handle this feature? I would be happy to help to implement this feature if needed 😉

We currently do not have plans to add this feature, but we are always open to outside contributions. Is the main goal here to have a persistent GRPC server that can swap which model it's running on the fly?
