Comments (5)
When running python mii-sd.py:
a_build
raise RuntimeError(message) from e
RuntimeError: Error building extension 'transformer_inference'
[2022-11-27 11:35:16,846] [INFO] [launch.py:318:sigkill_handler] Killing subprocess 6581
[2022-11-27 11:35:16,846] [ERROR] [launch.py:324:sigkill_handler] ['/opt/conda/bin/python', '-m', 'mii.launch.multi_gpu_server', '--task-name', 'text-to-image', '--model', 'CompVis/stable-diffusion-v1-4', '--model-path', '/tmp/mii_models', '--port', '50050', '--ds-optimize', '--provider', 'diffusers', '--config', 'eyJ0ZW5zb3JfcGFyYWxsZWwiOiAxLCAicG9ydF9udW1iZXIiOiA1MDA1MCwgImR0eXBlIjogImZwMTYiLCAiZW5hYmxlX2N1ZGFfZ3JhcGgiOiBmYWxzZSwgImNoZWNrcG9pbnRfZGljdCI6IG51bGwsICJkZXBsb3lfcmFuayI6IFswXSwgInRvcmNoX2Rpc3RfcG9ydCI6IDI5NTAwLCAiaGZfYXV0aF90b2tlbiI6ICJoZl9Xc0NwVWFFYVhMbGtEZEtLTkVtS2NxZk9vTHBjcWxXWHF5IiwgInJlcGxhY2Vfd2l0aF9rZXJuZWxfaW5qZWN0IjogdHJ1ZSwgInByb2ZpbGVfbW9kZWxfdGltZSI6IGZhbHNlLCAic2tpcF9tb2RlbF9jaGVjayI6IGZhbHNlfQ=='] exits with return code = 1
[2022-11-27 11:35:18,791] [INFO] [server_client.py:117:_wait_until_server_is_live] waiting for server to start...
Traceback (most recent call last):
  File "/home/ec2-user/DeepSpeed-MII/examples/benchmark/txt2img/mii-sd.py", line 15, in <module>
    mii.deploy(task='text-to-image',
  File "/opt/conda/lib/python3.9/site-packages/mii/deployment.py", line 114, in deploy
    return _deploy_local(deployment_name, model_path=model_path)
  File "/opt/conda/lib/python3.9/site-packages/mii/deployment.py", line 120, in _deploy_local
    mii.utils.import_score_file(deployment_name).init()
  File "/tmp/mii_cache/sd_deploy/score.py", line 29, in init
    model = mii.MIIServerClient(task,
  File "/opt/conda/lib/python3.9/site-packages/mii/server_client.py", line 92, in __init__
    self._wait_until_server_is_live()
  File "/opt/conda/lib/python3.9/site-packages/mii/server_client.py", line 115, in _wait_until_server_is_live
    raise RuntimeError("server crashed for some reason, unable to proceed")
RuntimeError: server crashed for some reason, unable to proceed
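For anyone debugging a similar crash: the value passed to --config in the launch command above is base64-encoded JSON, so it can be decoded to see which settings the server was started with. Below is a minimal sketch using only the standard library. The helper name decode_mii_config and the "<redacted>" placeholder are made up for illustration and are not part of MII; the only key name taken from the log is hf_auth_token, which is masked so the token is not pasted into a public thread.

```python
import base64
import json


def decode_mii_config(encoded, redact=("hf_auth_token",)):
    """Decode a base64-encoded JSON config string and mask sensitive keys.

    Hypothetical helper for inspecting the --config argument seen in the log;
    it is not an MII API.
    """
    config = json.loads(base64.b64decode(encoded).decode("utf-8"))
    for key in redact:
        # Only mask keys that are present and non-empty.
        if config.get(key):
            config[key] = "<redacted>"
    return config


if __name__ == "__main__":
    # Build a small sample instead of reusing the real value from the log.
    sample = base64.b64encode(
        json.dumps(
            {"tensor_parallel": 1, "dtype": "fp16", "hf_auth_token": "hf_example"}
        ).encode("utf-8")
    ).decode("ascii")
    print(decode_mii_config(sample))
```

Passing the string from the log to the same function should show the dtype, port and kernel-injection settings the failing server was launched with.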
from deepspeed-mii.
OK, I've installed the latest Deep Learning AMI with CUDA 11.7.
Now I get the following when running python mii-sd.py:
/opt/conda/envs/pytorch/lib/python3.9/site-packages/deepspeed/ops/csrc/transformer/inference/csrc/apply_rotary_pos_emb.cu:8:10: fatal error: cuda_profiler_api.h: No such file or directory
#include <cuda_profiler_api.h>
^~~~~~~~~~~~~~~~~~~~~
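The error above means the compiler could not find cuda_profiler_api.h on its include path while building the DeepSpeed extension. A quick way to check whether the header is installed at all is to look in the usual include directories. This is only a sketch: the function name find_cuda_header is hypothetical, and the searched locations (the CUDA_HOME, CUDA_PATH and CONDA_PREFIX environment variables plus /usr/local/cuda) are assumptions about a typical setup, not something confirmed for this AMI.

```python
import os
from pathlib import Path


def find_cuda_header(header="cuda_profiler_api.h", roots=None):
    """Return every <root>/include/<header> that exists under the given roots.

    The default roots are assumptions about common install locations.
    """
    if roots is None:
        roots = [
            os.environ.get("CUDA_HOME"),
            os.environ.get("CUDA_PATH"),
            os.environ.get("CONDA_PREFIX"),
            "/usr/local/cuda",
        ]
    found = []
    for root in roots:
        if not root:
            continue  # skip environment variables that are not set
        candidate = Path(root) / "include" / header
        if candidate.is_file() and candidate not in found:
            found.append(candidate)
    return found


if __name__ == "__main__":
    matches = find_cuda_header()
    if matches:
        for path in matches:
            print("found:", path)
    else:
        print("cuda_profiler_api.h not found in the searched locations")
```

If nothing is found, the toolkit headers are probably missing from the environment; if the header exists somewhere else, the build is likely picking up a different CUDA installation than expected.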
I've switched to a different AMI with PyTorch 1.12 and CUDA 11.6,
and now I get the following error:
Time to load spatial_inference op: 17.237044095993042 seconds
**** found and replaced unet w. <class 'deepspeed.model_implementations.diffusers.unet.DSUNet'>
About to start server
Started
[2022-11-27 13:35:10,519] [INFO] [server_client.py:117:_wait_until_server_is_live] waiting for server to start...
[2022-11-27 13:35:15,524] [INFO] [server_client.py:117:_wait_until_server_is_live] waiting for server to start...
[2022-11-27 13:35:15,524] [INFO] [server_client.py:118:_wait_until_server_is_live] server has started on 50050
Traceback (most recent call last):
  File "/home/ec2-user/DeepSpeed-MII/examples/benchmark/txt2img/mii-sd.py", line 23, in <module>
    results = pipe.query(prompts)
  File "/opt/conda/envs/pytorch/lib/python3.9/site-packages/mii/server_client.py", line 367, in query
    response = self.asyncio_loop.run_until_complete(
  File "/opt/conda/envs/pytorch/lib/python3.9/asyncio/base_events.py", line 647, in run_until_complete
    return future.result()
  File "/opt/conda/envs/pytorch/lib/python3.9/site-packages/mii/server_client.py", line 263, in _query_in_tensor_parallel
    await responses[0]
  File "/opt/conda/envs/pytorch/lib/python3.9/site-packages/mii/server_client.py", line 313, in _request_async_response
    response = await self.stubs[stub_id].Txt2ImgReply(req)
  File "/opt/conda/envs/pytorch/lib/python3.9/site-packages/grpc/aio/_call.py", line 290, in __await__
    raise _create_rpc_error(self._cython_call._initial_metadata,
grpc.aio._call.AioRpcError: <AioRpcError of RPC that terminated with:
status = StatusCode.UNKNOWN
details = "Exception calling application: 'DSUNet' object has no attribute 'config'"
debug_error_string = "UNKNOWN:Error received from peer ipv6:%5B::1%5D:50050 {grpc_message:"Exception calling application: 'DSUNet' object has no attribute 'config'", grpc_status:2, created_time:"2022-11-27T13:35:15.530649601+00:00"}"
This was resolved recently. Please see #112 (comment)
Please reopen if this issue is still not resolved.