xorbitsai / xoscar
Python actor framework for heterogeneous computing.
Home Page: https://xoscar.dev
License: Apache License 2.0
After running pip install xoscar on a Mac, importing xoscar.collective fails with:
ImportError: dlopen(/Users/liuyibin/miniconda3/lib/python3.10/site-packages/xoscar/collective/xoscar_pygloo.cpython-310-darwin.so, 0x0002): symbol not found in flat namespace '_uv_async_init'
To help us to reproduce this bug, please provide information below:
A clear and concise description of what you expected to happen.
Add any other context about the problem here.
n_io_process is not visible from the Xoscar side; it belongs to Xorbits, so it should be removed from this repo.
Note that the issue tracker is NOT the place for general support. For
discussions about development, questions about usage, or any general questions,
contact us on https://discuss.xorbits.io/.
We can follow Ray's example and use spdlog (https://github.com/gabime/spdlog).
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
Xoscar needs a way to be started from the command line.
A clear and concise description of what you want to happen.
We can inherit the option from Xorbits.
CUDA_VISIBLE_DEVICES is frequently used for CUDA device management. Currently, this environment variable can only be assigned when setting up actor pools, which results in a fixed CUDA device allocation.
Nonetheless, there are situations where dynamic CUDA device allocation is beneficial. For instance, when performing inference on models of varying sizes, the required number of CUDA devices may differ. To use CUDA devices efficiently, it is advantageous to determine CUDA_VISIBLE_DEVICES at runtime based on the model size.
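Choosing CUDA_VISIBLE_DEVICES at runtime could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names devices_for_model and visible_devices_value, the per-device memory figure, and the list of free device IDs are all hypothetical and not part of Xoscar's API.

```python
import math
import os


def devices_for_model(model_size_gb, per_device_gb, free_device_ids):
    """Return the device IDs needed to hold a model of the given size.

    Hypothetical helper: assumes every device offers `per_device_gb` of
    usable memory and that `free_device_ids` lists unassigned devices.
    """
    needed = max(1, math.ceil(model_size_gb / per_device_gb))
    if needed > len(free_device_ids):
        raise RuntimeError(
            f"need {needed} devices, only {len(free_device_ids)} free"
        )
    return free_device_ids[:needed]


def visible_devices_value(device_ids):
    """Format device IDs the way CUDA_VISIBLE_DEVICES expects, e.g. '0,1'."""
    return ",".join(str(i) for i in device_ids)


# Example: a 30 GB model on 16 GB devices needs two of the free devices.
chosen = devices_for_model(30, 16, [0, 1, 2, 3])
# The variable has to be set before the process initializes CUDA.
os.environ["CUDA_VISIBLE_DEVICES"] = visible_devices_value(chosen)
```

The point of the sketch is that the value is computed per model rather than fixed when the pool is created.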
Support adding and removing subpools.
I've started playing with the library. Please correct me if I'm wrong, but state should not change in a StatelessActor, so the following ideally should not work, or should at least warn at runtime.
import asyncio
import xoscar as xo
import nest_asyncio

nest_asyncio.apply()


class Counter(xo.StatelessActor):  # <-- intentional, to check if there's at least a runtime warning!
    count = 0

    def inc(self):
        self.count += 1
        print(self.count)


async def main():
    address = "localhost:9999"
    await xo.create_actor_pool(address=address, n_process=1)
    actor = await xo.create_actor(
        Counter,
        address=address,
        uid="1",
    )
    tasks = [actor.inc() for _ in range(10)]
    await asyncio.gather(*tasks)
    await xo.destroy_actor(actor)


asyncio.run(main())
Output:
1
2
3
4
5
6
7
8
9
10
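One way such a runtime warning could work is sketched below in plain Python. It does not use or reflect Xoscar's internals: WarnOnMutation is a hypothetical base class that warns whenever an instance attribute is assigned.

```python
import warnings


class WarnOnMutation:
    """Hypothetical base class: warn when instance state is modified."""

    def __setattr__(self, name, value):
        warnings.warn(
            f"{type(self).__name__}.{name} was modified; "
            "stateless objects should not keep state",
            RuntimeWarning,
            stacklevel=2,
        )
        super().__setattr__(name, value)


class Counter(WarnOnMutation):
    count = 0

    def inc(self):
        # `self.count += 1` reads the class attribute, then assigns an
        # instance attribute, which goes through __setattr__ above.
        self.count += 1
        return self.count


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    counter = Counter()
    result = counter.inc()

warning_categories = [w.category for w in caught]
```

The assignment still succeeds here; the sketch only makes the mutation visible.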
A clear and concise description of what the bug is.
tuptools\__init__.py", line 16, in <module>
    import setuptools.version
  File "C:\Users\AppData\Local\Temp\pip-build-env-5c2u67or\overlay\Lib\site-packages\setuptools\version.py", line 1, in <module>
    import pkg_resources
  File "C:\Users\AppData\Local\Temp\pip-build-env-5c2u67or\overlay\Lib\site-packages\pkg_resources\__init__.py", line 2191, in <module>
    register_finder(pkgutil.ImpImporter, find_on_path)
                    ^^^^^^^^^^^^^^^^^^^
AttributeError: module 'pkgutil' has no attribute 'ImpImporter'. Did you mean: 'zipimporter'?
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
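The AttributeError above is consistent with pkgutil.ImpImporter having been removed in Python 3.12, while older pkg_resources releases still reference it. A small check, as a sketch (the helper name has_imp_importer is made up for illustration):

```python
import pkgutil
import sys


def has_imp_importer():
    """Return True if this interpreter still provides pkgutil.ImpImporter."""
    return hasattr(pkgutil, "ImpImporter")


version = f"{sys.version_info.major}.{sys.version_info.minor}"
if has_imp_importer():
    message = f"Python {version}: pkgutil.ImpImporter is available"
else:
    message = (
        f"Python {version}: pkgutil.ImpImporter is missing; "
        "a newer setuptools is likely needed in the build environment"
    )
print(message)
```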
Version 0.0.5 creates its socket file under TMPDIR.
On Mac, however, TMPDIR looks like /var/folders/k5/bxy40w394cz_z73k7f69bc2h0000gn/T/, which is too long for uvloop to create a socket file.
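The length problem can be checked up front. The sketch below assumes the usual size of sockaddr_un.sun_path (104 bytes on macOS, 108 on Linux, including the terminating NUL). The helpers fits_in_sun_path and pick_socket_dir, the padded file name, and the /tmp fallback are illustrative assumptions, not what Xoscar does.

```python
import os
import sys

# Size of sockaddr_un.sun_path, including the trailing NUL byte.
SUN_PATH_LIMIT = 104 if sys.platform == "darwin" else 108


def fits_in_sun_path(path, limit=SUN_PATH_LIMIT):
    """Return True if `path` is short enough for a Unix socket address."""
    return len(os.fsencode(path)) < limit


def pick_socket_dir(filename, candidates):
    """Return the first directory where `filename` gives a usable path."""
    for directory in candidates:
        if fits_in_sun_path(os.path.join(directory, filename)):
            return directory
    raise OSError(f"no candidate directory is short enough for {filename!r}")


# Hypothetical socket file name, padded so the long TMPDIR overflows.
name = "xoscar_" + "0" * 60 + ".sock"
mac_tmpdir = "/var/folders/k5/bxy40w394cz_z73k7f69bc2h0000gn/T/"
chosen_dir = pick_socket_dir(name, [mac_tmpdir, "/tmp"])
```

With these inputs the Mac-style TMPDIR is rejected and the shorter /tmp is chosen.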
Collective communication is widely used in deep learning workloads, so we can add support for it.
test_copy_to_file_objects sometimes fails in CI.
__________________________ test_copy_to_file_objects ___________________________

    @pytest.mark.asyncio
    async def test_copy_to_file_objects():
        start_method = (
            os.environ.get("POOL_START_METHOD", "forkserver")
            if sys.platform != "win32"
            else None
        )
        pool = await create_actor_pool(
            "127.0.0.1",
            pool_cls=MainActorPool,
            n_process=2,
            subprocess_start_method=start_method,
        )

        d = tempfile.mkdtemp()
        async with pool:
            ctx = get_context()

            # actor on main pool
            actor_ref1 = await ctx.create_actor(
                FileobjTransferActor,
                uid="test-1",
                address=pool.external_address,
                allocate_strategy=ProcessIndex(1),
            )
            actor_ref2 = await ctx.create_actor(
                FileobjTransferActor,
                uid="test-2",
                address=pool.external_address,
                allocate_strategy=ProcessIndex(2),
            )
            sizes = [10 * 1024**2, 3 * 1024**2, 0.5 * 1024**2, 0.25 * 1024**2]
            names = []
            for _ in range(2 * len(sizes)):
                _, p = tempfile.mkstemp(dir=d)
                names.append(p)
>           await actor_ref1.copy_data(actor_ref2, names[::2], names[1::2], sizes=sizes)

xoscar/backends/test/tests/test_transfer.py:293:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
xoscar/backends/context.py:227: in send
    return self._process_result_message(result)
xoscar/backends/context.py:102: in _process_result_message
    raise message.as_instanceof_cause()
xoscar/backends/pool.py:657: in send
    result = await self._run_coro(message.message_id, coro)
xoscar/backends/pool.py:368: in _run_coro
    return await coro
xoscar/api.py:306: in __on_receive__
    return await super().__on_receive__(message)  # type: ignore
xoscar/core.pyx:527: in __on_receive__
    raise ex
xoscar/core.pyx:497: in xoscar.core._BaseActor.__on_receive__
    async with self._lock:
xoscar/core.pyx:498: in xoscar.core._BaseActor.__on_receive__
    with debug_async_timeout('actor_lock_timeout',
xoscar/core.pyx:503: in xoscar.core._BaseActor.__on_receive__
    result = await result
xoscar/backends/test/tests/test_transfer.py:239: in copy_data
    fobj.write(np.random.bytes(size))
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

>   ???
E   TypeError: [address=127.0.0.1:45079, pid=6288] 'float' object cannot be interpreted as an integer
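The TypeError appears to come from a float reaching a call that needs an integer byte count: 0.5 * 1024**2 and 0.25 * 1024**2 in sizes are floats. Below is a minimal sketch of the failure and one possible fix. It uses os.urandom from the standard library as a stand-in for np.random.bytes, an assumption made only to keep the example dependency-free.

```python
import os

MIB = 1024**2

# Same values as in the test; the last two entries are floats.
float_sizes = [10 * MIB, 3 * MIB, 0.5 * MIB, 0.25 * MIB]

try:
    os.urandom(float_sizes[2])
    float_error = None
except TypeError as exc:
    # A float byte count is rejected even though its value is integral.
    float_error = exc

# One possible fix: keep every size an int by using integer division.
int_sizes = [10 * MIB, 3 * MIB, MIB // 2, MIB // 4]
chunks = [os.urandom(size) for size in int_sizes]
```

Wrapping each size in int() at the call site would be an alternative with the same effect.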
Ref: https://github.com/giampaolo/pysendfile/blob/master/test/benchmark.py
Traceback (most recent call last):
  File "/Users/codingl2k1/.pyenv/versions/3.11.4/lib/python3.11/site-packages/xoscar/backends/pool.py", line 1402, in monitor_sub_pools
    await self.recover_sub_pool(address)
  File "/Users/codingl2k1/.pyenv/versions/3.11.4/lib/python3.11/site-packages/xoscar/backends/indigen/pool.py", line 329, in recover_sub_pool
    for _, message in self._allocated_actors[address].values():
RuntimeError: dictionary changed size during iteration
To help us to reproduce this bug, please provide information below:
Not easy to reproduce this error.
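Python raises this error when a dict gains or loses keys while it is being iterated. The sketch below reproduces it in isolation and shows a common remedy, iterating over a snapshot. The dict contents are made up and this is not the actual recover_sub_pool code; whether a snapshot is the right fix there depends on what modifies the dict concurrently, which is not known from the traceback.

```python
def make_allocated():
    """Build a made-up stand-in for one address's allocated actors."""
    return {
        "actor-1": ("ref-1", "msg-1"),
        "actor-2": ("ref-2", "msg-2"),
    }


# Adding a key while iterating over the live view raises RuntimeError.
allocated = make_allocated()
try:
    for _, message in allocated.values():
        allocated[f"new-{message}"] = ("ref", "msg")
    error = None
except RuntimeError as exc:
    error = str(exc)

# Iterating over a snapshot (list(...)) tolerates changes to the dict.
allocated = make_allocated()
seen = []
for _, message in list(allocated.values()):
    allocated.pop("actor-2", None)
    seen.append(message)
```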
Related #22
As there is some C++ code in this project, I hope a "build from source" section can be added to the docs or README. Thanks!
The Xinference unit test test_opt 4-bit can reproduce this error.
I ran pip install xoscar on a new Linux machine and found that there is no pygloo-related .so file under the xoscar/collective directory.
cc @YibinLiu666
Sometimes, this log happens: