
afar's People

Contributors

eriknw


afar's Issues

Error when using `%%time` Jupyter magic

I found that when I run the following snippet in a JupyterLab cell:

%%time

import afar

with afar.run, remotely:
    import dask.array as da
    x = da.arange(10)
    result = x.sum().compute()    

I get the following error when afar attempts to inspect the current frame:

---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
<timed exec> in <module>

~/mambaforge/envs/coiled-jrbourbeau-parquet-demo/lib/python3.9/site-packages/afar/core.py in __enter__(self)
    125                 raise RuntimeError("uh oh!")
    126             self.data = {}
--> 127         lines, offset = inspect.findsource(self._frame)
    128 
    129         while not lines[with_lineno].lstrip().startswith("with"):

~/mambaforge/envs/coiled-jrbourbeau-parquet-demo/lib/python3.9/inspect.py in findsource(object)
    833         lines = linecache.getlines(file)
    834     if not lines:
--> 835         raise OSError('could not get source code')
    836 
    837     if ismodule(object):

OSError: could not get source code

When I remove the %%time magic, no error is raised.
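A likely cause: %%time compiles the cell with the pseudo-filename `<timed exec>` (visible in the traceback above), which is never registered in linecache, so `inspect.findsource` cannot recover the source for frames created there. A minimal reproduction outside Jupyter, with no afar involved:

```python
import inspect
import sys

# Compile and run code under the same pseudo-filename %%time uses.
# linecache has no entry for "<timed exec>", so findsource fails.
code = compile("frame = sys._getframe()", "<timed exec>", "exec")
ns = {"sys": sys}
exec(code, ns)

try:
    inspect.findsource(ns["frame"])
    error = None
except OSError as exc:
    error = str(exc)

print(error)  # → could not get source code
```

This matches the failure at `inspect.py` line 835 in the traceback: `linecache.getlines("<timed exec>")` returns an empty list, and `findsource` raises `OSError('could not get source code')`.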

Large object warning

When using

import afar

with afar.run, remotely:
    # Dask things here

today, I ended up triggering Dask's "Large object of size XXX detected in task graph" warning message (shown below):

/Users/james/projects/dask/distributed/distributed/worker.py:3862: UserWarning: Large object of size 1.00 MiB detected in task graph: 
  (<afar.core.MagicFunction object at 0x168b1c610>, ('result',), {})
Consider scattering large objects ahead of time
with client.scatter to reduce scheduler burden and 
keep data on workers

    future = client.submit(func, big_data)    # bad

    big_future = client.scatter(big_data)     # good
    future = client.submit(func, big_future)  # good
  warnings.warn(

Functionally this is totally fine and everything ran successfully. However, this big red warning may be scary to users (in particular, those new to Dask).

We might consider:

  1. Scatter the MagicFunction object ahead of time and pass the corresponding Future around
  2. Mention the "Large object of size XXX detected in task graph" warning in the README so folks are at least aware it's (potentially) expected
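A hedged sketch of option 1, using only documented distributed APIs (`Client.scatter` and `Client.submit`); `run_scattered` is a hypothetical helper, not part of afar:

```python
# Ship the (potentially large) callable to a worker up front, so the
# task graph only carries a small key instead of the object itself.
from distributed import Client


def run_scattered(client, func, *args):
    # hash=False avoids tokenizing the callable itself
    [func_future] = client.scatter([func], hash=False)
    # Workers resolve Future arguments back into the original callable
    return client.submit(lambda f, *a: f(*a), func_future, *args)
```

Because workers resolve Future arguments before calling the task, the submitted lambda receives the original callable, and the "Large object detected in task graph" warning should no longer trigger for the function payload.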

Send `print` statements back to the client

I'm using afar to move graph-generation of a large collection onto the cluster, so my slow internet doesn't slow things down so much. I've found myself wanting to print a few statistics about the collection before I compute it.

I could make all this work by reusing futures and maybe returning some strings, but in the spirit of afar's magic, it would be very intuitive if print just... printed.

Perhaps dask/distributed#5217 would make this possible?

EDIT: also must add, afar is amazing! What a joy to use.

IPython magic?

afar's Python API today looks like:

with afar.run, remotely:
    import dask_cudf
    df = dask_cudf.read_parquet("s3://...")
    result = df.sum().compute()

This is great, but the syntax seems a bit magical. Particularly in IPython or Jupyter, using a %%afar magic might be a more natural fit for users:

%%afar

import dask_cudf
df = dask_cudf.read_parquet("s3://...")
result = df.sum().compute()

Perhaps @Carreau could provide some guidance on how we could turn `with afar.run, remotely:` into a magic command? I've got a feeling he's done this a time or two before ;)
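One possible shape for this: the magic could simply re-wrap the cell body in the existing context-manager syntax and re-execute it. A minimal sketch of that transformation (`wrap_afar_cell` is hypothetical, not part of afar):

```python
def wrap_afar_cell(cell: str) -> str:
    """Indent the cell body under `with afar.run, remotely:`."""
    body = "\n".join("    " + line for line in cell.splitlines())
    return "with afar.run, remotely:\n" + body

# Registration would then happen inside an IPython session, e.g.:
#
#     from IPython.core.magic import register_cell_magic
#
#     @register_cell_magic
#     def afar(line, cell):
#         get_ipython().run_cell(wrap_afar_cell(cell))
```

Note this naive approach would reintroduce the source-inspection questions above, since the re-executed cell has a different filename than the original.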

[Bug] Afar not working with cudf/dask-cudf/gpu-numba

When trying to use afar with cudf, it doesn't work. The problem seems to be with numba when using the GPU (numba's JIT by itself has no problems).

Minimal reproducible example:

with afar.run, remotely:
    import cudf
    x = str(cudf)

Calling x.result() raises:

AttributeError: 'LocalPrint' object has no attribute '__name__'

After looking into the traceback with

import traceback

traceback.print_tb(x.traceback())

I get the following:

  File "/opt/conda/envs/rapids/lib/python3.8/site-packages/afar/_core.py", line 337, in run_afar
  File "/opt/conda/envs/rapids/lib/python3.8/site-packages/innerscope/core.py", line 492, in __call__
  File "/opt/conda/envs/rapids/lib/python3.8/site-packages/innerscope/core.py", line 541, in _call
  File "<afar>", line 2, in _afar_magic_
  File "/opt/conda/envs/rapids/lib/python3.8/site-packages/cudf/__init__.py", line 4, in <module>
  File "/opt/conda/envs/rapids/lib/python3.8/site-packages/cudf/utils/gpu_utils.py", line 18, in validate_setup
  File "/opt/conda/envs/rapids/lib/python3.8/site-packages/rmm/__init__.py", line 16, in <module>
  File "/opt/conda/envs/rapids/lib/python3.8/site-packages/rmm/mr.py", line 2, in <module>
  File "/opt/conda/envs/rapids/lib/python3.8/site-packages/rmm/_lib/__init__.py", line 3, in <module>
  File "rmm/_lib/device_buffer.pyx", line 1, in init rmm._lib.device_buffer
  File "rmm/_cuda/stream.pyx", line 26, in init rmm._cuda.stream
  File "/opt/conda/envs/rapids/lib/python3.8/site-packages/numba/__init__.py", line 39, in <module>
  File "/opt/conda/envs/rapids/lib/python3.8/site-packages/numba/core/decorators.py", line 12, in <module>
  File "/opt/conda/envs/rapids/lib/python3.8/site-packages/numba/stencils/stencil.py", line 11, in <module>
  File "/opt/conda/envs/rapids/lib/python3.8/site-packages/numba/core/registry.py", line 4, in <module>
  File "/opt/conda/envs/rapids/lib/python3.8/site-packages/numba/core/dispatcher.py", line 16, in <module>
  File "/opt/conda/envs/rapids/lib/python3.8/site-packages/numba/core/compiler.py", line 6, in <module>
  File "/opt/conda/envs/rapids/lib/python3.8/site-packages/numba/core/callconv.py", line 12, in <module>
  File "/opt/conda/envs/rapids/lib/python3.8/site-packages/numba/core/base.py", line 24, in <module>
  File "/opt/conda/envs/rapids/lib/python3.8/site-packages/numba/cpython/builtins.py", line 511, in <module>
  File "/opt/conda/envs/rapids/lib/python3.8/site-packages/numba/core/typing/builtins.py", line 22, in <module>
  File "/opt/conda/envs/rapids/lib/python3.8/site-packages/numba/core/typing/templates.py", line 1157, in register_global

I'm using numba=0.53, where line 1157 corresponds to https://github.com/numba/numba/blob/release0.53/numba/core/typing/templates.py#L1157-L1159, which tries to get `.__name__` from LocalPrint.

Looking at this old issue numpy/numpy#3112 (comment), it seems that on the numba end (https://github.com/numba/numba/blob/97fe221b3704bd17567b57ea47f4fc6604476cf9/numba/core/typing/templates.py#L1161-L1163) they are not updating the wrapper as suggested in https://docs.python.org/3.3/library/functools.html#functools.update_wrapper, meaning that `__name__` is never assigned to LocalPrint by the time it gets there.

I wonder if there is a way to fix this in afar, maybe in LocalPrint.
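A minimal sketch of one such fix, using a stand-in LocalPrint (not afar's actual class): `functools.update_wrapper` copies `print`'s metadata, including `__name__`, onto the wrapper, so introspection like numba's `register_global` no longer hits the AttributeError:

```python
import functools


class LocalPrint:
    """Stand-in for afar's print wrapper, for illustration only."""

    def __init__(self):
        # Copies __name__, __qualname__, __doc__, etc. from the
        # builtin print onto this instance.
        functools.update_wrapper(self, print)

    def __call__(self, *args, **kwargs):
        print(*args, **kwargs)
```

With this, `LocalPrint().__name__` is `"print"`, which is what numba's `register_global` expects to find.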

Repr of final expression no longer displayed in Jupyter with afar 0.4.0

I enjoyed this feature (as seen in the README!):

with afar.run, remotely:
    three + seven
# displays 10!

But with the newest afar version, I only see

✨ Running afar... ✨

Not much bother, since I can use

with afar.get, remotely:
    result = three + seven
result

but it was handy while available.
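For context, the lost behavior amounts to detecting that the block ends in a bare expression and displaying its repr. A sketch of that detection (not afar's actual implementation) using the standard ast module:

```python
import ast


def ends_in_expression(source: str) -> bool:
    """True if the last statement in `source` is a bare expression."""
    body = ast.parse(source).body
    # A bare trailing expression parses as ast.Expr;
    # assignments and other statements do not.
    return bool(body) and isinstance(body[-1], ast.Expr)
```

For example, `ends_in_expression("three + seven")` is True, while `ends_in_expression("result = three + seven")` is False.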

Option to not return Futures

with afar.run, remotely: creates Futures at the end of the context.

Is there a nice way to specify to run the code on a dask worker, and then copy the results locally? In other words, the result is future.result() instead of future.

Options:

  • use a different function/verb, such as with afar.get, remotely:
  • use a different adverb, such as with afar.run, lovingly:
  • use an argument, such as with afar.run(get=True), remotely:

Which is the clearest?
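Whatever the spelling, the distinction being requested is the standard Future-vs-value one, illustrated here with the stdlib concurrent.futures API rather than afar itself:

```python
from concurrent.futures import ThreadPoolExecutor

with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(lambda: 3 + 7)  # what afar.run hands back: a Future
    value = future.result()              # what a "get" variant would hand back

print(value)  # → 10
```

The "get" variant simply blocks until the remote computation finishes and copies the result locally.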

Connection error with TLS

I sometimes get this error when running things on Coiled (or, I suspect, any cluster with TLS set up). When I run things again, it connects just fine.

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-8-512e4f56931d> in <module>
----> 1 df2.result()

~/workspace/distributed/distributed/client.py in result(self, timeout)
    224         if self.status == "error":
    225             typ, exc, tb = result
--> 226             raise exc.with_traceback(tb)
    227         elif self.status == "cancelled":
    228             raise result

/opt/conda/lib/python3.8/site-packages/distributed/protocol/pickle.py in loads()

/opt/conda/lib/python3.8/site-packages/distributed/client.py in __setstate__()

/opt/conda/lib/python3.8/site-packages/distributed/worker.py in get_client()

/opt/conda/lib/python3.8/site-packages/distributed/client.py in __init__()

/opt/conda/lib/python3.8/site-packages/distributed/client.py in start()

/opt/conda/lib/python3.8/site-packages/distributed/utils.py in sync()

/opt/conda/lib/python3.8/site-packages/distributed/utils.py in f()

/opt/conda/lib/python3.8/site-packages/tornado/gen.py in run()

/opt/conda/lib/python3.8/site-packages/distributed/client.py in _start()

/opt/conda/lib/python3.8/site-packages/distributed/client.py in _ensure_connected()

/opt/conda/lib/python3.8/site-packages/distributed/comm/core.py in connect()

/opt/conda/lib/python3.8/asyncio/tasks.py in wait_for()

/opt/conda/lib/python3.8/site-packages/distributed/comm/tcp.py in connect()

/opt/conda/lib/python3.8/site-packages/distributed/comm/tcp.py in _get_connect_args()

/opt/conda/lib/python3.8/site-packages/distributed/comm/tcp.py in _expect_tls_context()

TypeError: TLS expects a `ssl_context` argument of type ssl.SSLContext (perhaps check your TLS configuration?)  Instead got None

I don't have great context here, but I'm running code like this:

with afar.run, remotely:
    df2 = df.set_index("x", shuffle="service2").persist()

When using afar, one worker stops getting tasks

I was running a workflow using afar on Coiled, and I noticed that at one point in the afar version, a worker stopped receiving tasks. Notice that in the task stream of the performance reports, the last thread in the afar version stops getting tasks, while in the non-afar version this doesn't happen. Is this the expected behavior? What is actually happening here?

Note: the data is public so this should work as a reproducible example.

Workflow without afar:

import dask.dataframe as dd
from dask.distributed import performance_report

ddf = dd.read_parquet(
    "s3://coiled-datasets/timeseries/20-years/parquet",
    storage_options={"anon": True, "use_ssl": True},
    split_row_groups=True,
    engine="pyarrow",
)

with performance_report(filename="read_pq_groupby_mean_CPU_pyarrow.html"):
    ddf.groupby('name').x.mean().compute()

Link to performance report

Workflow with afar

%%time
import afar

with afar.run, remotely:
    ddf_cpu = dd.read_parquet(
        "s3://coiled-datasets/timeseries/20-years/parquet",
        storage_options={"anon": True, "use_ssl": True},
        split_row_groups=True,
        engine="pyarrow",
    )

    res = ddf_cpu.groupby('name').x.mean().compute()

with performance_report(filename="read_pq_groupby_mean_CPU_pyarrow_afar.html"):
    res.result()

Link to performance report
