
jupyter-server / pycrdt

CRDTs based on Yrs.

Home Page: https://jupyter-server.github.io/pycrdt

License: MIT License

Languages: Python 58.64%, Rust 41.36%
Topics: crdt, yjs

pycrdt's People

Contributors

davidbrochart, jbdyn, kloczek, patrick91, pre-commit-ci[bot], zsailer

pycrdt's Issues

Text data type does not delete as expected

Description

The Text data type does not delete characters when a slice's start index is equal to the length of the given slice.

Reproduce

from pycrdt import Doc, Text
doc = Doc()
doc["text"] = text = Text()
text += "test"
print(text)   # prints 'test'
del text[2:4] # slice length = 4 - 2 = 2 == start; alternatively `del text[2:]`
print(text)   # prints 'test' again, but should print 'te'
del text[1:4] # slice length = 4 - 1 = 3 != start, alternatively `del text[1:]`
print(text)   # prints 't' as expected
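
For reference, the expected results above follow ordinary Python slice-deletion semantics. The sketch below models the text as a plain list of characters, not as a pycrdt Text; `expected_after_delete` is a hypothetical helper name used only for illustration and is not part of pycrdt.

```python
from typing import Optional


def expected_after_delete(value: str, start: Optional[int], stop: Optional[int]) -> str:
    """Return what `value` should look like after `del text[start:stop]`.

    Uses a plain Python list as the reference model (an assumption of this
    sketch), so it only documents the expected behavior; it does not touch
    pycrdt itself.
    """
    chars = list(value)
    del chars[start:stop]  # built-in slice deletion is the reference behavior
    return "".join(chars)
```

Comparing `str(text)` against this helper after each `del` is one way the report could be turned into a regression test.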

Expose functionality to manage Y.Doc updates without instantiating Y.Doc (diffUpdate, mergeUpdates, encodeStateVectorFromUpdate)

Problem

I am working on a platform that has centralized storage for the YDoc. As the centralized server does not need to know anything about the YDoc, there is no point in instantiating a YDoc object there; I could get away with using only the encodeStateVectorFromUpdate, diffUpdate, and mergeUpdates methods (https://docs.yjs.dev/api/document-updates#example-syncing-clients-without-loading-the-y.doc).

Currently neither https://github.com/y-crdt/ypy nor this project exposes those methods. I think it would be handy if these basic buffer manipulation methods were exposed, so that users could call them directly.

Proposed Solution

Expose the update API (https://github.com/yjs/yjs?tab=readme-ov-file#update-api), in both v1 and v2 formats, as functions that work directly on buffers: mergeUpdates, encodeStateVectorFromUpdate, diffUpdate, convertUpdateFormatV1ToV2, convertUpdateFormatV2ToV1, mergeUpdatesV2, encodeStateVectorFromUpdateV2, diffUpdateV2.
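
To illustrate the use case, here is a minimal sketch of a central store that keeps updates as opaque byte buffers and never builds a document. `UpdateStore` and its methods are hypothetical names, and the `merge` argument is a stand-in for a buffer-level function such as Yjs's mergeUpdates; nothing here is an existing pycrdt or ypy API.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class UpdateStore:
    """Keeps raw update buffers per document without ever decoding them."""

    def __init__(self) -> None:
        self._updates: Dict[str, List[bytes]] = defaultdict(list)

    def push(self, doc_id: str, update: bytes) -> None:
        # The server only appends the buffer; it does not inspect it.
        self._updates[doc_id].append(update)

    def pull(self, doc_id: str) -> List[bytes]:
        # Return a copy so callers cannot mutate the stored list.
        return list(self._updates[doc_id])

    def compact(self, doc_id: str, merge: Callable[[List[bytes]], bytes]) -> None:
        # `merge` is supplied by the caller (assumption: something equivalent
        # to mergeUpdates); the store itself stays format-agnostic.
        updates = self._updates[doc_id]
        if len(updates) > 1:
            self._updates[doc_id] = [merge(updates)]
```

With buffer-level functions exposed, `compact` could be called with the real merge function, and the server would still never need a YDoc instance.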

Additional context

I have been following the "separation" between pycrdt and ypy and I do think that this issue may be more applicable to https://github.com/y-crdt/ypy than this project. Then again, ypy is somewhat unmaintained and I do wonder if this request indeed falls outside of the future of this project.

Better syntax for accessing existing shared types

Problem

Currently, one must bind an empty shared type to Ydoc keys before accessing their values:

    ydoc["cells"] = Array()
    assert ydoc["cells"].to_py() == [{"metadata": {"foo": "bar"}, "source": "1 + 2"}]
    #      ^- not equal to the empty Array() assigned to this key immediately before,
    #         but rather the value coming from another provider 

There are two drawbacks to only supporting this way of accessing a shared type:

  1. This requires 1 additional line of code per shared type for the assignment statement, and can get verbose if one is using many. However, in Yjs, ydoc.getArray(...) can be used inline.

    • The inline assignment operator := (available in Python 3.8+) does not work either:

        /Users/dlq/micromamba/envs/rtcdev/lib/python3.11/ast.py:50: in parse
            return compile(source, filename, mode, flags,
        E     File "/Volumes/workplace/pycrdt-websocket/tests/test_pycrdt_yjs.py", line 92
        E       assert (ydoc["cells"] := Array()).to_py() == [{"metadata": {"foo": "bar"}, "source": "1 + 2"}]
        E               ^^^^^^^^^^^^^
        E   SyntaxError: cannot use assignment expressions with subscript

    • From @davidbrochart:

        BTW you can write doc["my_array"] = my_array = Array() if you want a one-liner.

  2. The syntax requires assigning an empty shared type in order to access an existing, non-empty shared type. This is stateful and confusing, because the value of ydoc["cells"] is not the value of ydoc["cells"] that was just set in the immediately preceding line. This generally violates how most programming languages work. I understand that this is permitted by Python, but the public API should not rely on exotic behavior exclusive to Python.
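
Until a better syntax exists, the one-liner suggested above can be wrapped in a small helper so that access works inline. This is only a sketch: `get_shared` is a hypothetical name, not a pycrdt function, and it assumes (as the one-liner does) that the object assigned to the key is the one to use afterwards.

```python
from typing import Any, Callable, TypeVar

T = TypeVar("T")


def get_shared(doc: Any, key: str, factory: Callable[[], T]) -> T:
    """Bind a fresh shared type to `key` and return it, in one expression.

    Equivalent to `doc[key] = value = factory()`; `doc` only needs to
    support item assignment.
    """
    value = factory()
    doc[key] = value
    return value


# Hypothetical inline usage with pycrdt (not executed here):
#     assert get_shared(ydoc, "cells", Array).to_py() == [...]
```

This removes the extra line per shared type but does not address the second drawback: the helper still assigns an empty type in order to read an existing one.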

Proposed Solution

TBD.

Additional context

Allow nested transactions

Currently, nested transactions are not allowed because that would lead to multiple TransactionMut on a document:

def foo(doc):
    with doc.transaction():
        text = doc.get_text("text")
        text += ", World!"

with doc.transaction():
    text = doc.get_text("text")
    text += "Hello"
    foo(doc)  # will fail

This hurts modularity, for instance if we wanted foo to be used independently. Now foo has to check if there is already a transaction on the document:

def foo(doc, txn=None):
    if txn is None:
        with doc.transaction():
            text = doc.get_text("text")
            text += ", World!"
    else:
        text = doc.get_text("text")
        text += ", World!"

with doc.transaction() as txn:
    text = doc.get_text("text")
    text += "Hello"
    foo(doc, txn)

See an example of such a workaround in jupyter-ydoc. This is not only more complicated, but this doesn't even do what is expected: the changes in foo are "merged" into the parent transaction, which might not be desirable because we wanted them to be grouped into their own transaction.
I think that for nested transactions to work, the context manager should only create the transaction at exit, and make the changes then. This means that every change made in the context manager should be registered first.
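
The idea of registering changes and applying them only at exit can be sketched with a toy document. `ToyDoc`, `append`, and `commits` are hypothetical names for illustration; this models the proposal with plain Python strings and does not use pycrdt or TransactionMut.

```python
from contextlib import contextmanager


class ToyDoc:
    """Toy stand-in for a document: text plus a log of committed groups."""

    def __init__(self):
        self.text = ""
        self.commits = []  # one list of changes per committed transaction
        self._stack = []   # pending change lists, innermost last

    def append(self, piece):
        # Register the change with the innermost open transaction.
        if not self._stack:
            raise RuntimeError("no open transaction")
        self._stack[-1].append(piece)

    @contextmanager
    def transaction(self):
        pending = []
        self._stack.append(pending)
        try:
            yield
        finally:
            self._stack.pop()
        # Reached only if the body did not raise: apply the registered
        # changes now, as one group.
        if pending:
            for piece in pending:
                self.text += piece
            self.commits.append(pending)


def foo(doc):
    with doc.transaction():
        doc.append(", World!")


doc = ToyDoc()
with doc.transaction():
    doc.append("Hello")
    foo(doc)  # no longer fails: only one group is applied at a time
```

In this toy, `foo`'s changes end up in their own group, but because the inner context exits first, `doc.commits` is `[[", World!"], ["Hello"]]` and `doc.text` is `", World!Hello"`. A real design would have to decide how to order deferred changes across nesting levels.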

y-websocket provider

Coming from this issue: y-crdt/ypy#154

I can't find how to synchronize map changes between different Python clients using websocket, with awareness.
Can someone give me an example?

New strategy for document validation

Problem

The current method for validating documents is to keep a copy of the document, apply changes to the copy, and, if the copy is still a valid document, apply the changes to the original document. This is expensive because documents and operations are duplicated.

Proposed Solution

If updates are stored e.g. in a YStore, a better solution could be to always apply changes to the original document, and if it fails validation, create a new document from the stored updates (and not store the last update).
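
A minimal sketch of this strategy, using plain Python lists as stand-ins for the document and the stored updates. `ValidatedDoc` and its attributes are hypothetical names, and "applying an update" is modeled as appending to a list; no pycrdt or YStore API is used.

```python
from typing import Any, Callable, List


class ValidatedDoc:
    """Apply updates in place; rebuild from stored updates on failure."""

    def __init__(self, validate: Callable[[List[Any]], bool]) -> None:
        self.validate = validate
        self.stored: List[Any] = []  # stand-in for updates kept in a store
        self.state: List[Any] = []   # stand-in for the live document

    def _rebuild(self) -> None:
        # Recreate the document from the stored (known-good) updates only.
        self.state = []
        for update in self.stored:
            self.state.append(update)

    def apply(self, update: Any) -> bool:
        # Always apply to the original document first.
        self.state.append(update)
        if self.validate(self.state):
            self.stored.append(update)
            return True
        # Validation failed: do not store the update, and rebuild.
        self._rebuild()
        return False
```

The common case (a valid update) costs a single application and no copy; the rebuild cost is paid only when validation fails.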
