Unpythonic: Python meets Lisp and Haskell

In the spirit of toolz, we provide missing features for Python, mainly from the list processing tradition, but with some Haskellisms mixed in. We extend the language with a set of syntactic macros. We also provide an in-process, background REPL server for live inspection and hot-patching. The emphasis is on clear, pythonic syntax, making features work together, and obsessive correctness.


Some hypertext features of this README, such as local links to the detailed documentation and expandable example highlights, are not supported when viewed on PyPI; view it on GitHub to have them work properly.

Dependencies

None required.

  • mcpyrate optional, to enable the syntactic macro layer, an interactive macro REPL, and some example dialects.

The 0.15.x series should run on CPython 3.6, 3.7, 3.8, 3.9 and 3.10, and PyPy3 (language versions 3.6, 3.7 and 3.8); the CI process verifies the tests pass on those platforms. Long-term support roadmap.

Documentation

The features of unpythonic are built out of, in increasing order of magic:

  • Pure Python (e.g. batteries for itertools),
  • Macros driving a pure-Python core (do, let),
  • Pure macros (e.g. continuations, lazify, dbg),
  • Whole-module transformations, a.k.a. dialects (e.g. Lispython).

Which category each feature belongs to depends on its purpose, as well as on ease-of-use considerations. See the design notes for more information.

Examples

A small, limited-space overview of the overall flavor. There is a lot more that does not fit here, especially in the pure-Python feature set. The examples below are simple, and not necessarily of the most general form the constructs support. See the full documentation and unit tests for more examples.

Unpythonic in 30 seconds: Pure Python

Loop functionally, with tail call optimization.

[docs]

from unpythonic import looped, looped_over

@looped
def result(loop, acc=0, i=0):
    if i == 10:
        return acc
    else:
        return loop(acc + i, i + 1)  # tail call optimized, no call stack blowup.
assert result == 45

@looped_over(range(3), acc=[])
def result(loop, i, acc):
    acc.append(lambda x: i * x)  # fresh "i" each time, no mutation of loop counter.
    return loop()
assert [f(10) for f in result] == [0, 10, 20]
Introduce dynamic variables.

[docs]

from unpythonic import dyn, make_dynvar

make_dynvar(x=42)  # set a default value

def f():
    assert dyn.x == 17
    with dyn.let(x=23):
        assert dyn.x == 23
        g()
    assert dyn.x == 17

def g():
    assert dyn.x == 23

assert dyn.x == 42
with dyn.let(x=17):
    assert dyn.x == 17
    f()
assert dyn.x == 42
Interactively hot-patch your running Python program.

[docs]

To opt in, add just two lines of code to your main program:

from unpythonic.net import server
server.start(locals={})  # automatically daemonic

import time

def main():
    while True:
        time.sleep(1)

if __name__ == '__main__':
    main()

Or if you just want to take this for a test run, start the built-in demo app:

python3 -m unpythonic.net.server

Once a server is running, to connect:

python3 -m unpythonic.net.client 127.0.0.1

This gives you a REPL, inside your live process, with all the power of Python. You can importlib.reload any module, and through sys.modules, inspect or overwrite any name at the top level of any module. You can pickle.dump your data. Or do anything you want with/to the live state of your app.
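
For instance, a session might go like this (myapp.logic is a hypothetical module name; the calls themselves are plain stdlib):

import importlib, sys

import myapp.logic                        # hypothetical module of the running app
importlib.reload(myapp.logic)             # pick up edited source code
sys.modules["myapp.logic"].DEBUG = True   # overwrite a top-level name, live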

You can have multiple REPL sessions connected simultaneously. When your app exits (for any reason), the server automatically shuts down, closing all connections if any remain. But exiting the client leaves the server running, so you can connect again later - that's the whole point.

Optionally, if you have mcpyrate, the REPL sessions support importing, invoking and defining macros.

Industrial-strength scan and fold.

[docs]

Scan and fold accept multiple iterables, like in Racket.

from operator import add
from unpythonic import scanl, foldl, unfold, take, Values

assert tuple(scanl(add, 0, range(1, 5))) == (0, 1, 3, 6, 10)

def op(e1, e2, acc):
    return acc + e1 * e2
assert foldl(op, 0, (1, 2), (3, 4)) == 11

def nextfibo(a, b):
    return Values(a, a=b, b=a + b)
assert tuple(take(10, unfold(nextfibo, 1, 1))) == (1, 1, 2, 3, 5, 8, 13, 21, 34, 55)
Industrial-strength curry.

[docs]

We bind arguments to parameters like Python itself does, so it does not matter whether arguments are passed by position or by name during currying. We support @generic multiple-dispatch functions.

We also feature a Haskell-inspired passthrough system: any args and kwargs that are not accepted by the call signature will be passed through. This is useful when a curried function returns a new function, which is then the target for the passthrough. See the docs for details.

from unpythonic import curry, generic, foldr, composerc, cons, nil, ll

@curry
def f(x, y):
    return x, y

assert f(1, 2) == (1, 2)
assert f(1)(2) == (1, 2)
assert f(1)(y=2) == (1, 2)
assert f(y=2)(x=1) == (1, 2)

@curry
def add3(x, y, z):
    return x + y + z

# actually uses partial application so these work, too
assert add3(1)(2)(3) == 6
assert add3(1, 2)(3) == 6
assert add3(1)(2, 3) == 6
assert add3(1, 2, 3) == 6

@curry
def lispyadd(*args):
    return sum(args)
assert lispyadd() == 0  # no args is a valid arity here

@generic
def g(x: int, y: int):
    return "int"
@generic
def g(x: float, y: float):
    return "float"
@generic
def g(s: str):
    return "str"
g = curry(g)

assert callable(g(1))
assert g(1)(2) == "int"

assert callable(g(1.0))
assert g(1.0)(2.0) == "float"

assert g("cat") == "str"
assert g(s="cat") == "str"

# simple example of passthrough
mymap = lambda f: curry(foldr, composerc(cons, f), nil)
myadd = lambda a, b: a + b
assert curry(mymap, myadd, ll(1, 2, 3), ll(2, 4, 6)) == ll(3, 6, 9)
Multiple-dispatch generic functions, like in CLOS or Julia.

[docs]

from unpythonic import generic

@generic
def my_range(stop: int):  # create the generic function and the first multimethod
    return my_range(0, 1, stop)
@generic
def my_range(start: int, stop: int):  # further registrations add more multimethods
    return my_range(start, 1, stop)
@generic
def my_range(start: int, step: int, stop: int):
    return start, step, stop
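
Dispatch then chains through the multimethods; a quick check (not part of the original example, but it follows from the definitions above):

assert my_range(10) == (0, 1, 10)
assert my_range(2, 10) == (2, 1, 10)
assert my_range(2, 3, 10) == (2, 3, 10)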

This is a purely run-time implementation, so it does not give performance benefits, but it can make code more readable, and it allows modularly adding support for new input types (or different call signatures) to an existing function later.

Holy traits are also a possibility:

import typing
from unpythonic import generic, augment

class FunninessTrait:
    pass
class IsFunny(FunninessTrait):
    pass
class IsNotFunny(FunninessTrait):
    pass

@generic
def funny(x: typing.Any):  # default
    raise NotImplementedError(f"`funny` trait not registered for anything matching {type(x)}")

@augment(funny)
def funny(x: str):  # noqa: F811
    return IsFunny()
@augment(funny)
def funny(x: int):  # noqa: F811
    return IsNotFunny()

@generic
def laugh(x: typing.Any):
    return laugh(funny(x), x)

@augment(laugh)
def laugh(traitvalue: IsFunny, x: typing.Any):
    return f"Ha ha ha, {x} is funny!"
@augment(laugh)
def laugh(traitvalue: IsNotFunny, x: typing.Any):
    return f"{x} is not funny."

assert laugh("that") == "Ha ha ha, that is funny!"
assert laugh(42) == "42 is not funny."
Conditions: resumable, modular error handling, like in Common Lisp.

[docs]

Contrived example:

from unpythonic import error, restarts, handlers, invoke, use_value, unbox

class MyError(ValueError):
    def __init__(self, value):  # We want to act on the value, so save it.
        self.value = value

def lowlevel(lst):
    _drop = object()  # gensym/nonce
    out = []
    for k in lst:
        # Provide several different error recovery strategies.
        with restarts(use_value=(lambda x: x),
                      halve=(lambda x: x // 2),
                      drop=(lambda: _drop)) as result:
            if k > 9000:
                error(MyError(k))
            # This is reached when no error occurs.
            # `result` is a box, send k into it.
            result << k
        # Now the result box contains either k,
        # or the return value of one of the restarts.
        r = unbox(result)  # get the value from the box
        if r is not _drop:
            out.append(r)
    return out

def highlevel():
    # Choose which error recovery strategy to use...
    with handlers((MyError, lambda c: use_value(c.value))):
        assert lowlevel([17, 10000, 23, 42]) == [17, 10000, 23, 42]

    # ...on a per-use-site basis...
    with handlers((MyError, lambda c: invoke("halve", c.value))):
        assert lowlevel([17, 10000, 23, 42]) == [17, 5000, 23, 42]

    # ...without changing the low-level code.
    with handlers((MyError, lambda: invoke("drop"))):
        assert lowlevel([17, 10000, 23, 42]) == [17, 23, 42]

highlevel()

Conditions only shine in larger systems, with restarts set up at multiple levels of the call stack; this example is too small to demonstrate that. The single-level case here could be implemented as an error-handling mode parameter for the example's only low-level function.

With multiple levels, it becomes apparent that this mode parameter must be threaded through the API at each level, unless it is stored as a dynamic variable (see unpythonic.dyn). But then, there can be several types of errors, and the error-handling mode parameters - one for each error type - have to be shepherded in an intricate manner. A stack is needed, so that an inner level may temporarily override the handler for a particular error type...

The condition system is the clean, general solution to this problem. It automatically scopes handlers to their dynamic extent, and manages the handler stack automatically. In other words, it dynamically binds error-handling modes (for several types of errors, if desired) in a controlled, easily understood manner. The local programmability (i.e. the fact that a handler is not just a restart name, but an arbitrary function) is a bonus for additional flexibility.

If this sounds a lot like an exception system, that's because conditions are the supercharged sister of exceptions. The condition model cleanly separates mechanism from policy, while otherwise remaining similar to the exception model.

Lispy symbol type.

[docs]

Roughly, a symbol is a guaranteed-interned string.

A gensym is a guaranteed-unique string, which is useful as a nonce value. It's similar to the pythonic idiom nonce = object(), but with a nice repr, and object-identity-preserving pickle support.

from unpythonic import sym  # lispy symbol
sandwich = sym("sandwich")
hamburger = sym("sandwich")  # symbol's identity is determined by its name, only
assert hamburger is sandwich

assert str(sandwich) == "sandwich"  # symbols have a nice str()
assert repr(sandwich) == 'sym("sandwich")'  # and eval-able repr()
assert eval(repr(sandwich)) is sandwich

from pickle import dumps, loads
pickled_sandwich = dumps(sandwich)
unpickled_sandwich = loads(pickled_sandwich)
assert unpickled_sandwich is sandwich  # symbols survive a pickle roundtrip

from unpythonic import gensym  # gensym: make new uninterned symbol
tabby = gensym("cat")
scottishfold = gensym("cat")
assert tabby is not scottishfold

pickled_tabby = dumps(tabby)
unpickled_tabby = loads(pickled_tabby)
assert unpickled_tabby is tabby  # also gensyms survive a pickle roundtrip
Lispy data structures.

[docs for box] [docs for cons] [docs for frozendict]

from unpythonic import box, unbox  # mutable single-item container
cat = object()
cardboardbox = box(cat)
assert cardboardbox is not cat  # the box is not the cat
assert unbox(cardboardbox) is cat  # but the cat is inside the box
assert cat in cardboardbox  # ...also syntactically
dog = object()
cardboardbox << dog  # hey, it's my box! (replace contents)
assert unbox(cardboardbox) is dog

from unpythonic import cons, nil, ll, llist  # lispy linked lists
lst = cons(1, cons(2, cons(3, nil)))
assert ll(1, 2, 3) == lst  # make linked list out of elements
assert llist([1, 2, 3]) == lst  # convert iterable to linked list

from unpythonic import frozendict  # immutable dictionary
d1 = frozendict({'a': 1, 'b': 2})
d2 = frozendict(d1, c=3, a=4)
assert d1 == frozendict({'a': 1, 'b': 2})
assert d2 == frozendict({'a': 4, 'b': 2, 'c': 3})
Allow a lambda to call itself. Name a lambda.

[docs for withself] [docs for namelambda]

from unpythonic import withself, namelambda

fact = withself(lambda self, n: n * self(n - 1) if n > 1 else 1)  # see @trampolined to do this with TCO
assert fact(5) == 120

square = namelambda("square")(lambda x: x**2)
assert square.__name__ == "square"
assert square.__qualname__ == "square"  # or e.g. "somefunc.<locals>.square" if inside a function
assert square.__code__.co_name == "square"  # used by stack traces
Break infinite recursion cycles.

[docs]

from typing import NoReturn
from unpythonic import fix

@fix()
def a(k):
    return b((k + 1) % 3)
@fix()
def b(k):
    return a((k + 1) % 3)
assert a(0) is NoReturn
Build number sequences by example. Slice general iterables.

[docs for s] [docs for islice]

from unpythonic import s, islice

seq = s(1, 2, 4, ...)
assert tuple(islice(seq)[:10]) == (1, 2, 4, 8, 16, 32, 64, 128, 256, 512)
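
The geometric sequence above is just one of the inferred patterns; arithmetic progressions work the same way (a brief extra illustration, assuming the same API):

evens = s(2, 4, 6, ...)
assert tuple(islice(evens)[:5]) == (2, 4, 6, 8, 10)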
Memoize functions and generators.

[docs for memoize] [docs for gmemoize]

from itertools import count, takewhile
from unpythonic import memoize, gmemoize, islice

ncalls = 0
@memoize  # <-- important part
def square(x):
    global ncalls
    ncalls += 1
    return x**2
assert square(2) == 4
assert ncalls == 1
assert square(3) == 9
assert ncalls == 2
assert square(3) == 9
assert ncalls == 2  # called only once for each unique set of arguments

# "memoize lambda": classic evaluate-at-most-once thunk
thunk = memoize(lambda: print("hi from thunk"))
thunk()  # the message is printed only the first time
thunk()

@gmemoize  # <-- important part
def primes():  # FP sieve of Eratosthenes
    yield 2
    for n in count(start=3, step=2):
        if not any(n % p == 0 for p in takewhile(lambda x: x*x <= n, primes())):
            yield n

assert tuple(islice(primes())[:10]) == (2, 3, 5, 7, 11, 13, 17, 19, 23, 29)
Functional updates.

[docs]

from itertools import repeat
from unpythonic import fup

t = (1, 2, 3, 4, 5)
s = fup(t)[0::2] << repeat(10)
assert s == (10, 2, 10, 4, 10)
assert t == (1, 2, 3, 4, 5)

from itertools import count
from unpythonic import imemoize
t = (1, 2, 3, 4, 5)
s = fup(t)[::-2] << imemoize(count(start=10))()
assert s == (12, 2, 11, 4, 10)
assert t == (1, 2, 3, 4, 5)
Live list slices.

[docs]

from unpythonic import view

lst = list(range(10))
v = view(lst)[::2]  # [0, 2, 4, 6, 8]
v[2:4] = (10, 20)  # re-slicable, still live.
assert lst == [0, 1, 2, 3, 10, 5, 20, 7, 8, 9]

lst[2] = 42
assert v == [0, 42, 10, 20, 8]
Pipes: method chaining syntax for regular functions.

[docs]

from unpythonic import piped, exitpipe

double = lambda x: 2 * x
inc    = lambda x: x + 1
x = piped(42) | double | inc | exitpipe
assert x == 85

The point is usability: in a function composition using pipe syntax, data flows from left to right.
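
For comparison, the equivalent nested call reads inside-out:

assert inc(double(42)) == 85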

Unpythonic in 30 seconds: Language extensions with macros

unpythonic.test.fixtures: a minimalistic test framework for macro-enabled Python.

[docs]

from unpythonic.syntax import macros, test, test_raises, fail, error, warn, the
from unpythonic.test.fixtures import session, testset, terminate, returns_normally

def f():
    raise RuntimeError("argh!")

def g(a, b):
    return a * b
    fail["this line should be unreachable"]

count = 0
def counter():
    global count
    count += 1
    return count

with session("simple framework demo"):
    with testset():
        test[2 + 2 == 4]
        test_raises[RuntimeError, f()]
        test[returns_normally(g(2, 3))]
        test[g(2, 3) == 6]
        # Use `the[]` (or several) in a `test[]` to declare what you want to inspect if the test fails.
        # Implicit `the[]`: in a comparison, the LHS; otherwise the whole expression. Used if there is no explicit `the[]`.
        test[the[counter()] < the[counter()]]

    with testset("outer"):
        with testset("inner 1"):
            test[g(6, 7) == 42]
        with testset("inner 2"):
            test[None is None]
        with testset("inner 3"):  # an empty testset is considered 100% passed.
            pass
        with testset("inner 4"):
            warn["This testset not implemented yet"]

    with testset("integration"):
        try:
            import blargly
        except ImportError:
            error["blargly not installed, cannot test integration with it."]
        else:
            ... # blargly integration tests go here

    with testset(postproc=terminate):
        test[2 * 2 == 5]  # fails, terminating the nearest dynamically enclosing `with session`
        test[2 * 2 == 4]  # not reached

We provide the low-level syntactic constructs test[], test_raises[] and test_signals[], with the usual meanings. The last one is for testing code that uses conditions and restarts; see unpythonic.conditions.

The test macros also come in block variants, with test, with test_raises, with test_signals.

As usual in test frameworks, the testing constructs behave somewhat like assert, with the difference that a failure or error will not abort the whole unit (unless explicitly asked to do so).

let: expression-local variables.

[docs]

from unpythonic.syntax import macros, let, letseq, letrec

x = let[[a << 1, b << 2] in a + b]
y = letseq[[c << 1,  # LET SEQuential, like Scheme's let*
            c << 2 * c,
            c << 2 * c] in
           c]
z = letrec[[evenp << (lambda x: (x == 0) or oddp(x - 1)),  # LET mutually RECursive, like in Scheme
            oddp << (lambda x: (x != 0) and evenp(x - 1))]
           in evenp(42)]
let-over-lambda: stateful functions.

[docs]

from unpythonic.syntax import macros, dlet

# Up to Python 3.8, use `@dlet(x << 0)` instead
@dlet[x << 0]  # let-over-lambda for Python
def count():
    return x << x + 1  # `name << value` rebinds in the let env
assert count() == 1
assert count() == 2
do: code imperatively in any expression position.

[docs]

from unpythonic.syntax import macros, do, local, delete

x = do[local[a << 21],
       local[b << 2 * a],
       print(b),
       delete[b],  # do[] local variables can be deleted, too
       4 * a]
assert x == 84
Automatically apply tail call optimization (TCO), à la Scheme/Racket.

[docs]

from unpythonic.syntax import macros, tco

with tco:
    # expressions are automatically analyzed to detect tail position.
    evenp = lambda x: (x == 0) or oddp(x - 1)
    oddp  = lambda x: (x != 0) and evenp(x - 1)
    assert evenp(10000) is True
Curry automatically, à la Haskell.

[docs]

from unpythonic.syntax import macros, autocurry
from unpythonic import foldr, composerc as compose, cons, nil, ll

with autocurry:
    def add3(a, b, c):
        return a + b + c
    assert add3(1)(2)(3) == 6

    mymap = lambda f: foldr(compose(cons, f), nil)
    double = lambda x: 2 * x
    assert mymap(double, (1, 2, 3)) == ll(2, 4, 6)
Lazy functions, a.k.a. call-by-need.

[docs]

from unpythonic.syntax import macros, lazify

with lazify:
    def my_if(p, a, b):
        if p:
            return a  # b never evaluated in this code path
        else:
            return b  # a never evaluated in this code path
    assert my_if(True, 23, 1/0) == 23
    assert my_if(False, 1/0, 42) == 42
Genuine multi-shot continuations (call/cc).

[docs]

from unpythonic.syntax import macros, continuations, call_cc

with continuations:  # enables also TCO automatically
    # McCarthy's amb() operator
    stack = []
    def amb(lst, cc):
        if not lst:
            return fail()
        first, *rest = tuple(lst)
        if rest:
            remaining_part_of_computation = cc
            stack.append(lambda: amb(rest, cc=remaining_part_of_computation))
        return first
    def fail():
        if stack:
            f = stack.pop()
            return f()

    # Pythagorean triples using amb()
    def pt():
        z = call_cc[amb(range(1, 21))]  # capture continuation, auto-populate cc arg
        y = call_cc[amb(range(1, z+1))]
        x = call_cc[amb(range(1, y+1))]
        if x*x + y*y != z*z:
            return fail()
        return x, y, z
    t = pt()
    while t:
        print(t)
        t = fail()  # note pt() has already returned when we call this.

Unpythonic in 30 seconds: Language extensions with dialects

The dialects subsystem of mcpyrate makes Python into a language platform, à la Racket. We provide some example dialects based on unpythonic's macro layer. See documentation.

Lispython: automatic TCO and an implicit return statement.

[docs]

Also comes with automatically named, multi-expression lambdas.

from unpythonic.dialects import dialects, Lispython  # noqa: F401

def factorial(n):
    def f(k, acc):
        if k == 1:
            return acc
        f(k - 1, k * acc)
    f(n, acc=1)
assert factorial(4) == 24
factorial(5000)  # no crash

square = lambda x: x**2
assert square(3) == 9
assert square.__name__ == "square"

# - brackets denote a multiple-expression lambda body
#   (if you want to have one expression that is a literal list,
#    double the brackets: `lambda x: [[5 * x]]`)
# - local[name << value] makes an expression-local variable
g = lambda x: [local[y << 2 * x],
               y + 1]
assert g(10) == 21
Pytkell: Automatic currying and implicitly lazy functions.

[docs]

from unpythonic.dialects import dialects, Pytkell  # noqa: F401

from operator import add, mul

def addfirst2(a, b, c):
    return a + b
assert addfirst2(1)(2)(1 / 0) == 3

assert tuple(scanl(add, 0, (1, 2, 3))) == (0, 1, 3, 6)
assert tuple(scanr(add, 0, (1, 2, 3))) == (0, 3, 5, 6)

my_sum = foldl(add, 0)
my_prod = foldl(mul, 1)
my_map = lambda f: foldr(compose(cons, f), nil)
assert my_sum(range(1, 5)) == 10
assert my_prod(range(1, 5)) == 24
double = lambda x: 2 * x
assert my_map(double, (1, 2, 3)) == ll(2, 4, 6)
Listhell: Prefix syntax for function calls, and automatic currying.

[docs]

from unpythonic.dialects import dialects, Listhell  # noqa: F401

from operator import add, mul
from unpythonic import foldl, foldr, cons, nil, ll

(print, "hello from Listhell")

my_sum = (foldl, add, 0)
my_prod = (foldl, mul, 1)
my_map = lambda f: (foldr, (compose, cons, f), nil)
assert (my_sum, (range, 1, 5)) == 10
assert (my_prod, (range, 1, 5)) == 24
double = lambda x: 2 * x
assert (my_map, double, (q, 1, 2, 3)) == (ll, 2, 4, 6)

Installation

PyPI

pip3 install unpythonic --user

or

sudo pip3 install unpythonic

GitHub

Clone (or pull) from GitHub. Then,

python3 setup.py install --user

or

sudo python3 setup.py install

Uninstall

Run the uninstall command in a folder which has no subfolder called unpythonic, so that pip recognizes unpythonic as a package name (instead of a filename). Then,

pip3 uninstall unpythonic

or

sudo pip3 uninstall unpythonic

Support

Not working as advertised? Missing a feature? Documentation needs improvement?

In case of a problem, see Troubleshooting first. Then:

Issue reports and pull requests are welcome. Contribution guidelines.

While unpythonic is intended as a serious tool for improving productivity as well as for teaching, right now my work priorities mean that it's developed and maintained on whatever time I can spare for it. Thus getting a response may take a while, depending on which project I happen to be working on.

License

All original code is released under the 2-clause BSD license.

For sources and licenses of fragments originally seen on the internet, see AUTHORS.

Acknowledgements

Thanks to TUT for letting me teach RAK-19006 in spring term 2018; early versions of parts of this library were originally developed as teaching examples for that course. Thanks to @AgenttiX for early feedback.

Relevant reading

Links to blog posts, online articles and papers on topics relevant in the context of unpythonic have been collected to a separate document.

If you like both FP and numerics, we have some examples based on various internet sources.

unpythonic's Issues

macropy3 bootstrapper moved to imacropy

The macropy3 bootstrapper now lives in the separate imacropy repo, since it's a general MacroPy add-on, not specific to unpythonic.

So, in the name of DRY, in 0.15 we should drop the local copy macro_extras/macropy3, add imacropy to the requires in setup.py, and update the documentation to reflect this.

(However, this will have the side effect that installing unpythonic will pull in macropy3, even for users who don't need or want macros. This makes it look like a hard dependency, even though it is actually optional - only needed if you want the macro-based language extensions. If anyone has a better approach, please comment.)

Finish implementing unpythonic.fix.fix

A.k.a. the function that detects and breaks recursion cycles. Related to Haskell's fix, but not the same thing.

Minimal TODO for a release-worthy version:

  • DONE: Allow kwargs for f, everything else does.
  • DONE: Make the implementation thread-safe, almost everything else is.
  • DONE: Move unit tests to a new separate module unpythonic/test/test_fix.py.
  • Investigate (but not necessarily resolve yet) whether we can reorganize things so we could guarantee to call bottom at most once. (Currently it's at least once, at most twice.)
  • Document it in README, including credits for the original idea (Matthew Might) and initial Python implementation (Per Vognsen).

Add unpythonic.it.lastn

Curiously, I've previously added a butlastn, but no lastn.

In 0.14.1, it's possible to use last(window(iterable, n)) to get the effect of lastn(iterable, n), but the performance is likely suboptimal, since that does far too many operations for something this simple.

Fix this. Just make the maxlen argument of the underlying deque configurable. Implement last in terms of lastn.
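
A minimal sketch of the deque-based idea (the name and argument order here are illustrative, not the final API):

from collections import deque

def lastn(n, iterable):
    """Yield the last n items of iterable. Forces the whole iterable."""
    yield from deque(iterable, maxlen=n)

assert tuple(lastn(3, range(10))) == (7, 8, 9)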

Improve filename computation for syntax.dbg

Just to mark a future TODO.

The issue with using dbg[] in the REPL is that __file__ is not defined. There's a filename computation in the stdlib's warnings.py that addresses exactly this.

  • Maybe we could adapt the solution for our purposes?
  • Performance implications, may need to run the code at each call site of dbg? Is the REPL support for syntax.dbg worth it?
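
For reference, the stdlib logic mentioned above is roughly as follows (paraphrased from CPython's Lib/warnings.py; details vary across versions):

import sys

def compute_filename(globals_dict, module):
    """Paraphrase of the filename fallback chain in warnings.warn."""
    filename = globals_dict.get('__file__')
    if not filename:
        if module == "__main__":
            try:
                filename = sys.argv[0]  # '' in an interactive session
            except AttributeError:
                # embedded interpreters may not have sys.argv
                filename = '__main__'
        if not filename:
            filename = module
    return filename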

Fix shadowed submodule names

Some constructs such as unpythonic.env.env and unpythonic.llist.llist have the same name as the submodule they live in. This hides the submodule from the dict of the top-level namespace of the unpythonic module. This behavior should be considered a bug.

To improve discoverability (e.g. groups of functionality; submodule top-level docstrings), the offending modules should be renamed to avoid becoming shadowed.

It's better to rename the modules than the symbols, since the symbols are certainly used by client code, but the submodule names necessarily aren't. It's not wrong to import directly from a submodule of unpythonic, if that feels more readable at the use site, but it's not mandatory, either, since the top-level __init__ imports all public symbols into the top-level unpythonic namespace. (This behavior is part of the public API, and is not going away.)

This is a breaking change, so we'll wait until the next major version before we do it.

CAUTION: Renaming the modules will break unpickling for instances that were pickled using the old version. [1] says that pickle can save and restore class instances transparently, however the class definition must be importable and live in the same module as when the object was stored. (emphasis mine)

Improve documentation

Over time, the feature set of unpythonic has become larger than can be briefly explained in a simple README. The documentation needs rethinking.

EDIT: As of v0.15.0, some of this has been done. See the end of this thread for the current TODOs.


  • Can we shorten the text without losing important information? This point especially is a #helpwanted.
  • Within the explanation of each feature, the main text could use copyediting. Important points first, details after.
    • E.g. as of 0.14.1, in the documentation of fupdate, the fup function, which is the recommended everyday API, is currently only mentioned as an afterthought, after explaining the low-level fupdate API in detail. (Well, it came first, but documentation should follow logic, not history.) This particular point is fixed in 0.14.2, but more similar issues may remain.
  • Make docstrings and the separate documentation complement each other.
    • Right now there's quite a lot of duplication between them, often with one of them being an older revision of the text in the other one (often the separate documentation is more polished than the docstring).
    • A policy of orthogonal roles might help, such as:
      • A docstring is brief, to the point. Its role is to give the semantic and usage details of a particular feature (with examples), briefly explain any caveats, and mention any relevant see-alsos (at least when not obvious).
      • The separate documentation is the user manual, with the usual bells and whistles such as a TOC with section links, and pretty formatting (as far as reasonably possible in Markdown).
        • The current documentation is already a rough attempt at this.
        • Ideally, we could use a narrative format, like Pyramid's?
        • How to optimize the documentation for discoverability of features?
        • How far to go in explaining the use cases of each feature?
          • The whole point of unpythonic is to do unpythonic things, pythonically. Some of the features may seem weird to readers not familiar with the particular sources of inspiration, until given the context: what the problem solved by a particular feature is.
          • Most features are useful in production, while there are a few that are primarily useful for teaching concepts (continuations, lazify). These should probably be approached differently.
            • The teaching features aim at robustness, too, but I'm not completely sure I want a production codebase that claims to be Python, yet uses call-by-need semantics. Forming an informed opinion on things like this first requires some prototyping with smaller projects.
          • On the other hand, for example the general promotion of FP is beyond the scope of unpythonic, even though it matters; particularly, FP enables building complex functionality by composition, and precisely this is where curry comes in useful.

Previous, already done items:

  • DONE: README should play the role of a short tour-type overview. Done in 0.14.2.
    • Currently both READMEs are long, which makes them scary. But that's because they contain the motivation and examples for each and every feature, no matter how minor.
    • The macro tour could live in a separate README (just like the macro documentation already does), so users who don't want to depend on MacroPy don't have to wade through what is easily 50% of the total documentation. Just mention it at the beginning, like we already do.
    • Thanks to aisha-w for moving the pure-Python API docs to doc/features.md, which is a much better place for that much detail. Version 0.14.2 adds a short demo to the README to give a short overview.
  • DONE: The detailed documentation could live in separate files. Contributed by @aisha-w, see #28.
    • This would allow having a relatively stable front-page tour, only needing an update when new major features are added.
    • Keep the authoring light; just use .md or .rst.
  • DONE: Clean up the history clutter. The 0.14.1 release has been out for a while. Contributed by @aisha-w, see #28.
    • Integrate the information from the "changed in vx.yy" notes into the main text.
    • Drop the "added in vx.yy" notes.
    • Add one short note at the start of the README, documenting this change. Not really needed.

Update docs for 0.14.2

Document the new features in README. Mention bug fixes in changelog. See recent commits for details.

This part is done. See below for any remaining points, especially the megapost and readings.

Add unbox and box.set

To be more rackety, we could:

o = object()
b = box(o)
assert unbox(b) is o

o2 = "oxygen"
b.set(o2)
assert unbox(b) is o2

This hides the essentially-internal name of the data field, .x.

Interaction between fix and tco

The recursion cycle breaker unpythonic.fix.fix and the TCO mechanism do not play well together, because they both need to set up a harness in which to run the original function.

Currently the harnesses will nest, which blows the call stack even earlier when fix is applied to a function that is trampolined, as opposed to a plain Python function.

It would be nice to have TCO support in fix in some future release, but this needs some thinking. Is it possible to design a modular harness?

Add unpythonic.dyn.set to update dynamic bindings

Since SRFI-39 and Racket can do it, maybe we should, too.

Add support for updating dynamic bindings in the closest enclosing cell that has the given name bound. Syntax could be e.g. dyn.set(**bindings). Raise an error (NameError or AttributeError, whichever dyn currently uses) if a binding does not already exist.
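
Hypothetical usage under the proposed syntax (nothing here exists yet):

from unpythonic import dyn, make_dynvar

make_dynvar(x=42)
with dyn.let(x=1):
    dyn.set(x=2)  # update the closest enclosing binding of x
    assert dyn.x == 2
dyn.set(y=3)      # no binding for y exists anywhere -> should raise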

Add arithmetic fixpoint function

Dumping old notes. This needs stuff from unpythonic.it. Maybe put it there, or just for disambiguation purposes, into unpythonic.fix?

Note we should also switch the order of arguments of within; the way it's currently implemented, it's not curry-friendly, while this assumes a curry-friendly implementation. The price is no default tol, but this just simplifies the implementation, so maybe it's a good idea. Also within(tol, iterable) is better English than within(iterable, tol).

def fixpoint(f, x0, tol=0):
    """Compute the (arithmetic) fixed point of f, starting from the initial guess x0.

    (Not to be confused with the logical fixed point with respect to the
    definedness ordering.)

    The fixed point must be attractive for this to work. See the Banach
    fixed point theorem.
    https://en.wikipedia.org/wiki/Banach_fixed-point_theorem

    If the fixed point is attractive, and the values are represented in
    floating point (hence finite precision), the computation should
    eventually converge down to the last bit (barring roundoff or
    catastrophic cancellation in the final few steps). Hence the default tol
    of zero.

    CAUTION: an arbitrary function from ℝ to ℝ **does not** necessarily
    have a fixed point. Limit cycles and chaotic behavior of `f` will cause
    non-termination. Keep in mind the classic example:
    https://en.wikipedia.org/wiki/Logistic_map

    Examples::
        import math
        from unpythonic import fixpoint
        c = fixpoint(math.cos, x0=1)

        # Actually "Newton's" algorithm for the square root was already known to the
        # ancient Babylonians, ca. 2000 BCE. (Carl Boyer: History of mathematics)
        def sqrt_newton(n):
            def sqrt_iter(x):  # has an attractive fixed point at sqrt(n)
                return (x + n / x) / 2
            return fixpoint(sqrt_iter, x0=n / 2)
        s = sqrt_newton(2)  # ≈ 1.414
    """
    return last(within(tol, repeat1(f, x0)))

Edit: fix year mistake: the ancient civilization Babylonia flourished around 2000 BCE, not 4000 BCE (that would be the beginnings of ancient Egypt).

Archive changelogs

Currently changelogs live only on the GitHub releases page.

To keep the history together with the code, it would probably be better to adopt a development process where changelogs are kept in a CHANGELOG.md in the repo.

When a new version is about to be released, that's where to describe the changes (by adding a new section at the top); when the actual release is made, then the relevant section just needs to be copy'n'pasted to the release page.

Documentation: mark division of features

There are three kinds of features in unpythonic. In the documentation, beside a general explanation, we should clearly tag which individual feature belongs to which category:

  1. Pure-Python-only - no need for macros, meant to be used directly (e.g. batteries for itertools),
  2. Pure-Python core with a macro layer for syntactic sugar (e.g. do, let),
  3. Macro-only (e.g. continuations, lazify, dbg).

In some instances of case 2, the sugaring can be major - for example, the macro version of let hides the unwieldy environment parameter and automatically performs lexical scoping.

This classification leaves some features as sporks. For example, the curry function is just fine for manual use, but automatic currying is only available via the with curry block macro. So it spans categories 1 and 2.

Some other features fall neatly into a single category, but which one? For example, TCO can be used without macros, but one must remember to decorate each function with @trampolined manually, and to return jump(f, ...) instead of return f(...) to make a tail call. The tco macro writes these parts automatically. In such a case, whether or not to rely on macros (i.e. whether to place it in category 1 or 2), is a trade-off each user must decide for themselves.
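
For concreteness, the manual usage pattern described above looks like this, using the documented @trampolined and jump API:

from unpythonic import trampolined, jump

@trampolined
def even(n):
    if n == 0:
        return True
    return jump(odd, n - 1)  # not `return odd(n - 1)`: explicit tail call
@trampolined
def odd(n):
    if n == 0:
        return False
    return jump(even, n - 1)
assert even(10000) is True   # no call stack blowup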

Case 3 includes mostly features that code-walk and perform major rewrites. However, dbg is an exception - the only reason it is macro-only is that it needs to see the source code.

See #28 .

Python language version support status

Long-term support roadmap

Python support schedule (in chart format; original source). Assuming that unpythonic stays in active development, unpythonic will try to support any Python 3.x that is still in official support by the Python devs.

For particular Python versions, this means that, approximately (updated January 2022):

  • 3.6 supported until January 2022 (may be dropped soon)
  • 3.7 supported until July 2023
  • 3.8 supported until October 2024
  • 3.9 supported until October 2025
  • 3.10 supported until October 2026
  • 3.11 not officially supported; not released yet; no EOL date announced yet

Since it's basically just me doing this, and I only have resources to barely support one unified codebase, this implies that at any moment in time, the unpythonic codebase will be based on the oldest currently supported Python version - with possibly the "new way" included as commented-out code (for easy future porting) if it is much more elegant.

I will not backport bug fixes. Any bug fixes will be released in a new version of unpythonic, with Python version support appropriate for its release date.

Current status of Python language version support in unpythonic is tracked by posting new comments to this issue whenever appropriate.

Original posting below.


Support for Python 3.4 ended in spring 2019, and for 3.5 in September 2020.

Starting in unpythonic 0.15, drop any code specific to 3.4 or 3.5, and bump the Python version requirement to 3.6. That version still has some years of support remaining, and is the default python3 e.g. on recent Ubuntu versions.

This should simplify especially any macro code that needs to handle the AST for function calls.

Allow negative start/end in unpythonic.slicing.islice

Slicing is a uniform, pythonic way to extract subsequences. Practicality beats purity (ZoP §9), so:

Using unpythonic.it.lastn under the hood, in the unpythonic.slicing.islice wrapper we could add support for negative start and end indices.

This would let us say things like:

from unpythonic import islice

# currently no support for negative indices here!
islice(range(10))[-3:]  # --> 7, 8, 9
islice(range(10))[-5:-3]  # --> 5, 6
islice(range(10))[-5::2]  # --> 5, 7, 9

Note step must still be positive; the input is a general iterable, not a sequence.

This is purely a convenience feature. This will force the computation of, and temporarily store, as many items as the length of the slice being indexed. (For a general iterable, the only way to extract the last n items is to force the whole iterable, in order to determine where (if at all) it ends.)
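
A sketch of the buffering idea for the [-5:-3] case, ignoring edge cases such as the iterable being shorter than the slice (tail_slice is a hypothetical helper, not the proposed API):

from collections import deque

def tail_slice(iterable, start, stop=None):
    """Emulate seq[start:stop] for negative start (and stop) on a general iterable."""
    buf = deque(iterable, maxlen=-start)  # forces the iterable; keeps the last -start items
    end = None if stop is None else stop - start
    return list(buf)[:end]

assert tail_slice(range(10), -3) == [7, 8, 9]
assert tail_slice(range(10), -5, -3) == [5, 6]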

Detection logic for complex-valued math sequences

Currently, s() can handle only real-valued inputs. Complex-valued sequences can be created by composing s(re) + 1j*s(im) or s(mag) * math.exp(1)**(1j*s(arg)), but this is inconvenient.

So, it would be nice to have support for complex-valued sequences.

The complication is that logarithms behave less nicely for complex inputs. If we want to keep supporting both numeric (float, mpmath) and symbolic (SymPy) input, different logic may be needed to correctly detect the sequence for these cases.

Also, just for a pie in the sky, automatic OEIS lookup (if an internet connection is available) would be nice for integer inputs. (Maybe as a separate feature; for many sequences, only the first few elements are known anyway.)

Improve documentation on continuations

Essentially, we provide a rudimentary form of delimited continuations, because the capture ends at the dynamically outermost call_cc[], terminating with an identity continuation. This should be emphasized.

See Undelimited continuations are co-values rather than functions. Include this link in the documentation, beside the one to John Shutt's writings we already have. This addresses the same "continuations are functions" myth independently and from a different angle. Delimited continuations return a value and can be composed, so they are at least function-ish if not actually functions, but undelimited continuations do not even return.

Also, the whole call/cc interface is bad, for reasons outlined in An argument against call/cc. In the long term, if we want to keep this feature, perhaps we should look into shift/reset (Wikipedia has a simple explanation). Maybe mention this as a known issue for now.

(Shutt has also written something about the call/cc interface being unnecessarily esoteric to start with, finally becoming completely unusable in the setting of his Kernel language. Check if it was in the post we have already linked, or in another one, and add a link if necessary.)

Documentation: Emacs syntax highlight for unpythonic

This init.el snippet can be used to make Emacs's python-mode syntax-highlight also keywords specific to MacroPy and unpythonic:

  (defun my/unpythonic-syntax-highlight-setup ()
    "Set up additional syntax highlighting for `unpythonic.syntax' in python mode."
    ;; adapted from code in dash.el
    (let ((new-keywords '("let" "dlet" "blet"
                          "letseq" "dletseq" "bletseq"
                          "letrec" "dletrec" "bletrec"
                          "let_syntax" "abbrev"
                          "where"
                          "do" "local" "delete"
                          "continuations" "call_cc"
                          "curry" "lazify" "envify" "tco" "prefix" "autoreturn" "forall"
                          "multilambda" "namedlambda" "quicklambda"
                          "cond" "aif" "autoref" "dbg" "nb"
                          "macros" "q" "u" "hq" "ast_literal")) ; macropy
          (special-variables '("it"
                               "dyn"
                               "dbgprint_expr")))
      (font-lock-add-keywords 'python-mode `((,(concat "\\_<" (regexp-opt special-variables 'paren) "\\_>")
                                              1 font-lock-variable-name-face)) 'append)
      ;; "(\\s-*" maybe somewhere?
      (font-lock-add-keywords 'python-mode `((,(concat "\\_<" (regexp-opt new-keywords 'paren) "\\_>")
                                              1 font-lock-keyword-face)) 'append)
  ))
  (add-hook 'python-mode-hook 'my/unpythonic-syntax-highlight-setup)

From my init.el.

Known issue: For some reason, during a given session, this takes effect only starting with the second Python file opened. The first Python file opened during a session shows with the default syntax highlighting. Probably something to do with the initialization order of font-lock and whichever python-mode is being used. (Tested with anaconda-mode.)

Pythonify TAGBODY/GO from Common Lisp

See Peter Seibel: Practical Common Lisp, Chapter 20 for an explanation.

Rough draft of how we could pythonify this. User interface:

from unpythonic.syntax import macros, with_tags, tag, go

@with_tags     # <-- decorator macro
def myfunc():  # <-- just a regular function definition
    x = 42
    tag["foo"]  # tag[...] forms must appear at the top level of myfunc
    x += 1
    if x < 10:
        go["foo"]  # go[...] forms may appear anywhere lexically inside myfunc
    return "whatever"

In a with_tags section, a tag[...] form at the top level of the function definition creates a label. The go[...] form jumps to the given label, and may appear anywhere lexically inside the with_tags section. To stay within Python semantics (following the principle of least surprise), myfunc otherwise behaves like a regular function.

Possible macro output:

# actually no explicit import; just use `hq[]` in the macro implementation.
from unpythonic import trampolined, jump

def myfunc():  # may take args and kwargs; we pass them by closure
    @trampolined
    def body():  # gensym'd function name
        nonlocal x
        x = 42
        return jump(foo)
    def foo():  # use the tag name as the function name
        nonlocal x
        x += 1
        if x < 10:
            return jump(foo)
        return "whatever"
    return body()
    x = None  # never reached; just for scoping

Essentially, this is another instance of lambda, the ultimate goto.

Notes:

  • Good performance, since no need to use exceptions for control.
  • Only body (the entry point, called by the expanded myfunc) needs to be trampolined, since none of the inner functions are accessible from the outside (their names being local to myfunc).
  • Use scope analysis (see unpythonic.syntax.scoping) to determine which variable names are local to myfunc in the input. Then scope those to myfunc (by assigning a None at the end), and declare them nonlocal in the helper functions, so that the top level of myfunc forms just one scope (emulating Python's scoping rules), though the macro splits it into helper functions.
    • Beware any names in an existing inner def or comprehension form - those should stay local to that inner def. Only top-level locals matter here. The scoping utility should already take this into account (stopping name collection at scope boundaries).
  • A top-level return from one of the helper functions will shut down the trampoline, and return that value from the TCO chain. This results in a top-level return from myfunc, which is exactly what we want. So we don't have to transform top-level return statements when we generate the expansion.
  • Tags should be collected in an initial pass over the body.
  • Tag names must be unique within the same with_tags section (enforce this in the syntax transformer!), so we can use the tag names as the names of the helper functions. This is also informative in stack traces.
  • Make tag[] and go[] raise an error at runtime if used outside any with_tags. Use macro_stub.

Things to consider:

  • How to support lexical nesting of with_tags sections? Jumping to a tag in a lexically outer with_tags is the complex part. Possible solution below in a separate comment.
    • The current draft doesn't work for that, since what we want is to shut down the inner trampoline, and give the jump(...) to the outer trampoline.
    • Wrap the inside of myfunc in an exception handler, and make go[] to an undefined label raise an exception with instructions where to jump?
      • An unresolved go[] can be allowed to unwind the call stack, because nesting is lexical - when unresolved locally (i.e. by the nearest enclosing with_tags), a go[] can only refer to tags that appear further out.
      • The handler can check if this with_tags section has that label, jump to it if it does, and else re-raise. The exception can have an args message complaining about go[] to a label not defined in any lexically enclosing with_tags section, if uncaught.
      • We must generate code to store the labels for runtime access to support this. A simple assignment to a set is fine.
      • We must then trampoline all of the helper functions, because this kind of jump may enter any of them. But that's no issue - the TCO mechanism already knows how to strip away unwanted trampolines (if tail-calling into a function that has its own trampoline).
      • This solution still doesn't allow us to stash a closure containing a go[] and call it later, after the original function has exited. Short of using general continuations, is that even possible? Does CL handle this case? If so, how?
        • CL indeed doesn't. See footnote 7 of the linked chapter of Seibel: "Likewise, trying to GO to a TAGBODY that no longer exists will cause an error." So it shouldn't matter if we don't, either.
  • Interaction with with continuations? As always, this is the tricky part.
  • Position in xmas tree? At least prefix and autoreturn should go first.
  • Strings or bare names as input to tag[...] and go[...]? A string avoids upsetting IDEs with "undefined" names, but a bare name without quotes is shorter to type.
  • This should be orthogonal from with lazify, since the helper functions take no arguments. If myfunc has parameters, we can allow with lazify to process them as usual.
  • with_tags sections should expand from inside out (so any inner ones are resolved first), so this is a second-pass macro.
  • Implies TCO. Leave an AST marker (see ContinuationsMarker for an example), and modify with tco to leave alone any with_tags sections it encounters.

Add static type information

While useful, this is difficult to do for features like curry and with continuations.

Some selected parts of unpythonic could be eventually gradually typed using mypy, but this is not being worked on at the moment.

See #8 for context.

If anyone feels like working on this, in principle I'm open to suitably scoped PRs adding type signatures to parts of the library (using Python's type annotations) - but perhaps post a comment here for discussion first, to converge on a definition for "suitably scoped".

Idea: auto-resizing infinite iterables for variable initialization

This would be occasionally useful:

from itertools import repeat
from unpythonic.syntax import macros, autotrunc  # DOES NOT EXIST YET

with autotrunc:
    a, b, c = repeat(None)

Concise, and we don't have to manually repeat the unimportant detail of how many names are on the LHS, since this information is already lexically available (just count from the AST).

The equivalent pure Python is:

from itertools import repeat

a, b, c = repeat(None, 3)

The solution must be a block macro, because it must have the assignment LHS in its context in order to count the names there. We can't do anything about this in the function call, or even as an expr macro for the RHS - at that point it's not yet known where the output is going to be sent.

However, is this so useful after all? We need to be careful with edge cases such as:

a, *b, c = repeat(42)  # error, an infinite iterable has no last item
*a, b, c = repeat(42)  # error, likewise
a, b, *c = repeat(42)  # ok, a = 42, b = 42, c = repeat(42)

In the first example, for the specific case of repeat we could define it to extract one 42 and assign it to c, but that doesn't generalize. It's better to refuse the temptation to guess.

Unpacking the start of an arbitrary infinite iterable can be trivially defined by delegating to unpythonic.it.unpack. The third example could expand to:

from unpythonic import unpack
a, b, *c = unpack(2, repeat(42))

(Since an assignment always appears in a block of statements, we can always prepend the import.)

Another case to consider is finite inputs. For example, if len(seq) == 5:

a, *b, c = seq

Python itself already knows to do the right thing here. How would autotrunc avoid putting its nose where it doesn't belong?

Add a short demo to main README

To improve the first impression, provide a runnable demo of an appropriate subset of features right near the start of the main README.

Focus on short and impressive. It does not need to be a complete tour. But it must, quickly, answer the universal first question: why should I bother to read any further?

Things to consider:

  • Showcase preference, i.e. focus the demo on what?
    • Features that currently don't exist in other libraries (continuations)?
    • Features that obviously require a complex implementation?
    • Features that are "done right", esp. in corner cases (e.g. rackety foldl or scanl with multiple inputs, memoize that caches also exceptions)?
    • Features that give a lot of added value and are easy to use?
  • Target audience, i.e. focus the demo for whom?
    • Applied mathematicians and haskellers may like the rule-inferring lazy math sequence constructor s.
    • A general programming audience may feel at home with let or continuations.
  • Complexity balance. The demoed features should be simple enough to explain and use, to facilitate a short code example with a one-liner comment, but complex enough in expected implementation (by a seasoned Python programmer), to communicate that rewriting the same functionality from scratch in five minutes is not realistic.
  • Holistic perspective. How to communicate that unpythonic is essentially a language framework, where (unless specifically otherwise stated) all of the features are designed to work together?

See #8 (mention of executable tagline).

Add unpythonic.it.interleave

Oddly, this one is missing from itertools, so maybe we should have one.

    def interleave(*iterables):
        """Interleave items from several iterables. Generator.

        Example::

            interleave(a, b, c) -> (a0, b0, c0, a1, b1, c1, ...)

        until the shortest input runs out.
        """
        class ShortestInputEnded(Exception):
            pass
        iters = [iter(it) for it in iterables]
        def one_each():
            for it in iters:
                try:
                    x = next(it)
                    yield x
                except StopIteration:
                    raise ShortestInputEnded()
        try:
            while True:
                yield from one_each()
        except ShortestInputEnded:
            return
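
Incidentally, almost the same behavior falls out of composing existing itertools, with one subtle difference: zip discards a partially completed final group, whereas the version above yields its items one by one until the shortest input actually runs out. A shorter sketch:

from itertools import chain

def interleave_zip(*iterables):
    # zip stops at the shortest input
    return chain.from_iterable(zip(*iterables))

assert tuple(interleave_zip((1, 2, 3), ("a", "b", "c"))) == (1, "a", 2, "b", 3, "c")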

Refactor tests

Tests need some refactoring:

  • Tease apart the integration tests, currently mixed up wantonly with the unit tests.
    • This applies to both unpythonic/test and unpythonic/syntax/test.
  • To make it easier to locate a regression, make all unit tests run first, and the integration tests after that, so that the first error that appears most likely points out where the bug is.
  • Run the integration tests in a specific order. This could act as further documentation on dependencies in the architecture (what builds on what).
  • Switch to a proper test framework such as pytest. The current runtests.py is a hack.
  • Measure test coverage (at least branch coverage, if not path coverage). Extend tests as required.

(When tackling this, opening sub-issues for each individual point may be a good strategy.)

Top-down presentation style for definitions

Occasionally a top-down presentation style (like Haskell's where) is useful:

from pampy import match, _

from unpythonic.syntax import macros, topdown, definitions  # DOES NOT EXIST YET

with topdown:
    # First say what we want to do...
    result = match(data, pattern1, frobnicate,
                         pattern2, blubnify,
                         ...
                         patternN, percolate)
    # ...and only then give the details.
    with definitions:
        def frobnicate(...):
            ...
        def blubnify(...):
            ...
        ...
        def percolate(...):
            ...

The benefit of the top-down presentation is that the reader will already have seen the overall plan when the detailed definitions start. The program logic reads in the same order it plays out.

This would expand to:

from pampy import match, _

# Python wants the details first...
def frobnicate(...):
    ...
def blubnify(...):
    ...
...
def percolate(...):
    ...
# ...so that the names will be bound by the time we do this.
result = match(data, pattern1, frobnicate,
                     pattern2, blubnify,
                     ...
                     patternN, percolate)

In pure Python, it is of course already possible to delay the match call by thunkifying it:

promise = lambda: match(data, pattern1, frobnicate,
                              pattern2, blubnify,
                              ...
                              patternN, percolate)
# ...definitions go here...
result = promise()

This works because - while values are populated dynamically - name resolution occurs lexically, so it's indeed pointing to the intended frobnicate (and it's fine if a value has been assigned to it by the time the lambda is actually called).
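A minimal self-contained illustration of that point, in plain Python (function names made up for the example):

    # The lambda body refers to `double`, which does not exist yet; that is
    # fine, because the name is only looked up when the lambda is called.
    promise = lambda: double(21)

    def double(x):  # defined *after* the lambda that refers to it
        return 2 * x

    assert promise() == 42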

However, this introduces a promise, and we have to remember to force it at the end (after the many, many definitions) to actually compute the result. (For a proper evaluate-at-most-once promise, we would use lazy[] from MacroPy, and maybe for readability, force it using unpythonic.force instead of just calling it.)

In use cases like this, the main benefit of inlining lambdas to the match call (e.g. in Racket) is the approximately top-down presentation. The with topdown construct introduces this benefit for named functions.

Since 0.12.0, we already support top-down presentation for expressions:

from unpythonic.syntax import macros, let, where

result = let[2 + a + b,
             where((a, 17),
                   (b, 23))]

The with topdown construct would add top-down presentation for statements.

The goal is to allow arbitrary statements in the with definitions block, although it is recommended to use it only to bind names used by the body (whether by def or by assignment, doesn't matter).

The top-level explicit with topdown is a feature to warn the reader, because top-down style can destroy readability if used globally.

The topdown macro should apply first in the xmas tree.

At this stage the design still needs some thought, e.g.:

  • First pass (outside-in) or second pass (inside-out)?
  • Require with definitions to always appear in tail position, or allow it in any position?
  • Allow several with definitions blocks?
  • If mixing and/or several blocks are allowed, what's the desired evaluation order?
  • Especially, regarding lexically nested with definitions blocks?
  • Better short, descriptive names, instead of topdown and definitions?

Move the macro docs

Now that we have doc/, the macro API docs should move there (e.g. doc/macros.md). Currently they live in macro_extras/README.md for historical reasons.

Besides some docs that now really belong in doc/, the only thing the macro_extras/ folder contains is the macropy3 bootstrapper. For now, we could just move that to the project root level. (The bootstrapper will go away in 0.15.0 anyway, since it's already distributed in the separate PyPI package imacropy.)

We need to be very, very careful in order not to break any links pointing to macro_extras/. For unpythonic itself, helm-swoop and projectile should make short work of that.

But we have to check the pydialect and imacropy projects, too - they may also link to unpythonic macro docs as examples. Any others? Maybe some of my comments in MacroPy issues... (can't really update the entire internet, but to keep the repository clean, once the docs are moved, the old folder has to go.)

Help wanted for testing async support

Help wanted!

The interaction between unpythonic and the async support that was added in Python 3.5 is totally untested, because I haven't used that part of Python myself, and I'm not even that familiar with it.

Reading Brett Cannon's explanation, I surmise the async features are intended mainly for "microthreading" typical server loads, which mostly wait on I/O, but must scale to thousands of simultaneous requests. Coming from a numerical background, where the load is practically always CPU- or memory-bound, I haven't found much use for such constructs, since there the GIL makes them uninteresting. Multiprocessing, Cython + OpenMP, or MPI, exactly one task per core, done.

So, discussion and test cases are more than welcome!

If interested, please post a comment here, or open a PR if you have some tests you'd like to suggest. See directories unpythonic/test and unpythonic/syntax/test for examples. (Any test modules in these directories are autodetected by runtests.py.)

Note that I have absolutely no idea in which kind of use cases unpythonic and async would appear together - that is exactly why this is help wanted.

If you have such a use case and especially if it doesn't work, please share!

(As of 0.14.1, the current level of async support in unpythonic is a blind guess. The macros recognize the AST nodes for the async constructs, but since I'm not quite sure what should be done with them, they just try to do something similar to what is done for the corresponding non-async node. Needless to say, there are probably bugs.)

Negative index handling

Make sure unpythonic handles negative indices the same way as standard Python.

Most of the code should already be correct, but particularly, at least in the unpythonic.collections module (see _make_negidx_converter), I've accidentally taken the index modulo the length, instead of just adding the length once when the index is negative.

(This may well be the only place that needs fixing. Probably still a good idea to grep for the % operator in all of the codebase.)

The current behavior may hide some IndexErrors in user code (e.g. trying to access lst[-8] when len(lst) == 5), and should be considered a bug.
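A sketch of the difference (hypothetical helper names, for illustration only):

    def to_posidx_modding(i, length):
        # buggy: lst[-8] with len(lst) == 5 silently becomes lst[2]
        return i % length

    def to_posidx_pythonic(i, length):
        if i < 0:
            i += length  # add the length once, like Python's own indexing
        if not 0 <= i < length:
            raise IndexError("index out of range")
        return i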

Rename setescape/escape to catch/throw

For the feature provided by setescape/escape, the standard names in Lisps are catch/throw (no relation to exception handling). E.g. Emacs Lisp, as well as some ancestors of Common Lisp have them.

The CATCH/THROW construct is still present in CL, too, but there it's considered more idiomatic to use the more modern, lexically scoped counterpart BLOCK/RETURN-FROM, since that is easier to reason about statically.

See Peter Seibel: Practical Common Lisp, chapter 20.

The standard names catch/throw are well-known, short and descriptive, so it is preferable to move to them. In the context of a library such as unpythonic, which brings selected parts of Lisp to Python, the names shouldn't be expected to cause any confusion with exception handling, but it's good practice to warn in the docs (and docstrings) anyway, for users coming from other backgrounds.

Let's provide an alias for now, along with a DeprecationWarning for our previous non-standard names; and then drop the non-standard names later, in 0.15.0.
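A sketch of the transitional aliasing (hypothetical helper; the actual implementation may differ):

    import warnings
    from functools import wraps

    def deprecated_alias(new_func, old_name):
        """Wrap new_func so that calls made under the old name emit a warning."""
        @wraps(new_func)
        def wrapper(*args, **kwargs):
            warnings.warn(f"{old_name} is deprecated; use {new_func.__name__} instead",
                          DeprecationWarning, stacklevel=2)
            return new_func(*args, **kwargs)
        return wrapper

    # setescape = deprecated_alias(catch, "setescape")
    # escape = deprecated_alias(throw, "escape")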

namedlambda: name kwargs

Could add support for naming literal lambdas passed as named function arguments, since that's another binding construct.
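For illustration, the proposal would cover cases like this (a sketch; some_call is a made-up function, and the naming convention would follow what namedlambda already does for assignments):

    from unpythonic.syntax import macros, namedlambda

    with namedlambda:
        square = lambda x: x ** 2            # already named: "square"
        some_call(callback=lambda x: x + 1)  # proposal: name it "callback"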

Generators raising StopIteration should return instead

Under PEP 479 (the default behavior since Python 3.7), a StopIteration that propagates out of a generator body is converted into a RuntimeError, so generators should return instead. This affects at least the helper gfunc windowed in unpythonic.it.window. In this case the fix is:

    def windowed():
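        # `it` (the underlying iterator) and `xs` (the container holding the
        # current window) come from the enclosing `window` function.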
        while True:
            yield tuple(xs)
            xs.popleft()
            try:
                xs.append(next(it))
            except StopIteration:
                return

Grep the code for comments along the lines of "let StopIteration propagate" - or maybe grep just for StopIteration - to be sure we catch them all.

Change the syntax of raisef

It would be more pythonic to raisef(RuntimeError("ouch!")) instead of the current syntax raisef(RuntimeError, "ouch!") - then it's exactly the same as raise, except it's a function call.

We should probably also support a cause parameter to invoke raise from.

Since this is a backward-compatibility-breaking change, it has to wait until the next major release.

In the meantime, we can support both syntaxes (and deprecate the old one), by checking if we got only one positional argument x, for which isinstance(x, Exception). Otherwise use the old behavior.
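A sketch of the transitional behavior (assuming the cause parameter is added at the same time):

    import warnings

    def raisef(exc, *args, cause=None, **kwargs):
        """Raise an exception from expression position. Sketch of the shim."""
        if isinstance(exc, Exception) and not args and not kwargs:
            # new syntax: raisef(RuntimeError("ouch!"), cause=...)
            if cause is not None:
                raise exc from cause
            raise exc
        # old syntax: raisef(RuntimeError, "ouch!")
        warnings.warn("raisef(exctype, *args) is deprecated; pass an exception instance",
                      DeprecationWarning, stacklevel=2)
        raise exc(*args, **kwargs)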

Clean up frozen-instance code

See dataclasses, new in Python 3.7. (This pretty much fills the same role as MacroPy's case classes.)

We could raise dataclasses.FrozenInstanceError where appropriate.

Also, e.g. unpythonic.llist.cons doesn't need the internal readonly flag; it's simpler to just call object.__setattr__ to init the read-only fields, so the custom __setattr__ can be simplified. (Should check also other datatypes that use this pattern, maybe something in unpythonic.collections?)

Also, we must intercept __delattr__ as well, to properly emulate immutability. Not sure if we currently do this (probably not - hence the bug label; we need to check).
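A minimal sketch of the simplified pattern (illustrative only; the field name is made up):

    from dataclasses import FrozenInstanceError

    class Immutable:
        def __init__(self, x):
            # Bypass our own __setattr__ to initialize the read-only field.
            object.__setattr__(self, "x", x)
        def __setattr__(self, name, value):
            raise FrozenInstanceError(f"cannot assign to field {name!r}")
        def __delattr__(self, name):
            raise FrozenInstanceError(f"cannot delete field {name!r}")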

Python 3.8 support for macro code

In the Python 3.8 AST, all constants are now represented by the ast.Constant node type.

Thus to support 3.8 and later properly, we will need to update any macro code that deals with Num, Str, NameConstant.

Maybe in 0.15, move these into syntax.astcompat, and start accepting Constant nodes as an alternative.

There are not many use sites, so it's probably not worth building an abstraction for this.
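A sketch of the kind of shim syntax.astcompat could provide (hypothetical helper name):

    import ast

    def getconstant(node):
        """Return the value of a constant AST node, old or new style."""
        if isinstance(node, ast.Constant):      # Python 3.8+
            return node.value
        if isinstance(node, ast.Num):           # legacy node types below
            return node.n
        if isinstance(node, (ast.Str, ast.Bytes)):
            return node.s
        if isinstance(node, ast.NameConstant):  # True/False/None
            return node.value
        raise TypeError(f"expected a constant node, got {type(node).__name__}")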

Improve unpythonic.fix.fix

If we ditch the internal memoizer, and use unpythonic.fun.memoize instead, fix simplifies to:

import typing
from functools import wraps
from operator import itemgetter

from unpythonic import const, memoize  # unpythonic's own helpers
# _get_threadlocals: the module-internal helper already in unpythonic.fix.

def fix(bottom=typing.NoReturn, memo=True):
    if bottom is typing.NoReturn or not callable(bottom):
        bottom = const(bottom)
    def decorator(f):
        f_memo = memoize(f) if memo else f  # TODO: thread safety?
        @wraps(f)
        def f_fix(*args, **kwargs):
            e = _get_threadlocals()
            me = (f_fix, args, tuple(sorted(kwargs.items(), key=itemgetter(0))))
            mrproper = not e.visited  # on outermost call, scrub visited clean at exit
            if not e.visited or me not in e.visited:
                try:
                    e.visited.add(me)
                    return f_memo(*args, **kwargs)
                finally:
                    e.visited.clear() if mrproper else e.visited.remove(me)
            else:  # cycle detected
                return bottom(f_fix.__name__, *args, **kwargs)
        f_fix.entrypoint = f  # just for information
        return f_fix
    return decorator

This version retains the property that it re-uses known results also in the middle of a call chain.

As for thread safety, memoize itself does nothing special, and the memo is shared between threads. At most, this leads to a few spurious overwrites in the memo with new instances of the same value (because f is assumed pure), until all threads agree that the key already exists (at which point the memo actually kicks in). The actual dictionary write is one bytecode instruction, so it is atomic. See the relevant discussion on StackOverflow.

The memo lives in the closure of f_fix. Exceptions are memoized, too. We call bottom at most once for each discovered infinite loop. Even if bottom blows up, the exception just ends up in the memo (there is at least one f_memo call active when bottom is called, because e.visited is empty at the start of the call chain).

We no longer need unwrap, since there's no need to inspect the return value. Reading the original code again, I'm no longer sure what n and the while loop are for, as they only surround the outermost call, but at least for fix to work as advertised in unpythonic they're not needed.

The important property used by fix is that our call to f (or to f_memo) will call the original function, but when the original function calls itself, it actually calls our f_fix because name lookup is dynamic.

TODO: Merge this in, think a bit more, and then update the docs. Think about TCO.

Uniquely identifying a set of arguments

We currently identify a set of arguments (for caching purposes), passed in as *args and **kwargs, using the following k as the key to a cache dictionary:

from operator import itemgetter
k = (args, tuple(sorted(kwargs.items(), key=itemgetter(0))))

(This functionality is only relevant for pure functions. Assume also that all relevant data is hashable.)

However, consider:

def f(a):
    ...

f(42)    # --> args = (42,), kwargs = {}
f(a=42)  # --> args = (), kwargs = {'a': 42}

The value of k will be different, even though exactly the same arguments are bound in the body of f! (Namely, in both cases, the formal parameter a has been bound to the value 42.)

This affects at least @fix, @memoize and @gmemoize, and at least in an ideal world, should be considered a bug. Grep the codebase for "kwargs.items" to find all instances.

This is posted as #discussion and attached to the ∞ milestone mainly because at the moment of this writing, I don't yet have a plan.

So, if you're reading this and want to take it on: the question is, how to compute k such that the computed value (or its hash) uniquely represents a set of argument bindings? We can consider f as fixed; all use cases of this pattern are in decorators that already know which function they're being applied to.

Suggestions are welcome - it would be nice to fix this.

This gotcha was noticed by the author of the wrapt library, and is mentioned in its documentation. See the paragraphs beginning "If needing to modify certain arguments..." and "You should not simply attempt to extract positional arguments..."
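One possible approach (a sketch only, not something unpythonic currently does): normalize the arguments against the function's signature before computing the key, e.g. via inspect:

    import inspect

    def argument_key(f, args, kwargs):
        """Map arguments to the formal parameters they actually bind to."""
        bound = inspect.signature(f).bind(*args, **kwargs)
        bound.apply_defaults()
        # Caveat: a parameter declared as **kw binds to a dict, which is not
        # hashable; a real implementation would need to convert that, too.
        return tuple(sorted(bound.arguments.items()))

    def f(a):
        pass

    assert argument_key(f, (42,), {}) == argument_key(f, (), {"a": 42})

The run-time cost of inspect.signature on every call might be a concern; the signature could be computed once, in the decorator.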

Make implementation of chunked more pythonic

Using ideas from Svein Lindal and OrangeDog on StackOverflow, we can shorten chunked by one line:

from itertools import islice

from unpythonic import scons  # prepend one item to an iterable

def chunked(n, iterable):
    it = iter(iterable)
    try:
        while True:
            cit = islice(it, n)
            yield scons(next(cit), cit)
    except StopIteration:
        return

Teaching note: for a new Python user to understand this, the important points are that both it and cit advance when we read from cit, and the manually applied next(cit) lets us actually see the StopIteration when the first empty slice occurs. Besides extracting an item, there is no way to check whether an iterator is empty. (There can't be - in general, an iterator is just a suspended function, and whatever happens when its execution is eventually resumed is what dynamically determines whether there are any more items.) islice, on its part, would just happily keep producing empty slices forever.

This version accounts for the fact that, as of Python 3.6, generators should return instead of raising StopIteration (PEP 479), and while not that much shorter, it looks more pythonic than the code added in 994dd55. (Here the exceptional condition is handled on the outside, for a clearer presentation.)

Generalize return values

As of 0.14.1, in many places involving function composition, unpythonic uses the convention that returning a tuple means returning multiple values positionally. These are then unpacked, positionally, as the arguments of the next function in the chain.

From a composition viewpoint, what is currently lacking are named return values, in symmetry with named arguments.

The syntax could be something like this:

def f(...):
    ...
    return values(my, positional, values), named(my=..., named=..., values=...)

So instead of just a tuple, accept any of:

  • values (semantic wrapper for tuple) ⇒ positional values
  • named (semantic wrapper for dict) ⇒ named values
  • length-2 tuple of (values or any type, named) ⇒ both positional and named values
  • else ⇒ one positional return value, the usual case

The semantic wrappers could be:

class values(tuple):
    """Multiple return values. Wrapper for tuple."""
class named(dict):
    """Named return values. Wrapper for dict."""

The important points are to make it possible to tell these apart from the respective base types using isinstance, yet let them act just like the corresponding base type (including passing any isinstance checks) in code that doesn't need this distinction.
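A sketch of how a composition utility might dispatch on these (hypothetical helper, using the wrappers above):

    def unpack_return(result):
        """Split a return value into (args, kwargs) for the next function."""
        if isinstance(result, values):    # check before plain tuple: values is a tuple
            return tuple(result), {}
        if isinstance(result, named):
            return (), dict(result)
        if (isinstance(result, tuple) and len(result) == 2
                and isinstance(result[1], named)):
            rest, kws = result
            args = tuple(rest) if isinstance(rest, values) else (rest,)
            return args, dict(kws)
        return (result,), {}  # the usual case: one positional return value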

This would be a breaking change, so 0.15.0 is the earliest where we could do this. This would improve the semantics, but needs updates to everything across the codebase where multiple return values are handled (grep the codebase for multiple, including any comments). The most fearsome part requiring changes is the continuations macro.

The lazify macro also needs some changes; lazyrec needs to recognize the values and named constructors. Adding them to the list of recognized constructors should be enough; with exactly the same settings as the builtin tuple and dict.

Probably best to start by changing the convention of a tuple return value to use an explicit values instead. This would already improve the semantics, allowing us to get rid of many 1 variants of functions (composer1, iterate1, etc.). If the old semantics are needed for compatibility reasons, they can be restored with a decorator:

def valuify(f):
    """Decorator. If `f` returns `tuple`, cast into `values`, else pass through."""
    @wraps(f)
    def valuified(*args, **kwargs):
        result = f(*args, **kwargs)
        if type(result) is tuple:  # yes, exactly tuple
            result = values(result)
        return result
    return valuified

@valuify
def f():
    return (1, 2, 3)

assert isinstance(f(), values)

Add unpythonic.it.pad

Occasionally, padding with a fill value is needed for things such as chunking. Currently, this is what we get:

from unpythonic import chunked

chunks = chunked(3, range(9))
assert [tuple(chunk) for chunk in chunks] == [(0, 1, 2), (3, 4, 5), (6, 7, 8)]
chunks = chunked(3, range(7))
assert [tuple(chunk) for chunk in chunks] == [(0, 1, 2), (3, 4, 5), (6,)]

What if we wanted [(0, 1, 2), (3, 4, 5), (6, None, None)], for example to be able to always unpack a, b, c = chunk?

To keep things orthogonal, it could be useful to provide a separate pad(n, fillvalue, iterable) function, which pads an iterable with copies of fillvalue if its length is smaller than n. Then we could write:

from unpythonic import chunked, pad  # "pad" doesn't exist yet

chunks = chunked(3, range(7))
chunks = [pad(3, None, chunk) for chunk in chunks]
assert [tuple(chunk) for chunk in chunks] == [(0, 1, 2), (3, 4, 5), (6, None, None)]

Or even:

from unpythonic import chunked, curry, pad  # "pad" doesn't exist yet

chunks = chunked(3, range(7))
pad_to_3_with_None = curry(pad, 3, None)
chunks = [pad_to_3_with_None(chunk) for chunk in chunks]
assert [tuple(chunk) for chunk in chunks] == [(0, 1, 2), (3, 4, 5), (6, None, None)]

This can be implemented in terms of itertools.islice and collections.deque, a bit similarly to butlastn or lastn.
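For reference, a minimal sketch that pads only when needed and passes longer inputs through unchanged (itertools.repeat yields nothing when given a non-positive count, so no special-casing is needed):

    from itertools import repeat

    def pad(n, fillvalue, iterable):
        """Pad iterable with copies of fillvalue to a length of at least n. Generator."""
        k = 0
        for x in iterable:
            k += 1
            yield x
        yield from repeat(fillvalue, n - k)

    assert tuple(pad(3, None, (6,))) == (6, None, None)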

Fix flake8 warnings as far as sensible

Especially in the parts of unpythonic that are not macro-related.

There's at least a variable name l in unpythonic.collections, and some nonstandard uses of whitespace all around.

Not conforming to PEP8 (as far as sensible; remember "Foolish consistency...") should be considered a bug.

map() override for curry-friendliness

The builtin map's call signature accepts zero iterables as input, but obviously it does nothing useful unless at least one iterable is provided.

It would be much more curry-friendly to encode this fact in the call signature, requiring at least one iterable.

To go with the rest of the industrial-strength (read Racket-inspired) iterable tools, we could provide a trivial wrapper in unpythonic.it to do exactly this:

builtin_map = map
def map(function, iterable0, *iterables):
    return builtin_map(function, iterable0, *iterables)

This way curry(map, f) would know not to trigger the call until at least one iterable is present.

The downside is that from unpythonic import map looks intimidating - someone's overriding a builtin, halp! Even though in reality it's trivial and completely safe, the reader must look at one more definition to be sure.
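Usage would then look like this (a sketch, assuming the wrapper above is what from unpythonic import map brings in):

    from unpythonic import curry

    double_all = curry(map, lambda x: 2 * x)  # waits: iterable0 not yet bound
    assert list(double_all(range(3))) == [0, 2, 4]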

Roadmap planning

Decisions, decisions...

  • Feature-freeze 0.14.2 now?
    • No: some more low-hanging fruit? #36? #11?
    • Yes: we already have a nice set of new functionality, not to mention the improved docs.
  • Rename 0.14.2 as 1.0 and start using semantic versioning?
    • If so, update the roadmap: 0.14.x → 1.x, 0.15 → 2.0.

Rename lazycall

The current name lazycall is misleading (hinders readability, hence a bug). Based on what it actually does, maybe_force_args or something similar would be more descriptive.

This is an internal implementation detail, so it could be renamed already in 0.14.2.

Injecting variables into locals

In short: in recent Pythons, impossible, by design. This is probably a good thing.

In Python ≤ 3.6, as discussed on stupidpythonideas, there was PyFrame_LocalsToFast in the CPython C API, which could be accessed via ctypes. This no longer works in 3.7 and later (PEP 558).

The alternative of using inspect.stack() and mutating the f_locals attribute of the frame object doesn't work, either.

See also Everything in Python is mutable.

(This issue is intentionally left open to collect links to related reading.)
