austinzheng / cormorant
Clojure(ish) interpreter in Swift
The interpreter should be able to specify inputs and outputs.
Vector should be supported in function position:
(["first" "second" "third"] 1) should return "second"
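A minimal sketch of what calling a vector might look like on the Swift side. The names `VectorIndexError` and `invokeVector` are hypothetical and not part of the interpreter; the sketch assumes an out-of-range index should be an error, as it is in Clojure.

```swift
/// Hypothetical error for an out-of-range index.
struct VectorIndexError: Error {
    let index: Int
}

/// Treats `vector` as a function of one argument: the index to look up.
func invokeVector<T>(_ vector: [T], _ index: Int) throws -> T {
    guard vector.indices.contains(index) else {
        throw VectorIndexError(index: index)
    }
    return vector[index]
}
```

With this, `try invokeVector(["first", "second", "third"], 1)` yields "second".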
Interpreter should support the syntax-quote (`), unquote (~), and unquote-splice (~@) reader macros, including nested syntax-quotes.
Interpreter should support keywords, which are special literals defined using a leading colon: :key, :someSymbol. Support exists already in the lexer and parser, but currently none of the internal code is equipped to deal with keywords. Keywords should also be interned if possible.
Support for future should be feasible via GCD. This is an experimental feature. It may also be possible to extend this feature to support promise and deliver.
Lazy sequences are one of Clojure's most powerful features. Unlike (e.g.) STM, a decent and useful implementation of lazy sequences should be feasible for Lambdatron.
Seems that no matter what test I try and run I always end up with this:
dyld`dyld_fatal_error:
0x7fff5fc0109c: int3
0x7fff5fc0109d: nop
Running Xcode 6.1.1 under OS X 10.9.5.
last commit is
commit cb244f57f02db6ad19fc61d35dad5b0ae3779cba
Author: Austin Zheng <[email protected]>
Date: Sun Mar 8 15:57:07 2015 -0700
Adding unit tests
Changes:
- Added unit tests for the '.dissoc' function
- Fixed .dissoc's arity error message
'cons', 'first', and 'rest' should be built-in functions, not special forms. 'apply' should be a special form.
Right now Strings are being munged into NSStrings whenever it's necessary to perform operations on specific characters. At some point this should be fixed so that either NSString conventions or native Swift string conventions are used everywhere.
Now that there's a (very basic) unit test mechanism, unit tests should be written to cover as much of the implementation as possible.
If unquote or unquote-splice is applied on a composite form (e.g. ~(a b) or ~@(a b)), the complex form isn't properly handled if it in turn has elements that are quoted or syntax-quoted. The unit tests I just pushed should highlight this deficiency.
Should support gensym for easier macro definition.
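One way gensym could work is a simple monotonically increasing counter. This is only a sketch: `GensymGenerator` is a hypothetical name, and symbols are represented as plain strings here rather than as the interpreter's own symbol type. The default "G__" prefix mirrors Clojure's.

```swift
/// Hypothetical generator of unique symbol names.
final class GensymGenerator {
    private var counter = 0

    /// Returns a name that this generator has not produced before.
    func gensym(prefix: String = "G__") -> String {
        counter += 1
        return "\(prefix)\(counter)"
    }
}
```

Uniqueness only holds among generated names; a real implementation would also need to make sure user-written symbols cannot collide with them.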
Strings (e.g. when using the built-in describe API) aren't being escaped properly. They should be escaped the same way strings are in Clojure.
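A rough sketch of the escaping step, assuming the output should be wrapped in double quotes with backslash escapes. `clojureEscaped` is a hypothetical helper name, and only a common subset of escapes (quote, backslash, newline, tab, carriage return) is handled.

```swift
/// Hypothetical helper: returns `s` wrapped in double quotes, with special
/// characters replaced by backslash escapes.
func clojureEscaped(_ s: String) -> String {
    var out = "\""
    // Iterate over scalars rather than Characters so that "\r\n", which Swift
    // treats as a single Character, is still escaped piece by piece.
    for scalar in s.unicodeScalars {
        switch scalar {
        case "\"": out += "\\\""
        case "\\": out += "\\\\"
        case "\n": out += "\\n"
        case "\t": out += "\\t"
        case "\r": out += "\\r"
        default: out.unicodeScalars.append(scalar)
        }
    }
    out += "\""
    return out
}
```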
Macros need to capture their context when defined, as functions do right now. For example:
(let [a 10] (defmacro mac1 [arg1] (list '+ a arg1)))
should bind 10 to a, no matter what context the macro is called in:
(let [a 12] (mac1 10)) ; returns 20, NOT 22
(mac1 500) ; returns 510; does not throw an error
Interpreter should support maps, defined with the { key1 value1 key2 value2 } literal syntax. Maps should be valid in function position or as functions, as with Clojure.
There's a problem with 'apply'.
Once the list has been constructed as part of the call to 'apply', the elements inside the list should NOT be evaluated again before being passed to the function.
In Clojure:
user=> (seq (concat (list 'ABC) (list 'def)))
(ABC def)
user=> (apply vector (seq (concat (list 'ABC) (list 'def))))
[ABC def]
Here:
> (seq (concat (list 'ABC) (list 'def)))
(ABC def)
> (apply vector (seq (concat (list 'ABC) (list 'def))))
Evaluation error (InvalidSymbolError): could not resolve the symbol
Another example: user=> (apply + `((+ 1 2) 3)) fails (attempted Cons cast to Number), since the list's items (+ 1 2) and 3 are NOT re-evaluated before being passed to + internally.
Right now >= and <= don't exist; they should be implemented as built-ins like the (extant) < and >. This is a very simple task.
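A possible shape for the two built-ins, sketched under the assumption that arguments have already been evaluated to Swift `Double` values. All function names here are hypothetical. Like Clojure's versions, the comparison is chained across every adjacent pair of arguments.

```swift
/// Returns true when every adjacent pair in `args` satisfies `holds`.
/// With a single argument there are no pairs, so the result is true.
func comparesInOrder(_ args: [Double], _ holds: (Double, Double) -> Bool) -> Bool {
    for (a, b) in zip(args, args.dropFirst()) where !holds(a, b) {
        return false
    }
    return true
}

/// Sketch of `<=`.
func lessThanOrEqual(_ args: [Double]) -> Bool {
    return comparesInOrder(args) { $0 <= $1 }
}

/// Sketch of `>=`.
func greaterThanOrEqual(_ args: [Double]) -> Bool {
    return comparesInOrder(args) { $0 >= $1 }
}
```

The sketch does not reject an empty argument list; a real built-in would presumably report an arity error there.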
After gensym is implemented, symbols postfixed by a # within a syntax-quote should be reader-expanded into gensym.
Right now, error handling is implemented as a simple enum.
Instead, the error case should hold a struct with a dictionary and an error enum. The dictionary can contain domain-specific information that the interpreter can use to display a more helpful error message.
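A sketch of what such an error value might look like. The type names, case names, and dictionary keys below are all hypothetical and chosen only for illustration.

```swift
/// Hypothetical set of error categories.
enum ErrorKind {
    case arityError
    case invalidSymbolError
    case invalidArgumentError
}

/// Hypothetical error payload: the category plus free-form details.
struct ErrorDetails: Error {
    let kind: ErrorKind
    var info: [String: String] = [:]

    /// Builds a message from the kind and the details, sorted by key so the
    /// output is stable.
    func describe() -> String {
        let details = info
            .sorted { $0.key < $1.key }
            .map { "\($0.key)=\($0.value)" }
            .joined(separator: ", ")
        return details.isEmpty ? "\(kind)" : "\(kind) (\(details))"
    }
}
```

For example, `ErrorDetails(kind: .arityError, info: ["function": ".dissoc", "expected": "2"]).describe()` gives "arityError (expected=2, function=.dissoc)".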
Interpreter should support sets, as well as a set API. Sets should be definable using the literal syntax #{ item1 item2 }. This also involves making ConsValue fully hashable and equatable. Implementation will probably use NSSet, since Swift does not yet have a native set type or API.
Implement 'map' using the stdlib and/or as a built-in fn. (Lazy support not necessary for now.)
http://conj.io/store/org.clojure/clojure/1.7.0-alpha4/clojure.core/map/
Support dynamic scoping of vars through the binding form.
Functions that prepend to a list (e.g. concat) should be optimized so that, if nothing is appended afterwards, the list isn't copied.
Add support for the inline function reader macro (e.g. #(doSomething %0 %1)). Expanding this macro is relatively straightforward, with the caveat that the expander needs to know how many %n tokens there are (or if only % is used).
A good resource is: http://stackoverflow.com/questions/13204993/anonymous-function-shorthand
Implement namespaces in Lambdatron per Clojure's pattern. Following is a rough sketch of work required to do so.
all-ns and passed as arguments to other things. Note that Clojure doesn't seem to expose a way to refer literally to a namespace, and namespaces seem to only make sense in the global context.
user/a) and keywords (::a). ::a expands to user/a in the user namespace, but example/a in the example namespace.
One of Clojure's most powerful features is pattern-based destructuring in the argument vectors for let and fn: http://blog.jayfields.com/2010/07/clojure-destructuring.html
Figure out how it works and implement.
A very brief examination of the program's memory and CPU usage characteristics using the Instruments tool seems to indicate that the vast majority of resource usage is due to the inefficient way symbols and bindings are represented. Symbols and bindings should be interned, and the string names should only be used if the user requests them.
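A minimal sketch of interning, assuming symbols can be represented by small integer identifiers. `InternedSymbol` and `SymbolTable` are hypothetical names. Each distinct name gets one identifier, comparisons and hashing work on the integer, and the string is only looked up when it has to be shown to the user.

```swift
/// Hypothetical lightweight handle for a symbol.
struct InternedSymbol: Hashable {
    let id: Int
}

/// Hypothetical table mapping names to identifiers and back.
final class SymbolTable {
    private var idsByName: [String: Int] = [:]
    private var names: [String] = []

    /// Returns the existing handle for `name`, or creates a new one.
    func intern(_ name: String) -> InternedSymbol {
        if let id = idsByName[name] {
            return InternedSymbol(id: id)
        }
        let id = names.count
        names.append(name)
        idsByName[name] = id
        return InternedSymbol(id: id)
    }

    /// Recovers the original name, e.g. for printing.
    func name(of symbol: InternedSymbol) -> String {
        return names[symbol.id]
    }
}
```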
Support metadata.
Right now, a problem while evaluating a form causes the interpreter to quit (due to placeholder asserts). Instead, the interpreter should show an informative error message and return to the 'read' portion of the loop.
Add support for the var special form, deref function, and the reader forms #' (which expands to (var a)) and @ (which expands to (deref a)).
Unit tests should be moved from being an in-application module to the XCTest framework.
I think this is possible:
import Lambdatron
Will do some testing.
Add support for atoms. To quote the ClojureScript documentation: "Clojure's model of values, state, identity, and time is valuable even in single-threaded environments." However, atoms should be built in a way that proper concurrency support can easily be added if it ever comes to this project.
Lambdatron should support character literals like Clojure (e.g. \a, \b, \c), and associated functionality. Note that this is complicated by Swift's insane (i.e. properly Unicode-compliant) string handling.
Keyword and symbol should be supported in function position. Intended behavior:
> ('a {10 "hello"})
nil
> ('a {'a "hello"})
"hello"
> (:a {:a "hello"})
"hello"
> ('a {'b "hello"} "notfound")
"notfound"
> (:a {:b "hello"} "notfound")
"notfound"
(x map) should be an alias for (get map x), and (x map notfound) should be an alias for (get map x notfound).
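The aliasing could be sketched like this. `Atom`, `lookup`, and `applyInFunctionPosition` are hypothetical names, maps are modeled as plain Swift dictionaries, and Swift's `nil` stands in for Clojure's nil.

```swift
/// Hypothetical simplified value type, just enough for map keys and values.
enum Atom: Hashable {
    case integer(Int)
    case string(String)
    case symbol(String)
    case keyword(String)
}

/// Sketch of (get map key) and (get map key notfound).
func lookup(_ map: [Atom: Atom], _ key: Atom, notFound: Atom? = nil) -> Atom? {
    return map[key] ?? notFound
}

/// Sketch of evaluating (x map) or (x map notfound) when x is a symbol or keyword.
func applyInFunctionPosition(_ head: Atom, _ map: [Atom: Atom], notFound: Atom? = nil) -> Atom? {
    switch head {
    case .symbol, .keyword:
        return lookup(map, head, notFound: notFound)
    default:
        // Other atoms are not callable in this sketch; a real interpreter
        // would raise an evaluation error instead of returning nil.
        return nil
    }
}
```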
Right now, reader macros (currently quote, syntax-quote, unquote, and unquote-splice) are implemented as a certain case of ConsValue. The lexer and parser produce a ConsValue containing unexpanded reader macros, which is then fed into an expansion function.
This has a couple of advantages. Since reader macros are treated as ordinary macros or functions in structure, the expander can take advantage of the tree structure of the ConsValue. For example, `(a b) becomes (sq* (a b)), which allows it to be processed as a normal list.
However, it also has some glaring disadvantages. The first is that reader macro objects are exposed to the public through ConsValue.ReaderMacro. This exposes an implementation detail, since end users are never expected to be allowed to build code with the function-position forms of the reader macros.
The second is that reader macros should not all be expanded at the same time. Instead, there should be different types of reader macros, and different functions which expand different macros. However, implementing this would require even more cases on ConsValue that should never have been exposed to the user in the first place.
A complication is that Cons expects to hold a ConsValue. This means it's difficult to have the reader output (e.g.) an UnexpandedConsValue including list items. Cons could be subclassed, but that would mean unsealing the class (which I'd rather not do).
The error system is tolerable right now, but could be significantly better. In particular, decent stacktraces should be possible via the interpreter's Context stack system. Not sure if line info metadata can be added, but that would be nice as well.
Macros shouldn't evaluate their arguments when being expanded. A macro's arguments should be inserted as literals into the list (or other form) being constructed by the macro, and only executed (if appropriate) when the macro's output form is executed after expansion is complete.
Interpreter should distinguish between integers and floating-point values, but this should be transparent to the user. Something like (+ 1.2 3 4.5 6) should work as expected. Arithmetic that only uses integers should only use integer operations internally.
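A sketch of one way to get this behavior, assuming a two-case numeric type. `Number`, `addNumbers`, and `addAll` are hypothetical names. Integers stay integers until a floating-point operand shows up, at which point the running total is promoted.

```swift
/// Hypothetical numeric value: either an integer or a floating-point number.
enum Number: Equatable {
    case integer(Int)
    case float(Double)
}

/// Adds two numbers, promoting to floating point only when needed.
/// Integer addition traps on overflow in this sketch.
func addNumbers(_ lhs: Number, _ rhs: Number) -> Number {
    switch (lhs, rhs) {
    case let (.integer(a), .integer(b)): return .integer(a + b)
    case let (.integer(a), .float(b)): return .float(Double(a) + b)
    case let (.float(a), .integer(b)): return .float(a + Double(b))
    case let (.float(a), .float(b)): return .float(a + b)
    }
}

/// Sketch of a variadic `+`: folds the arguments starting from integer 0.
func addAll(_ args: [Number]) -> Number {
    return args.reduce(.integer(0), addNumbers)
}
```

Here `addAll([.integer(1), .integer(2)])` stays `.integer(3)`, while the arguments of (+ 1.2 3 4.5 6) produce a `.float` result.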
Implement 'reduce'.
http://conj.io/store/org.clojure/clojure/1.7.0-alpha4/clojure.core/reduce/
Implement 'filter'.
http://conj.io/store/org.clojure/clojure/1.7.0-alpha4/clojure.core/filter/
The function param system needs extensive work and correctness verification. Here are some things that should happen.
For a fn, a param is resolved to its base value even before the function takes control. So arg1 --> 10.
> (def a 10)
> (def b a)
> (defn foo [arg1] arg1)
> (foo b)
; foo sees 10
10
For a macro, a param maps directly to the value that was passed in. So arg1 --> b.
> (defmacro bar [arg1] arg1)
> (macroexpand '(bar b))
; bar sees 'b'
b
Once inside the fn, params resolved to symbols are never further resolved.
> (foo 'b)
b
> (macroexpand '(bar b))
b
If a function is passed a list, that list is eval'ed when the param values are being calculated:
> (foo (+ 1 2))
; foo sees 3
3
If a macro is passed a list, that list is treated verbatim: arg1 -> (+ 1 2)
> (macroexpand '(bar (+ 1 2)))
; bar sees (+ 1 2)
(+ 1 2)
Example A:
> (defn ftest [a b c] (a b c))
> (ftest + 1 2)
3
;; This one is weird. Not sure what it's doing. It always returns the last element.
;; I would expect it to execute (+ 1 2) and return 3 as the macroexpansion.
> (defmacro mtest [a b c] (a b c))
> (mtest + 1 2)
2
> (defmacro mtest2 [a b] (+ a b))
> (mtest2 1 2)
3
It looks like, for functions, a list (arg1 ...) [where arg1 is a function passed in as a param] is treated as a function call properly. But for macros, a list (arg1 ... last_item) just returns last_item. (This doesn't have anything to do with the code the macro is emitting. It's about how the macro runs computations internally.)
However, normal function definitions (where the item in function position isn't a param) work fine in a macro, as seen in mtest2.
Example B:
> (defmacro mtest [a b] (list [(+ a 1) (+ b 2)] [a b]))
> (macroexpand-1 '(mtest 10 20))
([11 22] [10 20])
It's clear that the code inside the macro is resolving a = 10 and b = 20 and then performing arithmetic on those values to generate the output.
However, if we try:
> (macroexpand-1 '(mtest (+ 5 5) 20))
we get an exception that says that a PersistentList can't be cast to a Number. Basically the macro body is attempting to eval (+ (+ 5 5) 1), but the (+ 5 5) cannot be further eval'ed.
But...
> (defmacro mtest [a b c] [(+ a (+ b c))])
> (mtest 10 1 2)
[13]
This is fine, since it's completely evaluating (+ 10 (+ 1 2)).
However,
> (mtest (+ 5 5) 1 2)
gives the same error. The body is (+ (+ 5 5) (+ 1 2)), but the (+ 5 5) cannot be evaluated (since it's an argument, an unevaluable list), while the (+ 1 2) can be since it's part of the macro code.
A final example, one with vectors:
> (defmacro mtest2 [a b c] (list a [(+ b 1) (+ c 1)]))
> (macroexpand-1 '(mtest2 [(+ 12 1) (+ 13 1)] 12 13))
([(+ 12 1) (+ 13 1)] [13 14])
Again we note that the vector argument a is treated as atomic; its constituents are not evaluated themselves.
Therefore, a few things need to happen:
An interesting phenomenon:
Say that a function takes args [x y & more]. Later on there is a recur statement of the form (recur y (first more) (rest more)).
Clojure binds the sequence formed by (rest more) directly to the vararg 'more', rather than making a list out of it (as might be expected).
This came up when trying to implement == as a stdlib function: the current 'recur' doesn't rebind directly, causing the list to become more and more nested. This should be fixed.
The current loader sucks, as does the global context system.
The loader needs to be rebuilt so that:
The attempt special form should be replaced by a proper try-catch-finally, with reified exceptions holding EvalErrors and the ability to catch specific error types. This is necessary for building binding or any of the concurrent var-related features.
Right now slots in a Context are simulated as a dictionary. This is wasteful, especially since:
Need to determine a better way to store bindings. Unfortunately Swift doesn't yet support fixed-length arrays.
Re-implement mod using the Clojure stdlib definition:
(defn mod
[num div]
(let [m (rem num div)]
(if (or (zero? m) (= (pos? num) (pos? div)))
m
(+ m div))))
This means also implementing pos? (and neg?).
Also, implement rem:
(defn rem
[num div]
(. clojure.lang.Numbers (remainder num div)))
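For reference, the same logic could be sketched natively in Swift, assuming integer operands only. `clojureRem` and `clojureMod` are hypothetical names. Swift's `%` truncates toward zero, so it plays the role of rem here, and mod adjusts the result to take the sign of the divisor.

```swift
/// Sketch of rem: the remainder has the sign of the dividend.
/// Traps if `div` is zero.
func clojureRem(_ num: Int, _ div: Int) -> Int {
    return num % div
}

/// Sketch of mod, following the Clojure definition quoted above:
/// the result has the sign of the divisor.
func clojureMod(_ num: Int, _ div: Int) -> Int {
    let m = clojureRem(num, div)
    if m == 0 || ((num > 0) == (div > 0)) {
        return m
    }
    return m + div
}
```

For example, `clojureRem(-7, 3)` is -1 while `clojureMod(-7, 3)` is 2.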
Right now "interfaces" (not to be confused with Swift protocols) are implemented in an implicit, ad-hoc way. Need a better way to implement sequence, evalable, etc. functionality in a way compatible with sum types.
Depends on #13. Implement 'gensym', including the built-in function itself, as well as support for the '#' reader macro.
Support regexes (probably using NSRegularExpression).
#"regex" should be parsed as a regex
re-find, re-seq, re-matches, re-pattern, re-matcher, re-groups, replace, replace-first, re-quote-replacement
Right now, RecurSentinel is a ConsValue case, and must be awkwardly handled in many places throughout the interpreter. Investigate whether making it a third EvalResult case instead would clean up the code, and if so implement.