Interfacing directly with the compiler (nicolas-p/skov)

Comments (19)

nicolas-p commented on June 15, 2024

This is the current architecture:

This is what I am planning:

What is interesting is that there is now a path from Factor code to the user, which means that all currently existing Factor words could be displayed in the help browser as graphs and modified by the user. This would create the illusion that the entire system was implemented in Skov from the start.

from skov.

mrjbq7 commented on June 15, 2024

That's very cool! I hope it works!

The compiler has a few other tags in the tree that you might not care about or would have to translate around, for example when inlining takes place, this word:

: contents ( -- seq ) input-stream get stream-contents ; inline

actually becomes:

: contents ( -- seq ) input-stream 0 context-object assoc-stack stream-contents ;

because get inlines as CONTEXT-OBJ-NAMESTACK context-object assoc-stack

So the tree looks like:

V{
    T{ #push { literal input-stream } { out-d { 7869178 } } }
    T{ #push { literal 0 } { out-d { 7869179 } } }
    T{ #call
        { word context-object }
        { in-d V{ 7869179 } }
        { out-d { 7869180 } }
    }
    T{ #declare { declaration { { 7869180 vector } } } }
    T{ #call
        { word assoc-stack }
        { in-d V{ 7869178 7869180 } }
        { out-d { 7869181 } }
    }
    T{ #call
        { word stream-contents }
        { in-d V{ 7869181 } }
        { out-d { 7869182 } }
    }
    T{ #return { in-d V{ 7869182 } } }
}

You'll also find #phi, #recursive, #copy, #shuffle, and other tags that are used by the compiler, so it's a great idea, but it also has some caveats!


nicolas-p commented on June 15, 2024

I looked a bit deeper and my fears are confirmed: it would be trivial to generate the graph for the example above, but when there are lambdas and inline words, I don't know what to do.

Ideally, for this code:

I would like to give this to the compiler:

V{
    T{ #push 
        { literal { 1 2 3 } } 
        { out-d { 0000001 } }
    }
    T{ #push 
        { lambda 
            V{ 
                T{ #introduce 
                    { out-d { 0000011 } } 
                }
                T{ #call
                    { word neg }
                    { in-d V{ 0000011 } }
                    { out-d { 0000012 } }
                }
                T{ #return 
                    { in-d V{ 0000012 } } 
                }
            }
        } 
        { out-d { 0000002 } }
    }
    T{ #call
        { word map }
        { in-d V{ 0000001 0000002 } }
        { out-d { 0000003 } }
    }
    T{ #call
        { word display }
        { in-d V{ 0000003 } }
        { out-d { } }
    }
    T{ #return { in-d V{ } } }
}

I don't know if there's any (simple) way to make the compiler understand this.


mrjbq7 commented on June 15, 2024

You might ping @bjourne for his thoughts on (ab)using the compiler tree this way. :-)


mrjbq7 commented on June 15, 2024

Plus he might not have seen Skov yet!


bjourne commented on June 15, 2024

I have seen it. :) But I haven't had time to look at it, so I haven't come up with any meaningful feedback. @nicolas-p, Factor doesn't have lambdas; it has quotations, and those are literals too. They are like regular arrays with some extra metadata. So you can just emit:

V{
    T{ #push { literal { 1 2 3 } } { out-d { 8590649 } } }
    T{ #push { literal [ 3 + ] } { out-d { 8590650 } } }
    T{ #call
        { word map }
        { in-d V{ 8590649 8590650 } }
        { out-d { 8590651 } }
    }
    T{ #return { in-d V{ 8590651 } } }
}

The compiler should be able to take that and make working code from it.


nicolas-p commented on June 15, 2024

In your example, I have circled the quotation part in blue:

The problem is that the quoted part of the graph is (obviously) represented in the same form as the rest of the graph. There is a "3" node with an output number, and an "add" node with an input number and an output number.

I can't just convert this to [ 3 add ]. On a simple example like this, it wouldn't be hard, but in the general case, I would need to be able to convert arbitrarily complex graphs to Factor code (not using locals). I can't do this, and it would go against what I'm trying to do, which is simply giving the original graph to the compiler.

An example of something more complex, with nested quotations:


bjourne commented on June 15, 2024

You'll probably need to support subtrees; otherwise, in your example, it is not clear why [ 3 add ] is the quotation. E.g., you want it to represent { 1 2 3 } [ 3 add ] map, but it could also mean { 1 2 3 } 3 add map.


nicolas-p commented on June 15, 2024

All the examples I've shown that use map, reject, change file lines etc. already run perfectly fine. The system is able to determine that reject expects a quotation as its second input, so it creates a lambda over the entire sub-tree that is connected to this input.
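The idea of wrapping everything connected to a quoted input can be sketched as a small graph walk. This is a toy model in Python, not Skov's actual code: the graph layout, the node names and the `QUOTED_INPUTS` table are all assumptions made up for illustration.

```python
# Hypothetical table: which input positions of a word expect a quotation.
QUOTED_INPUTS = {"map": {1}, "reject": {1}}

# Hypothetical graph for { 1 2 3 } [ 3 add ] map.
# The second input of "add" is left unconnected: it would become the
# parameter of the lambda.
GRAPH = {
    "lit_seq": {"word": "{ 1 2 3 }", "inputs": []},
    "lit_3": {"word": "3", "inputs": []},
    "add": {"word": "add", "inputs": ["lit_3"]},
    "map": {"word": "map", "inputs": ["lit_seq", "add"]},
}


def upstream(graph, start):
    """Return every node reachable by following inputs from `start`."""
    seen, todo = set(), [start]
    while todo:
        name = todo.pop()
        if name in seen:
            continue
        seen.add(name)
        todo.extend(graph[name]["inputs"])
    return seen


def lambda_subtrees(graph):
    """Map each (node, quoted input index) to the sub-tree feeding it."""
    result = {}
    for name, node in graph.items():
        for index in QUOTED_INPUTS.get(node["word"], ()):
            result[(name, index)] = upstream(graph, node["inputs"][index])
    return result


print(lambda_subtrees(GRAPH))
```

Here the sub-tree for the second input of map contains only the "add" and "3" nodes, which is the part that would end up inside the lambda.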


nicolas-p commented on June 15, 2024

I have tried to see the machine code generated for the simple example that we discussed above.

This is when Factor does things normally:

That's a lot of stuff because map is inlined.

Now I redefine build-tree to output this tree directly instead of doing any computation:

V{
    T{ #push { literal { 1 2 3 } } { out-d { 8590649 } } }
    T{ #push { literal [ 10 + ] } { out-d { 8590650 } } }
    T{ #call
        { word map }
        { in-d V{ 8590649 8590650 } }
        { out-d { 8590651 } }
    }
    T{ #return { in-d V{ 8590651 } } }
}

And this is the resulting machine code:

There's a lot less stuff because map is not inlined, am I right?
I have checked that this runs and gives the expected result.

So if I managed to generate this tree from Skov and give it directly to the compiler, bypassing build-tree, the code would run but words declared inline would never actually be inlined, and the performance would be degraded, am I still right?

So starting from the tree given above, is there a word that would transform it to another tree with map inlined? I have tried words like do-inlining or inline-word, but I think I didn't use them properly.


nicolas-p commented on June 15, 2024

The optimizing compiler doesn't process quotations; it just pushes them like other sequences. Quotations are later compiled by the non-optimizing compiler in the VM. This is a big problem for what I want to do. Would it be possible to make the optimizing compiler compile everything including (nested) quotations? Is this a stupid idea?


bjourne commented on June 15, 2024

The optimizing compiler is written in Factor, so when you compile it you need the non-optimizing compiler to break the chicken-and-egg loop. But given the sequence you are able to produce, I really don't understand why you don't write:

your-seq [ 
    dup class-of { 
        { #push [ literal>> ] } { #call [ word>> ] } [ 2drop f ] 
    } case 
] map sift >quotation

The resulting quotation is optimizable by Factor.


nicolas-p commented on June 15, 2024

I tried to find a good example to show the difficulty of converting Skov code into a Factor quotation like you proposed.

I define a word called thing:

Calling it with the following parameters:

... gives the following result:

The two difficulties are:

1. Results in Skov can be used several times directly (like the n input above). If I wanted to generate a Factor quotation, I would need to automatically add stack shufflers or combinators.
2. Lexical scoping has to work. In the example above, n is an input of thing, but it is used inside the quotation given to map.

The example above works in the current implementation (which generates Factor code using the locals vocabulary). My dream would be to retain the same functionality without generating Factor code.

Also, the current implementation processes the thing word and the quotation given to map in the same way (an ugly way, but still the same way). If non-quoted code can be given to the compiler directly as a graph but quoted code needs special processing to be somehow turned into a sequence of words, there's no more symmetry.

Please tell me if I haven't been clear in my answer; it hasn't been easy to write.


nicolas-p commented on June 15, 2024

I have finally implemented direct compilation. Quotations are not yet supported but I have an idea for this. When the system encounters what I call a quoted input, it will define a separate word for the graph portion connected to that input. This word will be compiled in exactly the same way as the outer word and then put inside a quotation. This quotation will be linked to the quoted input.

There is a comment in the VM code that says that when the VM encounters a quotation with only one word inside, it will directly call the machine code for that word.

So by doing this, I will be able to compile everything directly, including quotations. There will be no need to generate Factor code at any point. And there will be no need to modify the VM.


nicolas-p commented on June 15, 2024

Quotations are now supported, including nested quotations.


mrjbq7 commented on June 15, 2024

Very cool!!!


nicolas-p commented on June 15, 2024

I'm reopening this issue because it doesn't work as expected.

Here's a simple example to show the problem. I'm pushing the literal { 1 2 3 4 } and calling first4 on it to get four outputs. Then I want to call 2array on some of these outputs and print the result.

If I just try to group 1 and 3 together into 2array I get the expected result:

f
{ 1 2 3 4 } 0 <#push> suffix
{ 0 } { 1 2 3 4 } \ first4 <#call> suffix
{ 1 3 } { 5 } \ 2array <#call> suffix
{ 5 } f \ . <#call> suffix
f <#return> suffix
{ } { } <effect>
[ define-temp ] with-compilation-unit
execute

{ 1 3 }

Now I still group 1 and 3 together but then I also group 2 and 4 together. The result is unexpected:

f
{ 1 2 3 4 } 0 <#push> suffix
{ 0 } { 1 2 3 4 } \ first4 <#call> suffix
{ 1 3 } { 5 } \ 2array <#call> suffix
{ 5 } f \ . <#call> suffix
{ 2 4 } { 6 } \ 2array <#call> suffix
{ 6 } f \ . <#call> suffix
f <#return> suffix
{ } { } <effect>
[ define-temp ] with-compilation-unit
execute

{ 3 4 }
{ 1 2 }

In the first example, input and output numbers are meaningful. first4 gives me four things and if I want to take only the first and the third, there's no problem.

In the second example, input and output numbers become meaningless. The first call to 2array takes two numbers off the stack, and the second call does the same. It is equivalent to this Factor code:
{ 1 2 3 4 } first4 2array . 2array .
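To make the difference concrete, here is a toy model in Python (not Factor, and not the real compiler). It runs the second example twice: once as plain stack code with the annotations ignored, and once routing values by their ids. The node dictionaries and the two interpreter functions are invented for this sketch.

```python
# Second example from above, with in_d/out_d mirroring the value annotations.
EXAMPLE = [
    {"op": "push", "literal": [1, 2, 3, 4], "in_d": [], "out_d": [0]},
    {"op": "first4", "in_d": [0], "out_d": [1, 2, 3, 4]},
    {"op": "2array", "in_d": [1, 3], "out_d": [5]},
    {"op": ".", "in_d": [5], "out_d": []},
    {"op": "2array", "in_d": [2, 4], "out_d": [6]},
    {"op": ".", "in_d": [6], "out_d": []},
]


def run_as_stack_code(nodes):
    """Ignore the annotations entirely and run the nodes as stack code."""
    stack, printed = [], []
    for node in nodes:
        op = node["op"]
        if op == "push":
            stack.append(node["literal"])
        elif op == "first4":
            seq = stack.pop()
            stack.extend(seq[:4])
        elif op == "2array":
            second = stack.pop()
            first = stack.pop()
            stack.append([first, second])
        elif op == ".":
            printed.append(stack.pop())
        else:
            raise ValueError(f"unknown op: {op}")
    return printed


def run_by_annotations(nodes):
    """Route values by their ids, as the annotations seem to promise."""
    values, printed = {}, []
    for node in nodes:
        args = [values[i] for i in node["in_d"]]
        op = node["op"]
        if op == "push":
            outs = [node["literal"]]
        elif op == "first4":
            outs = list(args[0][:4])
        elif op == "2array":
            outs = [[args[0], args[1]]]
        elif op == ".":
            printed.append(args[0])
            outs = []
        else:
            raise ValueError(f"unknown op: {op}")
        values.update(zip(node["out_d"], outs))
    return printed


print(run_as_stack_code(EXAMPLE))   # [[3, 4], [1, 2]]
print(run_by_annotations(EXAMPLE))  # [[1, 3], [2, 4]]
```

The first result matches what the compiled word actually printed; the second is what the annotations were meant to express.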

@mrjbq7, @bjourne, could you explain to me why there is this difference between the two examples? Is it because of an optimization phase?

By the way, the above examples won't run in standard Factor. If you want to test them, you'll have to redefine build-tree:

IN: compiler.tree.builder
: build-tree ( word/quot -- nodes )
    def>> ;


nicolas-p commented on June 15, 2024

I now understand why my approach didn't work. I found the answer in this article by Slava Pestov:

As before, the compiler IR is essentially stack code which is annotated with values, and with symbolic shuffles instead of shuffle words (for example, over is represented as something like a b -> a b a). A key property is that if we strip out all the value annotations, we again end up with stack code; [...] Even the code generator ignores the value annotations and only looks at the stack code; so the value annotations are only used by the optimizer itself, as it tracks the stack flow of values between words; and even the optimizer could dispense of them, at the cost of some algorithmic complexity.

I had expected the value annotations to be meaningful to the entire compiler, but it's not the case. In the two examples that I posted last week, the first one only works because of the dead-code elimination phase, which removes values 2 and 4 so that when 2array takes two numbers off the stack, it takes 1 and 3.
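A toy Python model (again not Factor) of why the first example happened to work. It assumes that dead-code elimination amounts to never leaving unused outputs on the stack; how Factor's pass actually achieves this is not covered here, and all names below are invented for the sketch.

```python
# First example: only values 1 and 3 are ever consumed.
FIRST_EXAMPLE = [
    {"op": "push", "literal": [1, 2, 3, 4], "in_d": [], "out_d": [0]},
    {"op": "first4", "in_d": [0], "out_d": [1, 2, 3, 4]},
    {"op": "2array", "in_d": [1, 3], "out_d": [5]},
    {"op": ".", "in_d": [5], "out_d": []},
]


def live_values(nodes):
    """Value ids that at least one node consumes through in_d."""
    return {value for node in nodes for value in node["in_d"]}


def run_stack(nodes, keep=None):
    """Run as plain stack code.

    If `keep` is given, outputs whose ids are not in it never reach the
    stack, which stands in for dead-code elimination in this model.
    """
    stack, printed = [], []
    for node in nodes:
        op = node["op"]
        if op == "push":
            outs = [node["literal"]]
        elif op == "first4":
            outs = list(stack.pop()[:4])
        elif op == "2array":
            second = stack.pop()
            first = stack.pop()
            outs = [[first, second]]
        elif op == ".":
            printed.append(stack.pop())
            outs = []
        else:
            raise ValueError(f"unknown op: {op}")
        for value_id, value in zip(node["out_d"], outs):
            if keep is None or value_id in keep:
                stack.append(value)
    return printed


print(live_values(FIRST_EXAMPLE))                            # ids 0, 1, 3, 5
print(run_stack(FIRST_EXAMPLE))                              # [[3, 4]]
print(run_stack(FIRST_EXAMPLE, keep=live_values(FIRST_EXAMPLE)))  # [[1, 3]]
```

In the second example every output of first4 is consumed, so nothing is dropped and the stack order alone decides what each 2array receives.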

So now I have two solutions:

1. I continue with this approach but I have to add the right #shuffle nodes in my tree to rearrange the stack between each #call
2. I go back to using the locals vocabulary, but I don't want to generate text strings anymore, so I'm thinking of generating quotations like this: [ { 1 2 3 4 } :> #0 #0 first4 :> ( #1 #2 #3 #4 ) #1 #3 2array :> #5 #2 #4 2array :> #6 #5 . #6 . ]
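The second solution maps quite mechanically from annotated nodes: fetch each input by name, emit the word, then bind the outputs. The Python sketch below only prints the token sequence for illustration (the aim stated above is to build a quotation object rather than text), and the node layout and function name are assumptions of this sketch.

```python
# Annotated nodes for the second example; "code" is the literal or word.
NODES = [
    {"code": "{ 1 2 3 4 }", "in_d": [], "out_d": [0]},
    {"code": "first4", "in_d": [0], "out_d": [1, 2, 3, 4]},
    {"code": "2array", "in_d": [1, 3], "out_d": [5]},
    {"code": ".", "in_d": [5], "out_d": []},
    {"code": "2array", "in_d": [2, 4], "out_d": [6]},
    {"code": ".", "in_d": [6], "out_d": []},
]


def to_locals_tokens(nodes):
    """Emit one locals-style binding per node, in node order."""
    tokens = []
    for node in nodes:
        # Fetch every input by its local name.
        tokens += [f"#{value}" for value in node["in_d"]]
        tokens.append(node["code"])
        # Bind the outputs: a single name, or a parenthesised group.
        outs = [f"#{value}" for value in node["out_d"]]
        if len(outs) == 1:
            tokens += [":>", outs[0]]
        elif len(outs) > 1:
            tokens += [":>", "(", *outs, ")"]
    return "[ " + " ".join(tokens) + " ]"


print(to_locals_tokens(NODES))
```

The output keeps each print right after its 2array, so the token order differs slightly from the hand-written quotation above, but each word still receives the values named in its annotations.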

The second solution is probably better and I already know it works.


nicolas-p commented on June 15, 2024

I went for option 2 and I found a beautiful way to implement it. I use <local>, <multi-def> and <lambda> to generate a lambda, and I call rewrite-closures to get stack code.

So I'm not hooked to the compiler directly (as I had hoped), but it's still a nice implementation. It is close to the original implementation but much cleaner because I don't generate text strings to be parsed.
