nicolas-p commented on June 15, 2024

This is the current architecture:

[Screenshot: 2016-04-10 at 12:54:16]

This is what I am planning:

[Screenshot: 2016-04-10 at 12:53:43]

What is interesting is that there is now a path from Factor code to the user, which means that all currently existing Factor words could be displayed in the help browser as graphs and modified by the user. This would create the illusion that the entire system was implemented in Skov from the start.

from skov.

mrjbq7 commented on June 15, 2024

That's very cool! I hope it works!

The compiler has a few other tags in the tree that you might not care about, or that you would have to translate around. For example, when inlining takes place, this word:

: contents ( -- seq ) input-stream get stream-contents ; inline

actually becomes:

: contents ( -- seq ) input-stream 0 context-object assoc-stack stream-contents ;

because get inlines as CONTEXT-OBJ-NAMESTACK context-object assoc-stack

So the tree looks like:

V{
    T{ #push { literal input-stream } { out-d { 7869178 } } }
    T{ #push { literal 0 } { out-d { 7869179 } } }
    T{ #call
        { word context-object }
        { in-d V{ 7869179 } }
        { out-d { 7869180 } }
    }
    T{ #declare { declaration { { 7869180 vector } } } }
    T{ #call
        { word assoc-stack }
        { in-d V{ 7869178 7869180 } }
        { out-d { 7869181 } }
    }
    T{ #call
        { word stream-contents }
        { in-d V{ 7869181 } }
        { out-d { 7869182 } }
    }
    T{ #return { in-d V{ 7869182 } } }
}

You'll also find #phi, #recursive, #copy, #shuffle, and other tags that are used by the compiler, so it's a great idea, but it has some caveats!

nicolas-p commented on June 15, 2024

I looked a bit deeper and my fears are confirmed: it would be trivial to generate the graph for the example above, but when there are lambdas and inline words, I don't know what to do.

Ideally, for this code:

[Screenshot: 2016-04-10 at 21:10:07]

I would like to give this to the compiler:

V{
    T{ #push 
        { literal { 1 2 3 } } 
        { out-d { 0000001 } }
    }
    T{ #push 
        { lambda 
            V{ 
                T{ #introduce 
                    { out-d { 0000011 } } 
                }
                T{ #call
                    { word neg }
                    { in-d V{ 0000011 } }
                    { out-d { 0000012 } }
                }
                T{ #return 
                    { out-d { 0000012 } } 
                }
            }
        } 
        { out-d { 0000002 } }
    }
    T{ #call
        { word map }
        { in-d V{ 0000001 0000002 } }
        { out-d { 0000003 } }
    }
    T{ #call
        { word display }
        { in-d V{ 0000003 } }
        { out-d { } }
    }
    T{ #return { in-d V{ } } }
}

I don't know if there's any (simple) way to make the compiler understand this.

mrjbq7 commented on June 15, 2024

You might ping @bjourne for his thoughts on (ab)using the compiler tree this way. :-)

mrjbq7 commented on June 15, 2024

Plus he might not have seen Skov yet!

bjourne commented on June 15, 2024

I have seen it. :) But I haven't had time to look at it, so I haven't come up with any meaningful feedback. @nicolas-p, Factor doesn't have lambdas, it has quotations, and those are literals too. They are like regular arrays with some extra metadata. So you can just emit:

V{
    T{ #push { literal { 1 2 3 } } { out-d { 8590649 } } }
    T{ #push { literal [ 3 + ] } { out-d { 8590650 } } }
    T{ #call
        { word map }
        { in-d V{ 8590649 8590650 } }
        { out-d { 8590651 } }
    }
    T{ #return { in-d V{ 8590651 } } }
}

The compiler should be able to take that and make working code from it.

nicolas-p commented on June 15, 2024

In your example, I have circled the quotation part in blue:

[Screenshot: 2016-04-11 at 20:17:58]

The problem is that the quoted part of the graph is (obviously) represented in the same form as the rest of the graph. There is a "3" node with an output number, and an "add" node with an input number and an output number.

I can't just convert this to [ 3 add ]. On a simple example like this, it wouldn't be hard, but in the general case, I would need to be able to convert arbitrarily complex graphs to Factor code (not using locals). I can't do this, and it would go against what I'm trying to do, which is simply giving the original graph to the compiler.

An example of something more complex, with nested quotations:

[Screenshot: 2016-04-11 at 20:37:47]

bjourne commented on June 15, 2024

You'll probably need to support subtrees; otherwise, in your example, it is not clear why [ 3 add ] is the quotation. E.g., you want it to represent { 1 2 3 } [ 3 add ] map, but it could also mean { 1 2 3 } 3 add map.

nicolas-p commented on June 15, 2024

All the examples I've shown that use map, reject, change file lines etc. already run perfectly fine. The system is able to determine that reject expects a quotation as its second input, so it creates a lambda over the entire sub-tree that is connected to this input.

nicolas-p commented on June 15, 2024

I have tried to see the machine code generated for the simple example that we discussed above.

This is when Factor does things normally:
[Screenshot: 2016-04-16 at 23:10:07]
That's a lot of stuff because map is inlined.

Now I redefine build-tree to output this tree directly instead of doing any computation:

V{
    T{ #push { literal { 1 2 3 } } { out-d { 8590649 } } }
    T{ #push { literal [ 10 + ] } { out-d { 8590650 } } }
    T{ #call
        { word map }
        { in-d V{ 8590649 8590650 } }
        { out-d { 8590651 } }
    }
    T{ #return { in-d V{ 8590651 } } }
}

And this is the resulting machine code:
[Screenshot: 2016-04-16 at 23:07:19]
There's a lot less stuff because map is not inlined, am I right?
I have checked that this runs and gives the expected result.

So if I managed to generate this tree from Skov and give it directly to the compiler, bypassing build-tree, the code would run but words declared inline would never actually be inlined, and the performance would be degraded, am I still right?

So starting from the tree given above, is there a word that would transform it into another tree with map inlined? I have tried words like do-inlining or inline-word, but I think I didn't use them properly.

nicolas-p commented on June 15, 2024

The optimizing compiler doesn't process quotations, it just pushes them like other sequences. Quotations are later compiled by the non-optimizing compiler in the VM. This is a big problem for what I want to do. Would it be possible to make the optimizing compiler compile everything including (nested) quotations? Is this a stupid idea?

bjourne commented on June 15, 2024

The optimizing compiler is written in Factor, so when you compile it you need the non-optimizing compiler to break the chicken-and-egg loop. But given the sequence you are able to produce, I really don't understand why you don't write:

your-seq [ 
    dup class-of { 
        { #push [ literal>> ] } { #call [ word>> ] } [ 2drop f ] 
    } case 
] map sift >quotation

The resulting quotation can be optimized by Factor.

nicolas-p commented on June 15, 2024

I tried to find a good example to show the difficulty of converting Skov code into a Factor quotation like you proposed.

I define a word called thing:

[Screenshot: 2016-05-11 at 21:51:12]

Calling it with the following parameters:

[Screenshot: 2016-05-11 at 21:51:29]

... gives the following result:

[Screenshot: 2016-05-11 at 21:51:42]

The two difficulties are:

  • Results in Skov can be used several times directly (like the n input above). If I wanted to generate a Factor quotation, I would need to automatically add stack shufflers or combinators.
  • Lexical scoping has to work. In the example above, n is an input of thing, but it is used inside the quotation given to map.

The example above works in the current implementation (which generates Factor code using the locals vocabulary). My dream would be to retain the same functionality without generating Factor code.

Also, the current implementation processes the thing word and the quotation given to map in the same way (an ugly way, but still the same way). If non-quoted code can be given to the compiler directly as a graph but quoted code needs special processing to be somehow turned into a sequence of words, there's no more symmetry.

Please tell me if I haven't been clear in my answer, it hasn't been easy to write.

nicolas-p commented on June 15, 2024

I have finally implemented direct compilation. Quotations are not yet supported but I have an idea for this. When the system encounters what I call a quoted input, it will define a separate word for the graph portion connected to that input. This word will be compiled in exactly the same way as the outer word and then put inside a quotation. This quotation will be linked to the quoted input.

There is a comment in the VM code that says that when the VM encounters a quotation with only one word inside, it will directly call the machine code for that word.

So by doing this, I will be able to compile everything directly, including quotations. There will be no need to generate Factor code at any point. And there will be no need to modify the VM.
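
The plan can be sketched in plain Factor. This is only an illustration, not Skov's actual output: `negate-item` and `negate-all` are hypothetical names, with `negate-item` standing in for the graph portion connected to a quoted input. The helper is an ordinary compiled word, and the quotation handed to `map` contains nothing but a call to it.

```factor
USING: kernel math prettyprint sequences ;
IN: scratchpad

! Hypothetical stand-in for the sub-graph connected to a quoted input.
! It is defined and compiled like any other word.
: negate-item ( x -- y ) neg ;

! The quotation given to map holds only that single word.
: negate-all ( seq -- newseq ) [ negate-item ] map ;

{ 1 2 3 } negate-all .
! prints { -1 -2 -3 }
```

The sketch does not cover lexical scoping: a value from the outer word that is used inside the quoted part would still have to reach the helper word somehow.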

nicolas-p commented on June 15, 2024

Quotations are now supported, including nested quotations.

mrjbq7 commented on June 15, 2024

Very cool!!!

nicolas-p commented on June 15, 2024

I'm reopening this issue because it doesn't work as expected.

Here's a simple example to show the problem. I'm pushing the literal { 1 2 3 4 } and calling first4 on it to get four outputs. Then I want to call 2array on some of these outputs and print the result.

If I just try to group 1 and 3 together into 2array I get the expected result:

f
{ 1 2 3 4 } 0 <#push> suffix
{ 0 } { 1 2 3 4 } \ first4 <#call> suffix
{ 1 3 } { 5 } \ 2array <#call> suffix
{ 5 } f \ . <#call> suffix
f <#return> suffix
{ } { } <effect>
[ define-temp ] with-compilation-unit
execute
{ 1 3 }

Now I still group 1 and 3 together but then I also group 2 and 4 together. The result is unexpected:

f
{ 1 2 3 4 } 0 <#push> suffix
{ 0 } { 1 2 3 4 } \ first4 <#call> suffix
{ 1 3 } { 5 } \ 2array <#call> suffix
{ 5 } f \ . <#call> suffix
{ 2 4 } { 6 } \ 2array <#call> suffix
{ 6 } f \ . <#call> suffix
f <#return> suffix
{ } { } <effect>
[ define-temp ] with-compilation-unit
execute
{ 3 4 }
{ 1 2 }

In the first example, input and output numbers are meaningful. first4 gives me four things and if I want to take only the first and the third, there's no problem.

In the second example, input and output numbers become meaningless. The first call to 2array takes two numbers off the stack, and the second call does the same. It is equivalent to this Factor code:
{ 1 2 3 4 } first4 2array . 2array .

@mrjbq7, @bjourne, could you explain to me why there is this difference between the two examples? Is it because of an optimization phase?

By the way, the above examples won't run in standard Factor. If you want to test them, you'll have to redefine build-tree:

IN: compiler.tree.builder
: build-tree ( word/quot -- nodes )
    def>> ;

nicolas-p commented on June 15, 2024

I now understand why my approach didn't work. I found the answer in this article by Slava Pestov:

As before, the compiler IR is essentially stack code which is annotated with values, and with symbolic shuffles instead of shuffle words (for example, over is represented as something like a b -> a b a). A key property is that if we strip out all the value annotations, we again end up with stack code; [...] Even the code generator ignores the value annotations and only looks at the stack code; so the value annotations are only used by the optimizer itself, as it tracks the stack flow of values between words; and even the optimizer could dispense of them, at the cost of some algorithmic complexity.

I had expected the value annotations to be meaningful to the entire compiler, but it's not the case. In the two examples that I posted last week, the first one only works because of the dead-code elimination phase, which removes values 2 and 4 so that when 2array takes two numbers off the stack, it takes 1 and 3.
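
To make this concrete, here is what remains of the second tree once the in-d/out-d numbers are ignored, written as ordinary Factor. The word name `second-example-stripped` is made up for the illustration; the body is the stack code already quoted in the previous comment.

```factor
USING: arrays prettyprint sequences ;
IN: scratchpad

! The second tree with its value annotations stripped.
! Each 2array takes the two topmost stack items,
! whatever the annotations said it should take.
: second-example-stripped ( -- )
    { 1 2 3 4 } first4 2array . 2array . ;

second-example-stripped
! prints:
! { 3 4 }
! { 1 2 }
```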

So now I have two solutions:

  • I continue with this approach, but I have to add the right #shuffle nodes to my tree to rearrange the stack between each #call.
  • I go back to using the locals vocabulary, but I don't want to generate text strings anymore, so I'm thinking of generating quotations like this: [ { 1 2 3 4 } :> #0 #0 first4 :> ( #1 #2 #3 #4 ) #1 #3 2array :> #5 #2 #4 2array :> #6 #5 . #6 . ]

The second solution is probably better and I already know it works.
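
For comparison, the two options can be written by hand as ordinary Factor words. This is a sketch under assumptions, not generated output: the word names are made up, the values are returned instead of printed, and it assumes a Factor version where `::` and `:>` live in the `locals` vocabulary.

```factor
USING: arrays kernel locals prettyprint sequences ;
IN: scratchpad

! Option 1: rearrange the stack explicitly between the calls.
: pairs-with-shuffles ( seq -- pair1 pair2 )
    first4           ! a b c d
    swapd            ! a c b d
    [ 2array ] 2dip  ! { a c } b d
    2array ;         ! { a c } { b d }

! Option 2: name every value and let locals produce the stack code.
:: pairs-with-locals ( seq -- pair1 pair2 )
    seq first4 :> ( a b c d )
    a c 2array
    b d 2array ;

{ 1 2 3 4 } pairs-with-locals [ . ] bi@
! prints:
! { 1 3 }
! { 2 4 }
```

With locals, every value keeps its name, which is close to how a graph refers to its outputs by number; with shuffles, the right sequence of stack words has to be worked out for every pair of consecutive calls.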

nicolas-p commented on June 15, 2024

I went for option 2 and I found a beautiful way to implement it. I use <local>, <multi-def> and <lambda> to generate a lambda, and I call rewrite-closures to get stack code.

So I'm not hooked to the compiler directly (as I had hoped), but it's still a nice implementation. It is close to the original implementation but much cleaner because I don't generate text strings to be parsed.
