Comments (7)

davesque commented on September 28, 2024

CC: @cburgdorf @ralexstokes

There's my brain dump as promised. Please let me know your thoughts on all this or if you need any clarifications.

from fe.

ralexstokes commented on September 28, 2024

@davesque i would preface w/ "keep it simple" to prioritize moving forward over finding the "best" solution... in this spirit i think the Spanned<T> type accomplishes your goals and keeps things moving.

if you wanted to consider it, i think your idea w/ the "separate data structure" makes sense; AST nodes will still need to be decorated w/ where in the source text they came from, but that could simply be indices into a source str contained in some kind of Context, e.g. in some pseudo-rust:

struct Context<'s> {
    ast: AST<'s>, // (probably) borrows from the source
    source: &'s str,
}

then on some error, you would essentially resolve all of this into some nice error data and project to standard out (or wherever you like your error messages) :)
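The Context idea above could be sketched roughly as follows. This is a minimal illustration, not code from fe: the Span fields and the line_col and render_error helpers are hypothetical names, and the ast field is left out to keep the example short. Spans hold only byte offsets, and the Context resolves them against the source when an error needs to be reported.

```rust
/// Byte offsets into some source string; the span does not know which one.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Span {
    start: usize,
    end: usize,
}

/// Hypothetical context that ties spans back to the source text.
struct Context<'s> {
    source: &'s str,
}

impl<'s> Context<'s> {
    /// Returns the 1-based (line, column) of a byte offset.
    /// Columns are counted in chars; the offset must lie on a char boundary.
    fn line_col(&self, offset: usize) -> (usize, usize) {
        let before = &self.source[..offset];
        let line = before.matches('\n').count() + 1;
        let line_start = before.rfind('\n').map_or(0, |i| i + 1);
        let col = before[line_start..].chars().count() + 1;
        (line, col)
    }

    /// Resolves a span into a human-readable error message.
    fn render_error(&self, span: Span, msg: &str) -> String {
        let (line, col) = self.line_col(span.start);
        let snippet = &self.source[span.start..span.end];
        format!("error at {}:{}: {} (near `{}`)", line, col, msg, snippet)
    }
}
```

Under these assumptions, an AST node would only carry a Span, and a call such as ctx.render_error(node_span, "unexpected token") would produce the final message at reporting time.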

davesque commented on September 28, 2024

Nice, thanks for taking a look! And I think you're right. I had envisioned that Span instances would only contain byte offsets without saying anything about what string is being offset into. As you say, that would be available in some other scope or data structure. I think I'll just move forward with that.

davesque commented on September 28, 2024

@ralexstokes Funny you mention that. I've actually been digging around quite a bit in the code for the rust compiler itself. That's where I got the idea for the Spanned<T> struct: https://github.com/rust-lang/rust/blob/91642e3ac0120c8e9cdd5f3c85ad03f3bf1b8b69/src/libsyntax_pos/source_map.rs#L43-L46. Also, strangely, it seems that the rust compiler just manually declares a span field on every AST-related type, as can be seen all throughout the compiler's syntax::ast module: https://github.com/rust-lang/rust/blob/91642e3ac0120c8e9cdd5f3c85ad03f3bf1b8b69/src/libsyntax/ast.rs. That was the first approach I took, which I had figured was a naive way of doing things. Anyway, it's been illuminating to look around and see how the rust compiler actually does things.
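For comparison, here is a rough sketch of the two approaches mentioned above: a generic Spanned<T> wrapper versus a span field declared manually on each AST type. The Name and NameWithSpan types and the map helper are made up for illustration, and the field names are assumptions rather than a copy of the rustc definitions.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Span {
    start: usize,
    end: usize,
}

/// Approach 1: a generic wrapper that pairs any node type with a span.
#[derive(Debug, Clone, PartialEq)]
struct Spanned<T> {
    node: T,
    span: Span,
}

impl<T> Spanned<T> {
    /// Transforms the wrapped node while keeping the same span.
    fn map<U>(self, f: impl FnOnce(T) -> U) -> Spanned<U> {
        Spanned {
            node: f(self.node),
            span: self.span,
        }
    }
}

/// A bare AST type (hypothetical) that knows nothing about spans.
#[derive(Debug, Clone, PartialEq)]
struct Name(String);

/// Approach 2: the span is declared by hand as a field on the AST type.
#[derive(Debug, Clone, PartialEq)]
struct NameWithSpan {
    value: String,
    span: Span,
}
```

The wrapper keeps the bare types reusable without spans, while the manual field avoids an extra layer of nesting when reading a node's contents.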

davesque commented on September 28, 2024

Yesterday, I had the idea of defining the output type of parsers generically with trait bounds. I thought this might allow me to say statically at a parser's call site what result type I want and, therefore, whether or not the result should include information about spans. I saved my (incomplete) work in this commit.

This works, although using it would be impractical. Not only would I have to define all AST types generically with that trait bound, I would also have to specify the desired tree of AST types at a parser's call site. It might look something like this:

// Spanned result
let result = file_input::<
    Spanned<Module<'a,
        Spanned<ModuleStmt<'a,
            Spanned<EventField<'a,
                ...,
            >>,
        >>,
    >>,
    SimpleError<_>,
>(input);

// Unspanned result
let result = file_input::<
    Module<'a,
        ModuleStmt<'a,
            EventField<'a,
                ...,
            >,
        >,
    >,
    SimpleError<_>,
>(input);

This works because every AST type has implementations of the From<(ASTType, Span)> trait. So parsers only say that they'll produce some result which can be converted from an instance of the expected AST type and a Span instance that says where it was parsed from. The From implementation for Spanned types takes the bare type and the span and constructs a Spanned instance. The implementation for un-spanned types just discards the span instance and returns the bare type.
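A stripped-down sketch of that mechanism, for a single leaf type, might look like the following. It is only an illustration under assumptions: Name and parse_name are hypothetical stand-ins, not the actual fe AST types or parsers, and the toy parser returns an Option instead of using the error type parameter shown above.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Span {
    start: usize,
    end: usize,
}

#[derive(Debug, Clone, PartialEq)]
struct Spanned<T> {
    node: T,
    span: Span,
}

/// Hypothetical bare AST type standing in for Module, ModuleStmt, etc.
#[derive(Debug, Clone, PartialEq)]
struct Name(String);

/// Spanned output: keep the span alongside the bare node.
impl<T> From<(T, Span)> for Spanned<T> {
    fn from((node, span): (T, Span)) -> Self {
        Spanned { node, span }
    }
}

/// Unspanned output: discard the span and return the bare node.
impl From<(Name, Span)> for Name {
    fn from((node, _span): (Name, Span)) -> Self {
        node
    }
}

/// Toy parser: reads leading ASCII alphanumeric chars as a name and
/// returns the converted output plus the remaining input.
/// The caller picks the output type, and with it whether spans are kept.
fn parse_name<O>(input: &str) -> Option<(O, &str)>
where
    O: From<(Name, Span)>,
{
    let end = input
        .find(|c: char| !c.is_ascii_alphanumeric())
        .unwrap_or(input.len());
    if end == 0 {
        return None;
    }
    let node = Name(input[..end].to_string());
    let span = Span { start: 0, end };
    Some((O::from((node, span)), &input[end..]))
}
```

With this, parse_name::<Spanned<Name>>(input) keeps the span and parse_name::<Name>(input) drops it. The sketch has only one level, so it does not show the nesting that makes the real call sites unwieldy.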

But that's not even a complete picture. Parsers would have to have a type parameter not just for the immediate output type but for each embedded type inside of the AST. So the call sites would look like above, but would also include extra concrete types for each embedded type. In short, it would be a complete nightmare to use.

Anyway, the idea's interesting but unusable.

cburgdorf commented on September 28, 2024

"I had envisioned that Span instances would only contain byte offsets without saying anything about what string is being offset into."

I think that's a good approach to handle it 👍

ralexstokes commented on September 28, 2024

also worth taking a look at your favorite rust compiler project(s) to see how they handle this problem
