Hello there, I recently noticed this new parser project. It is q

Potential usage in TruffleRuby about natalie_parser HOT 4 CLOSED

eregon commented on June 28, 2024

Potential usage in TruffleRuby

from natalie_parser.

Comments (4)

kddnewton commented on June 28, 2024

Hey @eregon - I've got a proposal document that's coming to you next week on this actually 😅.

That being said, you should 100% investigate this project as a potential solution, as it's brilliant.

seven1m commented on June 28, 2024

@eregon Thanks for the interest! I should say that my experience writing parsers prior to this was pretty close to nil, and I know there are still many bugs in this project, several of which I am finding and fixing almost daily as I work on my own compiler project.

So with that bit of warning out of the way, if you do feel it would be useful for TruffleRuby, then I will be happy to help however I can! And yes, I do intend to make it feature-complete for Ruby 3.0 syntax, and eventually 3.1 and beyond.

On the other hand, I have a feeling that @kddnewton's parser will kick ass, and I hope he can do an even better job than I have -- surely he can at least learn from my mistakes. 😄 (I'm happy to elaborate on those mistakes in a longer thread if anyone cares to hear them.)

eregon commented on June 28, 2024

> I'm happy to elaborate on those mistakes in a longer thread if anyone cares to hear them

I think that would be quite interesting :) There are many considerations for a Ruby parser.
I'll try to prototype something that uses this parser and see how it feels.

I am curious, how long would you estimate it took to get to the current state of this parser?

Ideally CRuby would use a parser that other parser users can also use; that would be the best guarantee of compatibility and of staying up-to-date. I think @kddnewton's parser has the potential for that (being in C), but it is unknown whether CRuby would be open to changing its parser.

seven1m commented on June 28, 2024

> > I'm happy to elaborate on those mistakes in a longer thread if anyone cares to hear them
>
> I think that would be quite interesting

- Using SharedPtr everywhere made development easy, but I suspect it hurts performance, since it is a reference-counted pointer. I wish I could have used OwnedPtr (or unique_ptr in C++ parlance), but cleaning up after syntax errors made that too hard for me. A smarter C++ programmer probably could have made OwnedPtr work.
- The code is very procedural. Knowing what I know now, I might have opted to identify and extract more objects as I went -- I just didn't realize how many nested if-statements and how much spaghetti code would result. (But really, who ever does?!)
- I'm not sure if it was a "mistake" or not (as it was totally intended and I don't want to change it), but I made my own data structures in the TM namespace rather than use the C++ STL data structures. This might be a big mistake, as there is potential for memory errors there (though I make heavy use of AddressSanitizer to hopefully catch those). It's just that I have a special requirement for Natalie that compilation be as fast as possible -- using the STL slows things down quite a bit with gcc/clang. Also, I have a foggy notion that Natalie and NatalieParser could run on embedded hardware someday, so less reliance on the C++ stdlib is better in that scenario.
- Targeting RubyParser output rather than whitequark/parser might have been a mistake, seeing as RubyParser seems to be less popular in the Ruby community. I have found quite a few confusing and/or buggy edge cases while comparing NatalieParser to RubyParser, and the pattern matching support in RubyParser is incomplete. I don't remember why I chose RubyParser back when I started a couple of years ago, but I think it was mostly out of ignorance. This mistake is probably very fixable -- we can change the node output to match whitequark/parser instead, but it's not a trivial amount of work.
- Other mistakes I don't even know about yet! 😆
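
For readers unfamiliar with the SharedPtr versus OwnedPtr trade-off mentioned above, here is a minimal sketch of the general idea. It uses the standard library's `std::shared_ptr` and `std::unique_ptr` as stand-ins for the project's own pointer types, and the `Node` struct, the `live_nodes` counter, `parse_call`, and the `?` error token are all hypothetical names invented for illustration. It is not NatalieParser's code and does not claim to solve the error-cleanup problem described in the thread; it only shows how single ownership can free a partially built tree on an early return in a toy case.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical counter of live nodes, so that leaks would be observable.
int live_nodes = 0;

// Hypothetical AST node; every child has exactly one owner (its parent).
struct Node {
    std::string name;
    std::vector<std::unique_ptr<Node>> children;

    explicit Node(std::string n) : name(std::move(n)) { ++live_nodes; }
    ~Node() { --live_nodes; }
};

// Toy "parser": builds a call node with one child per token.
// The token "?" stands in for a syntax error.
std::unique_ptr<Node> parse_call(const std::vector<std::string>& tokens) {
    auto call = std::make_unique<Node>("call");
    for (const auto& tok : tokens) {
        if (tok == "?") {
            // Early return on error: `call` goes out of scope here and
            // destroys itself together with every child built so far.
            return nullptr;
        }
        call->children.push_back(std::make_unique<Node>(tok));
    }
    return call;  // ownership moves to the caller
}

// For comparison: copying a shared_ptr adds a second owner and bumps
// the reference count, which is the bookkeeping cost the thread refers to.
long shared_copies_demo() {
    auto node = std::make_shared<Node>("shared");
    auto alias = node;        // second owner of the same node
    return node.use_count();  // number of owners at this point
}
```

Calling `parse_call({"a", "?", "b"})` returns a null pointer and leaves `live_nodes` at zero, because the partially built tree is released when the function returns. A real parser has many more exit paths and shared references between nodes, which is likely where the difficulty described above comes from.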

> I am curious, how long would you estimate it took to get to the current state of this parser?

That's a hard one. But I'll take a stab at an estimate:

→ cd natalie
→ git log --oneline | egrep "lex|parse" | wc -l
47
→ cd natalie_parser
→ git log --oneline | wc -l
316

So that's 363 commits, and assuming each commit took me between 30 minutes and one hour, I'd estimate it to be somewhere between 180 and 360 hours of work. I feel like the actual number is probably on the upper end of that estimate, because I know that from January 2022 to now, I've worked on it almost every day for 1-2 hours, so 360 seems about right.

But that being said, I am a complete amateur and I had to rework things many times. Someone more experienced could almost certainly do this sort of thing faster, esp. if they were doing it as a day job instead of hurriedly before work or in their spare time. 😉

> Ideally CRuby would use a parser that other parser users can also use, that would be the best guarantee of compatibility and staying up-to-date.

😍
