Comments (6)
Hey, @nischalshrestha brought up a quirk with the AST that reminded me of this! Apparently, I made an almost-working prototype 2 years ago, so I made some small changes and pushed it to GitHub. No worries if it's not an active area of interest anymore, but I wanted to describe some possible directions!
tl;dr
Do you have any thoughts on what the resulting data structure would be?
What do you think about something similar to the srcref approach, where an attribute is attached to each entry in the expression object (that I've been calling the AST)? Say, an int vector with 5 entries: the corresponding getParseData data.frame row number, line start/end, and column start/end?
As I understand it, the result of parse() already contains a srcref that gives line and column information, but only for entire statements (e.g. 1 + 1; 2 + 2 would have 2 entries for srcref).
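To make the statement-level granularity concrete, here is a minimal sketch using only base R. It assumes `keep.source = TRUE` is passed explicitly, since the default depends on the session options.

```r
# Parse two statements that share one line, keeping source references.
exprs <- parse(text = "1 + 1; 2 + 2", keep.source = TRUE)

# The "srcref" attribute holds one entry per top-level statement,
# not one per node inside each statement.
refs <- attr(exprs, "srcref")
n_refs <- length(refs)

# Each srcref can be turned back into the source text it covers.
second_stmt <- as.character(refs[[2]])

print(n_refs)
print(second_stmt)
```

Nothing inside `1 + 1` (the `+` symbol or either constant) gets its own entry, which is the gap described above.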
It seems like a big advantage here is that tools operating over the AST could identify where they are in the source code (e.g. if they hit an error, or for feedback). For example, identifying all assignments to a certain variable name, or function calls. Right now AFAICT they would need to operate on the parse tree from getParseData(), but I could be totally wrong!
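For comparison, this is roughly what locating function calls looks like today when working directly on the parse data. It is a sketch only; the two-line source string and the choice of `mean` are made up for illustration.

```r
# Illustrative source text (an assumption, not taken from any real script).
src <- "x <- mean(1:3)\ny <- sum(x, mean(4:6))"
pd <- getParseData(parse(text = src, keep.source = TRUE))

# Function names in call position have the token SYMBOL_FUNCTION_CALL.
is_mean_call <- pd$token == "SYMBOL_FUNCTION_CALL" & pd$text == "mean"
calls <- pd[is_mean_call, c("line1", "col1", "line2", "col2", "text")]

print(calls)
```

This gives source positions, but the rows are parse tree nodes rather than AST nodes, so connecting a hit back to the corresponding call object still takes extra work.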
background
https://github.com/machow/straw
(parse tree is on the left, AST on the right)
The implementation is pretty rough, but returns something similar to getParseData right now...
library(straw)
enhance_ast("1 + 1")
# A tibble: 5 x 12
  node     id parent text  token row_num children  line1  col1 line2  col2 parse_row_num
  <lis> <dbl>  <dbl> <chr> <lgl>   <int> <list>    <int> <int> <int> <int>         <int>
1 <exp…     1     NA ""    NA          1 <int [1…     NA    NA    NA    NA             1
2 <lan…     2      1 ""    NA          2 <int [3…      1     1     1     5             2
3 <sym>     3      2 "+"   NA          3 <int [0…      1     3     1     3             5
4 <dbl…     4      2 "1"   NA          4 <int [0…      1     1     1     1             4
5 <dbl…     5      2 "1"   NA          5 <int [0…      1     5     1     5             7
# note: need to prepend a row to the top of this data.frame
# to represent the AST's top-level "expr" node.
# as is, parse_row_num - 1 from above would match this
getParseData(parse(text = "1 + 1", keep.source = TRUE))
  line1 col1 line2 col2 id parent     token terminal text
7     1    1     1    5  7      0      expr    FALSE
1     1    1     1    1  1      2 NUM_CONST     TRUE    1
2     1    1     1    1  2      7      expr    FALSE
3     1    3     1    3  3      7       '+'     TRUE    +
4     1    5     1    5  4      5 NUM_CONST     TRUE    1
5     1    5     1    5  5      7      expr    FALSE
from lobstr.
This would be really cool! Do you have any thoughts on what the resulting data structure would be?
from lobstr.
I'm still very much interested in this 😄 The problem with using attributes to map the AST to the parse data is that you can't put attributes on symbols, so I think that makes that approach basically impossible.
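A small sketch of the limitation, using base R only. The attribute name `loc` and the four-integer position vector are hypothetical, chosen just for this illustration.

```r
# Calls are ordinary language objects and accept attributes.
call_node <- quote(1 + 1)
attr(call_node, "loc") <- c(1L, 1L, 1L, 5L)
call_loc <- attr(call_node, "loc")

# Symbols do not: the assignment is expected to fail, so wrap it in
# tryCatch() and keep the error message instead of stopping.
sym_node <- quote(x)
sym_result <- tryCatch(
  {
    attr(sym_node, "loc") <- c(1L, 1L, 1L, 1L)
    "ok"
  },
  error = function(e) conditionMessage(e)
)

print(call_loc)
print(sym_result)
```

Because every variable name and function name in the AST is a symbol, a large share of nodes could never carry the position attribute.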
I wonder if instead it would make sense to just return the data frame and then provide some helper function for recursing over it like you normally would with the AST?
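One possible shape for such a helper, sketched against the plain `getParseData()` output rather than any lobstr or straw API. The name `walk_parse_data()` and its `visit` callback are hypothetical.

```r
# Hypothetical helper: depth-first walk over parse data, following the
# id/parent columns. Top-level nodes have parent == 0.
walk_parse_data <- function(pd, visit, id = 0L, depth = 0L) {
  kids <- pd[pd$parent == id, , drop = FALSE]
  for (i in seq_len(nrow(kids))) {
    visit(kids[i, ], depth)
    walk_parse_data(pd, visit, id = kids$id[i], depth = depth + 1L)
  }
  invisible(NULL)
}

pd <- getParseData(parse(text = "1 + 1", keep.source = TRUE))

# Example visitor: record an indented outline of token types.
outline <- character()
walk_parse_data(pd, function(row, depth) {
  outline <<- c(outline, paste0(strrep("  ", depth), row$token))
})

print(outline)
```

This sketch ignores comment rows, which `getParseData()` can report with a negative parent, and it makes no attempt to collapse the parse tree's extra `expr` wrappers into AST nodes.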
from lobstr.
Ah, shoot--that makes sense!
Part of my interest in picking this back up is looking at how the recursion is usually done in R. I have an implementation of a tree visitor as an R6 class that I can toss into a gist, but have been curious about using something like case statements to define visitors in a more functional way (based on this post).
Are there good places to look in different libraries to see how recursion / visiting is often done? For example, I know replace_expr() in dbplyr does simple visiting. Really curious what some of the more complex situations out there look like (or whether, most of the time, simple recursion is all that's needed).
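As a rough idea of what a case-statement style visitor could look like in base R, here is a sketch that dispatches on node type with `switch()`. It is not the approach from the linked post or from the R6 prototype; `node_type()` and `count_symbols()` are hypothetical names.

```r
# Classify an AST node into one of a few coarse types.
node_type <- function(x) {
  if (is.call(x)) {
    "call"
  } else if (is.symbol(x)) {
    "symbol"
  } else if (is.atomic(x)) {
    "constant"
  } else {
    stop("unhandled node type: ", typeof(x))
  }
}

# One switch() arm per node type; the "call" arm recurses into children.
# Function names such as `+` are symbols too, so they are counted.
count_symbols <- function(x) {
  switch(node_type(x),
    constant = 0L,
    symbol = 1L,
    call = sum(vapply(as.list(x), count_symbols, integer(1)))
  )
}

n_symbols <- count_symbols(quote(a + b * 2))
print(n_symbols)
```

Each new analysis is then another function with the same three arms, which keeps the traversal logic in one visible place.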
from lobstr.
@machow I usually just whip up a recursive tree-walker by hand, tailored for the specific situation, so I don't have much sense for what you want in a generic implementation.
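For readers unfamiliar with the pattern, a hand-written walker of this kind might look like the sketch below, which collects the names assigned with `<-`. The function name `find_assigned()` is made up, and the sketch deliberately skips edge cases such as empty arguments (as in `x[1, ]`) and assignment via `=` or `assign()`.

```r
# Recursively collect the names on the left-hand side of `<-` calls.
find_assigned <- function(x) {
  if (!is.call(x)) {
    return(character())
  }
  own <- character()
  if (identical(x[[1]], quote(`<-`)) && is.symbol(x[[2]])) {
    own <- as.character(x[[2]])
  }
  # Recurse into the arguments of the call (everything after the head).
  c(own, unlist(lapply(as.list(x)[-1], find_assigned)))
}

assigned <- find_assigned(quote({
  a <- 1
  b <- a + 1
  f(c <- 2)
}))
print(assigned)
```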
from lobstr.
Okay--thanks, helpful to hear. Will try out a simple helper function, since it seems like that approach is working for people.
from lobstr.