Comments (19)

juanjux avatar juanjux commented on July 29, 2024

The token of a function is its name, but the start and end positions are those of the function node itself, body and all, not those of the token. Node positions can therefore overlap: a node inside a function overlaps with the function. The same holds for all nodes. The current UAST specification doesn't give a position for tokens except when the token matches the length of the node (variable names, etc.), so that part is a wontfix.

I'll take a look at the decorators. The native AST gives them the same position as the functions, but maybe that can be fixed with the tokenizer.

from python-driver.

EgorBu avatar EgorBu commented on July 29, 2024

But at least the start position should be the same as in the code, don't you think?

juanjux avatar juanjux commented on July 29, 2024

No, because in def somefunc or class SomeClass the start position is the position of def or class, not of somefunc.

EgorBu avatar EgorBu commented on July 29, 2024

So it will be the position of def, class, self?
What about a.b? Is the position of b == the position of a?

juanjux avatar juanjux commented on July 29, 2024

The code a.b produces this native AST. As you can see the col_offsets are different (1 and 3):

{'ast_type': 'Module',
 'body': [{'ast_type': 'Expr',
           'col_offset': 1,
           'end_col_offset': 3,
           'end_lineno': 1,
           'lineno': 1,
           'value': {'ast_type': 'Attribute',
                     'attr': 'b',
                     'col_offset': 3,
                     'ctx': 'Load',
                     'end_col_offset': 3,
                     'end_lineno': 1,
                     'lineno': 1,
                     'value': {'ast_type': 'Name',
                               'col_offset': 1,
                               'ctx': 'Load',
                               'end_col_offset': 1,
                               'end_lineno': 1,
                               'id': 'a',
                               'lineno': 1}}}],
 'col_offset': 1,
 'end_col_offset': 3,
 'end_lineno': 1,
 'lineno': 1}
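
For comparison, here is a minimal sketch that prints the positions of the same nodes using only Python's standard `ast` module (the `end_*` attributes need Python 3.8 or later). This is not the driver's code: the standard module reports 0-based columns and starts the `Attribute` node at `a`, so its numbers differ from the dump above, which appears to be 1-based and already adjusted.

```python
import ast

source = "a.b"
tree = ast.parse(source)

positions = []
for node in ast.walk(tree):
    # Module and context nodes (Load) carry no position attributes.
    if hasattr(node, "col_offset"):
        positions.append(
            (type(node).__name__, node.lineno, node.col_offset,
             node.end_lineno, node.end_col_offset)
        )

# Each entry: (node type, line, start column, end line, end column)
for entry in positions:
    print(entry)
```

With the standard module the `Attribute` node spans the whole expression `a.b`, which is one reason a driver has to correct positions if it wants them to point at the `b` token.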

EgorBu avatar EgorBu commented on July 29, 2024

Can you explain line number, col number, col offset and end col offset? Do we have this somewhere in the bblfsh docs?

juanjux avatar juanjux commented on July 29, 2024

This is an example of the native AST; you don't have to worry about it. It is translated to StartPosition and EndPosition in the UAST (with some corrections).

EgorBu avatar EgorBu commented on July 29, 2024

for key, val in some_dict.items():
...

key & val have the same positions according to the image.
Is it because the start position is right after for?

EgorBu avatar EgorBu commented on July 29, 2024

In the Repo2nBOW class definition:

MODEL_CLASS = NBOW

and it was converted to

MODEL_CLANBOW0m = NBOW

The 0m comes from the color definition WHITE = "\033[0m", so it looks like the position is somehow shifted in this case.

EgorBu avatar EgorBu commented on July 29, 2024

By the way, what is the right way to determine the position of a token in the code?

juanjux avatar juanjux commented on July 29, 2024

Code:

class C():
    MODEL_CLANBOW = NBOW

This gives me a UAST with columns 5 and 17 for MODEL_CLANBOW which, if Vim is not lying to me, are correct.
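
One way to cross-check those columns without an editor is Python's standard `tokenize` module. The sketch below is only an illustration, not what the driver does; `token_columns` is a hypothetical helper that converts the tokenizer's 0-based, end-exclusive columns into 1-based, inclusive ones, which is the convention the columns above seem to follow.

```python
import io
import tokenize

source = "class C():\n    MODEL_CLANBOW = NBOW\n"


def token_columns(source, name):
    """Return (line, start_col, end_col) for the first NAME token equal to
    `name`, with 1-based inclusive columns, or None if it is not found."""
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and tok.string == name:
            line, start = tok.start  # start column is 0-based
            _, end = tok.end         # end column is 0-based and exclusive
            # 0-based exclusive end equals 1-based inclusive end.
            return line, start + 1, end
    return None


result = token_columns(source, "MODEL_CLANBOW")
print(result)
```

For this input the helper reports line 2 and columns 5 to 17, in agreement with the UAST.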

juanjux avatar juanjux commented on July 29, 2024

Code:

for key, val in some_dict.items(): pass

Gives me this (reduced and sorted) UAST:

                     {
                        "Token" : "key",
                        "StartPosition" : {
                           "Line" : 1,
                           "Col" : 5,
                           "Offset" : 4
                        },
                        "EndPosition" : {
                           "Line" : 1,
                           "Col" : 7,
                           "Offset" : 6
                        }
                     },
                     {
                        "Token" : "val",
                        "StartPosition" : {
                           "Line" : 1,
                           "Col" : 10,
                           "Offset" : 9
                        },
                        "EndPosition" : {
                           "Line" : 1,
                           "Col" : 12,
                           "Offset" : 11
                        }
                     }

Please check the UAST produced.
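
The relationship between Line, Col and Offset in that output can be reproduced with a small sketch. It assumes what the numbers above suggest: Line and Col are 1-based, Offset is 0-based, and EndPosition points at the last character of the token. The helper names are hypothetical, and offsets are counted in characters, so this only matches byte offsets for ASCII sources.

```python
def uast_style_position(source, offset):
    """Convert a 0-based character offset into a Line/Col/Offset dict."""
    line = source.count("\n", 0, offset) + 1
    line_start = source.rfind("\n", 0, offset) + 1  # 0 on the first line
    return {"Line": line, "Col": offset - line_start + 1, "Offset": offset}


def token_span(source, token):
    """Start and end positions of the first occurrence of `token`."""
    start = source.index(token)
    end = start + len(token) - 1  # inclusive end, as in the output above
    return uast_style_position(source, start), uast_style_position(source, end)


source = "for key, val in some_dict.items(): pass\n"
key_start, key_end = token_span(source, "key")
val_start, val_end = token_span(source, "val")
print(key_start, key_end)
print(val_start, val_end)
```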

EgorBu avatar EgorBu commented on July 29, 2024

So initial code:

class Repo2nBOW(Repo2Base):
    """
    Implements the step repository -> :class:`ast2vec.nbow.NBOW`.
    """
    MODEL_CLASS = NBOW

and after processing:

class Repo2nBOW(Repo2Base):!Wrong length of token 'Repo2nBOW' (expected 22 - 7 + 1 == 9)!!token 'Repo2nBOW' start & end line: 14, 18!
    """
    Implements the step repository -> :class:`ast2vec.nbow.NBOW`.
    """
    MODEL_CLANBOW0m = NBOW
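
The check behind that warning is not shown in this thread. Judging only from the message format, it compares the token length with the span between the start and end columns. A minimal sketch of such a check, assuming 1-based inclusive columns (`check_token_span` is a hypothetical name):

```python
def check_token_span(token, start_col, end_col):
    """Return a warning string if the column span does not match the token
    length, or None if it does. Columns are assumed 1-based and inclusive."""
    span = end_col - start_col + 1
    if span != len(token):
        return "Wrong length of token %r (expected %d - %d + 1 == %d)" % (
            token, end_col, start_col, len(token))
    return None


# A span that covers exactly the 9 characters of the name passes.
ok = check_token_span("Repo2nBOW", 7, 15)
# A span taken from a larger node is reported.
bad = check_token_span("Repo2nBOW", 7, 22)
print(ok)
print(bad)
```

A check like this fails whenever the reported positions belong to the whole node rather than to the token, which is the situation discussed in this issue.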

juanjux avatar juanjux commented on July 29, 2024

@EgorBu I don't know where that 0m comes from, but in my test the positions in the UAST are correct. Check it yourself directly, without the uast_playground, and tell me if you get different positions in the UAST.

juanjux avatar juanjux commented on July 29, 2024

I've checked that the decorators have correct line numbers. It's the FunctionDef parent object that starts on the previous line, but this is correct, at least with regard to the native Python AST: the decorators are part of the function AST, so for the language the function starts where the decorator starts (instead of at the def, where undecorated functions start).
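
A minimal sketch for inspecting this with the standard `ast` module follows. The result depends on the interpreter: as far as I know, interpreters before Python 3.8 report the decorator's line as the function's `lineno`, while 3.8 and later report the line of `def`. The decorator node itself is a child of the function in both cases.

```python
import ast

source = "@decorator\ndef somefunc():\n    pass\n"
func = ast.parse(source).body[0]

# The decorator expression is stored on the FunctionDef node itself.
decorator_line = func.decorator_list[0].lineno
function_line = func.lineno

print("decorator starts on line", decorator_line)
print("FunctionDef.lineno is", function_line)
```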

EgorBu avatar EgorBu commented on July 29, 2024

It would be really nice to mention this in the documentation, because these details are not obvious and can confuse a lot of people.

juanjux avatar juanjux commented on July 29, 2024

If you check the UAST (or the native AST), the decorator is a child of the function, and since this is an AST parser and not a tokenizer, it's logical that the start and end locations span from the first subnode to the last.

Still, a list of things to consider and caveats for each driver would be good to have (we already talked about this in the last meeting, but we still haven't had time for it).

juanjux avatar juanjux commented on July 29, 2024

Reopening since the documentation clearly says that the StartPosition of a node with a Token is the start position of the Token.

juanjux avatar juanjux commented on July 29, 2024

Should be fixed in v0.7.1 (update the client-python too), thanks for the report!
