bblfsh / documentation

Babelfish documentation (GitBook)

Home Page: https://docs.sourced.tech/babelfish

License: Creative Commons Attribution Share Alike 4.0 International

Makefile 1.77% Go 98.23%
babelfish documentation gitbook

documentation's People

Contributors

abeaumont, bzz, campoy, carlosms, ceh, creachadair, dennwc, dpordomingo, eflanagan0, efx, eiso, jbeardly, jonjonsonjr, juanjux, kuba--, lwsanty, marnovo, mcarmonaa, mcuadros, ncordon, smacker, smola, tsolakoua, vmarkovtsev, zurk

documentation's Issues

Text improvement on clients documentation

In the clients guide:

The client API's differ to adapt to their language specific idioms, the following [codes] [shows] several simple examples with the Go, Python and Scala clients that [parsers] a file and [applies] a filter to return all the simple identifiers.

  • I don't know whether the plural "codes" is correct in English. Maybe it could be replaced with "code snippets", for example.
  • "shows" should also be "show"
  • parsers -> parse
  • applies -> apply

As I'm not a native English speaker, please double-check anything non-trivial I mention, in case I am wrong.

Rename "Language Clients" section

From @ajnavarro yesterday (more):

  • Next documentation point: Language clients
    • a little bit strange name for bblfsh clients

It is true that "Language Clients" might be confusing; it could be just "Clients".

What do you think?

Add the info how to change the server's log level

By default, the bblfsh server spams debug logs. When I extract hundreds of files, my terminal explodes. I suggest adding a note (to the FAQ?) on how to set the log level to info, error, etc., and on which log levels are supported at all.

Translation

Hello! I make a living by translating articles, lectures and other documents from English to Turkish and from Turkish to English. I would like to help translate your project into Turkish. If you download the License document and give me permission to do so, I would be glad to help. Thank you in advance.

Stateless vs Non-stateless execution

Previous conversation at https://github.com/src-d/devrel/issues/78.

The separation between stateless and non-stateless (or stateful?) execution does not seem very clear in the documentation.

I had not understood that everything after the sentence "On macOS, you first need to create etc." refers to non-stateless mode. In my opinion, this unclear separation is a bit confusing. Maybe a sentence could be added, saying something like:

  • For non-stateless execution you can do the following:
    • For macOS...
    • For Linux...

Change order of "UAST Querying" and "Language Clients" sections

From @ajnavarro yesterday (more):

  • Next documentation point: UAST querying
    • I don't need that at that point, I need a way to connect to bblfsh programmatically

"Language Clients" section could go before "UAST Querying", since getting a client for your language of choice and connecting to Babelfish will usually come before querying any UAST.

UAST spec: fixed values for roles

I think we should keep binary compatibility for roles, i.e., even if the list of roles changes in the future, the numeric value associated with each role will not.

I propose to change:

const (
    // Invalid Role is assigned as a zero value since protobuf enum definition must start at 0.
    Invalid Role = iota
    SimpleIdentifier
    QualifiedIdentifier
    BinaryExpression
    BinaryExpressionLeft
    BinaryExpressionRight
    BinaryExpressionOp
    // ...
)

to:

const (
    // Invalid Role is assigned as a zero value since protobuf enum definition must start at 0.
    Invalid Role = 0
    SimpleIdentifier Role = 1
    QualifiedIdentifier Role = 2
    BinaryExpression Role = 3
    BinaryExpressionLeft Role = 4
    BinaryExpressionRight Role = 5
    BinaryExpressionOp Role = 6
    // ...
)

https://godoc.org/github.com/bblfsh/sdk/uast#Role

This is especially important since we will need to provide these roles in the C++ library too.
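
To illustrate why position-derived values are fragile, here is a minimal Python sketch (not the SDK's Go code). The enum classes and the NEW_ROLE member are hypothetical and exist only for this illustration; it assumes a new role gets inserted in the middle of the list at some point.

```python
from enum import IntEnum, auto


class RolesImplicit(IntEnum):
    """Values derived from position in the list."""
    INVALID = 0
    SIMPLE_IDENTIFIER = auto()
    QUALIFIED_IDENTIFIER = auto()
    BINARY_EXPRESSION = auto()


class RolesImplicitV2(IntEnum):
    """Same list after a hypothetical role is inserted in the middle."""
    INVALID = 0
    SIMPLE_IDENTIFIER = auto()
    NEW_ROLE = auto()  # hypothetical addition
    QUALIFIED_IDENTIFIER = auto()
    BINARY_EXPRESSION = auto()


class RolesExplicit(IntEnum):
    """Values pinned explicitly, as proposed in this issue."""
    INVALID = 0
    SIMPLE_IDENTIFIER = 1
    QUALIFIED_IDENTIFIER = 2
    BINARY_EXPRESSION = 3


class RolesExplicitV2(IntEnum):
    """Adding a hypothetical role later does not disturb existing values."""
    INVALID = 0
    SIMPLE_IDENTIFIER = 1
    QUALIFIED_IDENTIFIER = 2
    BINARY_EXPRESSION = 3
    NEW_ROLE = 4  # hypothetical addition


def shifted(old, new):
    """Names whose numeric value differs between two versions of an enum."""
    return sorted(m.name for m in old if new[m.name].value != m.value)
```

With these definitions, shifted(RolesImplicit, RolesImplicitV2) reports the members that follow the insertion point, while shifted(RolesExplicit, RolesExplicitV2) is empty.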

Add clarifications about the provided positions

Change:

Native parsers MUST provide, at least, offset or line+col for positions

to:

Native parsers SHOULD provide, at least, offset or line+col for positions when the native parser gives any positional information for the node

And:

Nodes with defined token SHOULD have (...) when the native parser provides it.

https://doc.bblf.sh/uast/uast_v2.md and https://doc.bblf.sh/uast/representation_v2.md are not rendered correctly.

Just check the links https://doc.bblf.sh/uast/uast_v2.md and https://doc.bblf.sh/uast/representation_v2.md and you will see that they are not rendered correctly: they show up as plain text files. Also, links to these pages do not open. For example, the first link on https://doc.bblf.sh/uast/uast-specification.html ("See UASTv2 for the new version.") is broken, as is the link on https://doc.bblf.sh/using-babelfish/advanced-usage.html in the sentence "2) Use the latest Go client, set.Mode(Semantic) and change XPath to use the new Semantic UAST types."

I think it is all the same problem, so I created only one issue.

Document how to override driver images in server

Once bblfsh/bblfshd#47 is merged and published, the documentation should be updated to reflect how to override driver images, something like:

export BBLFSH_DRIVER_IMAGES="python=docker-daemon:bblfsh/python-driver:dev-96b24d3;java=docker-daemon:bblfsh/java-driver:latest"
docker run -e BBLFSH_DRIVER_IMAGES --privileged -p 9432:9432 --name bblfsh bblfsh/server  bblfsh server
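
The value above appears to be a semicolon-separated list of language=image pairs. As a rough Python sketch of that format only (an assumption read off the example; parse_driver_images is a hypothetical helper and says nothing about how bblfshd itself parses the variable):

```python
def parse_driver_images(value):
    """Split 'lang=image;lang=image' into a {language: image} dict.

    The format is inferred from the example in this issue; it is an
    assumption, not the server's actual parsing logic.
    """
    overrides = {}
    for entry in value.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing ';'
        # Split on the first '=' only, since image references contain ':'.
        language, sep, image = entry.partition("=")
        if not sep or not language or not image:
            raise ValueError(f"malformed entry: {entry!r}")
        overrides[language] = image
    return overrides
```

Such a helper could be used to sanity-check the variable before passing it to docker run.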

macOS: quickstart instructions fail to start bblfshd

The default getting started instructions use a stateful bblfshd server, which requires mounting /var/lib/bblfshd on the host filesystem.

The current instructions do not work on macOS, which could confuse new users:

docker start bblfshd

Error response from daemon: Mounts denied:
The path /var/lib/bblfshd
is not shared from OS X and is not known to Docker.
You can configure shared paths from Docker -> Preferences... -> File Sharing.
See https://docs.docker.com/docker-for-mac/osxfs/#namespaces for more info.

Error: failed to start containers: bblfshd

To avoid that, one suggestion is to make the quickstart instructions stateless, and keep a section below on how to make the server stateful, marked as (Optional).

The build example does not work

The build example recommends these steps:

# build SDK
go get -u github.com/bblfsh/sdk/...

# build a driver + container
git clone https://github.com/bblfsh/java-driver.git
go get -v -t ./...
make build

If I follow them, I get an error in the last one (make build):

; make build
Makefile:3: *** You must install bblfsh-sdk.  Stop.

I think the problem is that the .sdk directory is missing and the error reporting is misleading.

According to https://github.com/bblfsh/java-driver/blob/master/README.md a bblfsh-sdk pre-build is needed to generate the .sdk directory.

add roadmap section

Add a roadmap section so that people know what to expect from Babelfish in the near future.

Bad indentation in python-example

https://github.com/bblfsh/documentation/blob/master/using-babelfish/clients.md#python-example

Should be

import bblfsh

from bblfsh import filter as filter_uast

if __name__ == "__main__":
    client = bblfsh.BblfshClient("0.0.0.0:9432")
    response = client.parse("some_file.py")

    if response.status != 0:
        raise Exception('Some error happened: ' + str(response.errors))

    query = "//*[@roleIdentifier and not(@roleQualified)]"
    nodes = filter_uast(response.uast, query)
    for n in nodes:
        print(n)

macOS: "getting started" instructions failing to install drivers

I was following the Getting Started instructions to install bblfshd and the drivers.
First I typed
$ docker run -d --name bblfshd --privileged -p 9432:9432 -v /tmp/bblfshd:/var/lib/bblfshd bblfsh/bblfshd

And everything seemed OK. Output of docker logs bblfshd:

time="2017-10-24T09:02:06Z" level=info msg="bblfshd version: v2.1.0 (build: 2017-10-11T14:17:00+0000)"
time="2017-10-24T09:02:06Z" level=info msg="initializing runtime at /var/lib/bblfshd"
time="2017-10-24T09:02:06Z" level=info msg="control server listening in /var/run/bblfshctl.sock (unix)"
time="2017-10-24T09:02:06Z" level=info msg="server listening in 0.0.0.0:9432 (tcp)"

However, when I typed
$ docker exec -it bblfshd bblfshctl driver install --all

I got this in return:

Installing python driver language from "docker://bblfsh/python-driver:latest"... Error
Error, mkdir /var/lib/bblfshd/tmp/image773185457: permission denied
Installing java driver language from "docker://bblfsh/java-driver:latest"... Error
Error, mkdir /var/lib/bblfshd/tmp/image255471964: permission denied

I don't know if this relates to issue #97, opened by Alex.

Call for Ideas for Summer of Code 2018

It's the GSoC 2018 organization CFP period, and I thought the Bblfsh project might want to participate, so why don't we start gathering preliminary project ideas?

One idea off the top of my head: adding more drivers for new languages.

What do you guys think?

Point to enry and github/linguist

From @ajnavarro yesterday (more):

  • I need to know the language of the file beforehand to use bblfsh
    • talk about enry in the bblfsh documentation to fill that need

Somewhere in the documentation we should clarify that when we choose language keys, we use github/linguist as reference (languages.yml) and both enry in Go and linguist in Ruby will do the job if language detection is needed before passing to Babelfish.

Python AST => UAST: Missing UAST Nodes and other things to consider

Missing nodes

  • Operators: I didn't find any reference in the docs (I'm going by: https://godoc.org/github.com/bblfsh/sdk/uast). The Python AST has nodes for these operators (and others, like Index, that I discuss as their own items on this list):
	Compare          => (comparators) .ops[list] = Eq | NotEq | Lt | LtE | Gt | GtE | Is | IsNot | In | NotIn
	BoolOp           => .boolop = And | Or
	BinOp            => .op = Add | Sub | Mult | MatMult | Div | Mod | Pow | LShift | RShift | BitOr |
	                          BitXor | BitAnd | FloorDiv
	UnaryOp          => .unaryop = Invert | Not | UAdd | USub
  • FunctionInvocation, FunctionInvocationName, FunctionInvocationArgumentList, FunctionInvocationArgument, FunctionInvocationArgumentDefaultValue (and MethodInvocationArgumentDefaultValue).

  • FunctionInvocationObject for languages that support emulating method calls with functions like C#, Ruby, D, Nim, etc (3.toString, etc).

  • LambdaExpression, LambdaArguments, LambdaBody

  • BoolLiteral

  • UnicodeString or RawString or ByteString and decide what the default String is

  • Compound literals (with child nodes): ComplexLiteral, TupleLiteral, ListLiteral, SetLiteral, DictLiteral (or HashLiteral), FormattedStringLiteral or ParametrizedStringLiteral.

  • Async / Await or Join and/or the more specific (in Python) AsyncDef, AsyncFor and AsyncWhile.

  • Comprehension (ListComprehension, DictComprehension, SetComprehension).

  • GeneratorExpression, Yield, YieldFrom.

  • IfExpression (a = 3 if condition else 4 in Python, or a = condition ? value : elseValue in C-derived languages).

  • ForEach, ForEachIter, ForEachTarget

  • ForElse, WhileElse, DoWhileElse: in Python and other languages, loops can have an "else" clause that runs if the loop reaches its end (by iterating over everything in the case of the for, or by the condition becoming false in the case of the while) without a break. It could be clearer if we call it “ForComplete, WhileComplete, etc”.

  • TryElse: Python exceptions can have an else "If there is no exception run this". Could be called “TryNoException”.

  • BlockScopeResource, BlockScopeResourceObject, BlockScopeResourceAlias: for representing blocks in Python (and other languages) with a resource that will be freed at the end (with open("file.txt") as f:).

  • Delete

  • Print / Echo is a keyword in a lot of languages.

  • Keyword for other very language-specific keywords with a SimpleIdentifier subnode and optionally argument lists. For example in Python we could use it for: global, nonlocal, exec, eval, ellipsis.

  • ExpandOperator (* in Python to expand a list).

  • Annotation (like argument or assignment annotations in Python).

  • AugmentedAssignment (+=, -=, etc.).

  • Unary[Pre|Post]Increase / Unary[Pre|Post]Decrease (a++, --a, etc.). Or UnaryOperator with an IncreaseOperator or DecreaseOperator child node (Python doesn’t have this but C-derived languages do).

  • IndexExpression, IndexSliceExpression, IndexCompoundExpression. In Python we have Subscript ([]) that can have an Index child node ([3]), or a Slice child node with lower, upper, step ([3:2:1]), or an ExtSlice that can have any number of Index or Slice children ([3,2:4,2:7,3…]).
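
The operator nodes listed at the top of this list can be inspected with Python's standard ast module. A minimal sketch; operators_in is a hypothetical helper written for this illustration, not part of any driver:

```python
import ast


def operators_in(source):
    """Return (node type, [operator class names]) for each operator node."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Compare):
            # A chained comparison keeps all its operators in one list.
            found.append(("Compare", [type(op).__name__ for op in node.ops]))
        elif isinstance(node, (ast.BoolOp, ast.BinOp, ast.UnaryOp)):
            # These nodes carry a single operator in their .op field.
            found.append((type(node).__name__, [type(node.op).__name__]))
    return found
```

For instance, operators_in("a < b <= c") yields a single Compare entry holding both Lt and LtE, which is why Compare is shown above with a list of ops.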

Other stuff

  • Comments and Whitespace: I added them as properties of other "real" nodes (nodes.leading_comments, nodes.sameline_comments). This is the approach that two other FSTs are using. They could certainly be converted to nodes with some effort, but we should consider whether they’re better represented this way (this structure of noops as node properties makes it pretty easy to recover the original code).

  • Python AST nodes that assign or read (so most of them) have a "ctx" or “expression context” field that indicates if the node is written to (“Store”), is read (“Load”), or deleted (“Del”). While it’s interesting, it can also be 100% inferred, so I omitted that.
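
For reference, the "ctx" field can be seen directly with the standard ast module. A small sketch; name_contexts is a hypothetical helper for this illustration only:

```python
import ast


def name_contexts(source):
    """Return (identifier, context class name) for each Name node, in source order."""
    names = [n for n in ast.walk(ast.parse(source)) if isinstance(n, ast.Name)]
    # ast.walk does not guarantee source order, so sort by position.
    names.sort(key=lambda n: (n.lineno, n.col_offset))
    return [(n.id, type(n.ctx).__name__) for n in names]
```

In x = y followed by del x, the three names come out as Store, Load and Del, each of which could also be inferred from the parent statement.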

Python function/method calls

The Call node in Python is used for both methods and functions, but there are differences that will allow us to differentiate normal calls from method calls:

  • The AST for a method has this form on its Call->func node:
"func": {"ast_type": "Attribute",
            "attr": "update",
            "value": {"ast_type": "Name", "id": "retDict"}
           }
  • While normal non method calls have this form:
"func": {"ast_type": "Name",  "id": "export_dict"}

The differences that will allow us to identify one from the other are:

  • The method has the "Attribute" ast_type while the function has simply “Name”.

  • The method has a "value" subnode and Call.func.value.id is the name of the object on which the call is being made.

  • The method name is on the Call.func.attr node, while the function name is on Call.func.id.

With this it will be easy to identify one or the other, with the caveat that non-method functions used with the module name (example: ast.parse()) use the "method" form. This makes sense because in Python modules are (singleton) objects, but if the function was imported as “from ast import parse” then it uses the second form. So we could just keep them as MethodCalls, or I could add some intelligence to the AST exporter to find out whether the left side of the dot is an imported module and mark those as normal functions. This could fail in some cases, because in Python you can play with and modify the import system, but those cases should be rare and we could fall back to leaving them as Method in that case.
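
A rough sketch of that heuristic using the standard ast module. classify_calls and imported_modules are hypothetical helpers for this illustration, not the exporter's actual code, and only plain "import x" statements are considered:

```python
import ast


def imported_modules(tree):
    """Names bound by plain 'import x' / 'import x as y' statements."""
    names = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import a.b" binds "a"; "import a.b as c" binds "c".
                names.add(alias.asname or alias.name.split(".")[0])
    return names


def classify_calls(source):
    """Return (kind, name, receiver) for each call, in source order."""
    tree = ast.parse(source)
    modules = imported_modules(tree)
    calls = [n for n in ast.walk(tree) if isinstance(n, ast.Call)]
    calls.sort(key=lambda n: (n.lineno, n.col_offset))
    results = []
    for call in calls:
        func = call.func
        if isinstance(func, ast.Attribute):
            receiver = func.value.id if isinstance(func.value, ast.Name) else None
            # Left side of the dot is an imported module: treat as a function.
            kind = "function" if receiver in modules else "method"
            results.append((kind, func.attr, receiver))
        elif isinstance(func, ast.Name):
            results.append(("function", func.id, None))
    return results
```

As the paragraph above notes, this kind of lookup can be fooled by code that manipulates the import system, so falling back to "method" for unknown receivers is the conservative choice.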

PS: this site is the best description of Python AST nodes:
https://docs.python.org/3/library/ast.html

Define ending position column

There is an inconsistency between the Java and Python drivers. EndPosition.End.Col for the Java driver returns the position of the next character, while the Python driver returns the position of the last character. For example, for the code print, the node will have EndPosition.End.Col = 6 for Java and EndPosition.End.Col = 5 for Python.
Currently, it isn't defined what a driver should return, so it depends on the native AST.
The position should be defined, and the drivers fixed accordingly afterwards.
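
The two conventions differ only in whether the end column is inclusive or exclusive. A minimal Python sketch, assuming 1-based columns and a print token that starts at column 1 (the issue states neither); end_column is a hypothetical helper, not part of any driver:

```python
def end_column(start_col, token, exclusive):
    """End column of a single-line token that starts at start_col (1-based).

    exclusive=True  -> column of the character right after the token
    exclusive=False -> column of the token's last character
    """
    last_char_col = start_col + len(token) - 1
    return last_char_col + 1 if exclusive else last_char_col
```

Under these assumptions, the exclusive convention gives 6 for print and the inclusive one gives 5, matching the values reported for the Java and Python drivers respectively.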

Add simple comparison table for alternatives

It would be nice to have a page where Babelfish is briefly compared with other similar efforts. Highlighting key differences would help people from other communities get interested in Babelfish as well as provide better understanding of possible solutions "landscape".

Other efforts worth comparing to:

Something like https://github.com/OpenGrok/OpenGrok/wiki/Comparison-with-Similar-Tools or even simpler - a paragraph about each tool would be great.

What would be the best place for such a page? If you guys see value in this, I'll be happy to allocate some time and contribute a first draft.

fix ugly mermaid diagrams

SVG diagrams generated from Markdown with mermaid are ugly and look broken. Let's fix them or switch to some nice PNGs.

$ grep -nr --exclude='node_modules/*' mermaid **/*.md
architecture.md:8:```mermaid
architecture.md:25:```mermaid
driver/protocol.md:21:```mermaid
uast/specification.md:105:```mermaid
uast/specification.md:114:```mermaid
