
sdk's People

Contributors

abeaumont, alcortesm, bzz, campoy, creachadair, dennwc, dpordomingo, erizocosmico, juanjux, lwsanty, mcarmonaa, mcuadros, ncordon, serabe, smacker, smola, tsolakoua, zoidyzoidzoid

sdk's Issues

It's possible to commit something, ignoring changes in managed files

Related to #1 (Can't commit after generating with bootstrap).

Steps to reproduce the bug:

  • upgrade the RUNTIME_NATIVE_VERSION in the manifest.toml -> git add -A
  • git commit -m "IT SHOULD FAIL" -> it fails, because a "managed file changed"
  • bblfsh-sdk update -> it generates a new README.md from the new manifest.toml specs
  • git commit -m "IT SHOULD FAIL" -> it is COMMITTED, and it shouldn't be, because the README.md changes were never added. The new commit is broken, as you can see if you run git co . and then bblfsh-sdk update --dry-run.

Possible solution:

The pre-commit hook must validate ONLY the staging area instead of the current working copy.
To do that, the process is:

  1. stash everything that is not in the staging area,
  2. perform the validations,
  3. restore the stash made in (1),
  4. exit with the exit code of the validations run in (2).
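
The four steps above can be sketched in Go as follows. This is only an illustration under stated assumptions: `Runner`, `validateStaged` and `errManagedFileChanged` are hypothetical names, not part of the SDK, and the command runner is injected so the control flow can be followed without touching a real repository.

```go
package main

import (
	"errors"
	"fmt"
)

// Runner executes an external command and reports whether it succeeded.
// A real hook would back it with os/exec; it is injected here (assumption)
// so the flow can be exercised without a repository.
type Runner func(name string, args ...string) error

// errManagedFileChanged is an example validation failure, for illustration only.
var errManagedFileChanged = errors.New("managed file changed")

// validateStaged is a hypothetical helper that follows the four steps above.
func validateStaged(run Runner, validate func() error) error {
	// (1) Stash everything that is not staged. --keep-index leaves the
	// staged content in place. Caveat: if there is nothing to stash, git
	// creates no entry, so a real hook must skip step (3) in that case.
	if err := run("git", "stash", "push", "--keep-index", "--include-untracked"); err != nil {
		return fmt.Errorf("stash failed: %w", err)
	}

	// (2) Run the validations against what is about to be committed.
	validationErr := validate()

	// (3) Restore the stash from step (1), whatever the validation result.
	if err := run("git", "stash", "pop"); err != nil {
		return fmt.Errorf("could not restore stash: %w", err)
	}

	// (4) Report the validation result; the hook would map a non-nil
	// error to a non-zero exit code.
	return validationErr
}
```

The validation result is kept in a variable until the stash has been restored, so a failing validation never leaves the working copy stashed.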

make fails on an uninitialized git repository or a repository with no commits

make fails on an uninitialized git repository or on a repository with no commits. It should either initialize the repository or print a meaningful error.

➜  mylang-driver git:(master) ✗ make all
fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
Makefile:6: warning: overriding recipe for target 'test-native'
/home/smola/dev/demos/demo-2017-03-03/1_sdk/mylang-driver/.sdk/make/rules.mk:74: warning: ignoring old recipe for target 'test-native'
Makefile:10: warning: overriding recipe for target 'build-native'
/home/smola/dev/demos/demo-2017-03-03/1_sdk/mylang-driver/.sdk/make/rules.mk:85: warning: ignoring old recipe for target 'build-native'
fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
+ mkdir -p /home/smola/dev/demos/demo-2017-03-03/1_sdk/mylang-driver/build
fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
/bin/sh: 1: eval: cannot open /home/smola/dev/demos/demo-2017-03-03/1_sdk/mylang-driver/Dockerfile.build.tpl: No such file
/home/smola/dev/demos/demo-2017-03-03/1_sdk/mylang-driver/.sdk/make/rules.mk:67: recipe for target 'bblfsh/mylang-driver-build' failed
make: *** [bblfsh/mylang-driver-build] Error 2
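
One way to get a meaningful error would be to check HEAD once, up front, instead of letting every git call fail on its own. A minimal sketch in Go, assuming hypothetical helpers `gitOutput` and `headCommit` (neither exists in the SDK) and a wording for the message that is only a suggestion:

```go
package main

import (
	"errors"
	"os/exec"
	"strings"
)

// errNoCommits is the (suggested) message shown instead of the repeated
// "ambiguous argument 'HEAD'" output.
var errNoCommits = errors.New("no commits found: initialize the repository and create a first commit before running make")

// gitOutput runs git with the given arguments and returns trimmed stdout.
func gitOutput(args ...string) (string, error) {
	out, err := exec.Command("git", args...).Output()
	return strings.TrimSpace(string(out)), err
}

// headCommit resolves HEAD through the given git function. It returns
// errNoCommits when HEAD cannot be resolved, which is the case both outside
// a repository and in a repository without commits.
func headCommit(git func(args ...string) (string, error)) (string, error) {
	sha, err := git("rev-parse", "--verify", "--quiet", "HEAD")
	if err != nil || sha == "" {
		return "", errNoCommits
	}
	return sha, nil
}
```

In real use the call would be `headCommit(gitOutput)`; the function parameter only exists so the check can be tried without a repository.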

Split ToNode into separate transformations (BIP-5)

Use the new DSL to port the old ObjectToNode transformations.

Requires: #241

TODO:

  • InternalTypeKey
  • OffsetKey, LineKey, ColumnKey
  • EndOffsetKey, EndLineKey, EndColumnKey
  • TokenKeys
  • SpecificTokenKeys
  • SyntheticTokens
  • PromotedPropertyLists
  • PromoteAllPropertyLists
  • PromotedPropertyStrings
  • TopLevelIsRootNode
  • drop OnToNode
  • drop Modifier
  • drop IsNode

Build info on driver binaries is empty

$ docker run --rm bblfsh/python-driver:latest /opt/driver/bin/driver --help
Usage:
  /opt/driver/bin/driver [OPTIONS] <command>

Help Options:
  -h, --help  Show this help message

Available commands:
  parse-native
  parse-uast
  serve
  tokenize

Build information
  commit: 
  date:

Clarify error statuses returned by native drivers

Babelfish drivers may return 3 main statuses: OK, Error, Fatal.

Currently, there is no guarantee that Error responses will have a partial AST, since it's not tested in any of the drivers.

At the same time, Error status is not considered an error/exception in clients as reported in bblfsh/java-driver#77, thus clients might not be aware that they receive a partial AST.

We should either:
a) Make a special type of error/exception that user code must explicitly check for in order to accept a partial AST. In this case clients will receive an error on any syntax error, which might be the desired behavior.
b) Mention the current behavior in the Babelfish docs and clarify that the user should check the status.

Personally, I would prefer the first option.
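
Option (a) could look roughly like this in a Go client. This is a sketch under assumptions: `Node`, `ErrPartialParse` and `acceptPartial` are hypothetical names used for illustration and are not the SDK's or any client's actual API.

```go
package main

import (
	"errors"
	"fmt"
)

// Node stands in for a UAST node; the real type lives in the SDK.
type Node struct {
	InternalType string
	Children     []*Node
}

// ErrPartialParse is a hypothetical error returned for the Error status.
// It carries the partial tree, so callers only get it by asking for it.
type ErrPartialParse struct {
	Partial *Node
	Errors  []string
}

func (e *ErrPartialParse) Error() string {
	return fmt.Sprintf("partial parse: %d error(s)", len(e.Errors))
}

// acceptPartial shows the opt-in: the caller explicitly unwraps the
// partial tree; any other error is still treated as a failure.
func acceptPartial(ast *Node, err error) (*Node, bool) {
	if err == nil {
		return ast, true
	}
	var partial *ErrPartialParse
	if errors.As(err, &partial) {
		return partial.Partial, true
	}
	return nil, false
}
```

Code that does not call something like `acceptPartial` simply sees a non-nil error on any syntax error, which matches the behavior described in (a).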

uast: merge roles for Assignment and AugmentedAssignment

Currently, we have:

	// Assignment represents a variable assignment or binding.
	// The variable that is being assigned to is annotated with the
	// AssignmentVariable role, while the value is annotated with
	// AssignmentValue.
	Assignment
	AssignmentVariable
	AssignmentValue

	// AugmentedAssignment is an augmented assignment, usually combining the equal operator with
	// another one (e.g. +=, -=, *=, etc). It is expected that children contain an
	// AugmentedAssignmentOperator with a child or additional role for the specific Bitwise or
	// Arithmetic operator used. The AugmentedAssignmentVariable and AugmentedAssignmentValue roles
	// have the same meaning as in Assignment.
	AugmentedAssignment
	AugmentedAssignmentOperator
	AugmentedAssignmentVariable
	AugmentedAssignmentValue

This feels quite redundant. I wonder if we can come up with a more succinct way of annotating augmented assignments.

support nested keys in ObjectToNoder

Take as an example the AST generated by babylon:

{
  "type": "SomeType",
  "loc": {
    "start": { "column": 1, "line": 1 },
    "end": { "column": 2, "line": 4 }
  },
  "offset": 1,
  "endOffset": 34
}

It would be nice to be able to set LineKey (and the other position keys) to a nested path such as loc.start.line instead of having to transform the AST first.
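
A nested key can be resolved by walking the decoded JSON object one segment at a time. The following is a minimal sketch, assuming a dot-separated path syntax and a hypothetical helper `lookupPath`; it is not how ObjectToNoder is implemented.

```go
package main

import "strings"

// lookupPath resolves a dot-separated key such as "loc.start.line" against
// a decoded JSON object. The boolean is false if any segment is missing or
// if an intermediate value is not an object.
func lookupPath(obj map[string]interface{}, path string) (interface{}, bool) {
	var current interface{} = obj
	for _, key := range strings.Split(path, ".") {
		object, ok := current.(map[string]interface{})
		if !ok {
			return nil, false
		}
		current, ok = object[key]
		if !ok {
			return nil, false
		}
	}
	return current, true
}
```

A plain key such as "offset" is just a path with a single segment, so existing configurations would keep working under this scheme.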

Function arguments sorted in the wrong order

While doing the integration tests for the Python driver, I noticed that the function argument list, whose items have the internalRole "args" and the role MethodInvocationArgument (children of a MethodInvocation), is not sorted in the right order.

For this code:

print("something1", 42, somesymbbol)

The complete JSON files for the Python AST and the generated UAST are attached to this issue:

unsorted_args_jsons.zip

To see it clearly, a (simplified) version of the arguments part of the Python AST is:

                  "args" : [
                     {
                        "LiteralValue" : "something1",
                        "ast_type" : "StringLiteral"
                     },
                     {
                        "ast_type" : "NumLiteral",
                        "NumType" : "int",
                        "LiteralValue" : 42
                     },
                     {
                        "id" : "somesymbbol",
                        "ast_type" : "Name"
                     }
                  ],

While the (simplified) UAST generated is:

                     {
                        "Properties" : {
                           "NumType" : "0",
                           "internalRole" : "args"
                        },
                        "Roles" : [ 59, 54 ],
                        "Token" : "42",
                        "InternalType" : "NumLiteral"
                     },
                     {
                        "InternalType" : "Name",
                        "Token" : "somesymbbol",
                        "Roles" : [ 54, 0 ],
                        "Properties" : {
                           "internalRole" : "args"
                        }
                     },
                     {
                        "InternalType" : "StringLiteral",
                        "Properties" : {
                           "internalRole" : "args"
                        },
                        "Roles" : [ 58, 54 ],
                        "Token" : "something1"
                     }

As you can see, the arguments are not in the same order in the UAST. The rule I'm using for them is:

		On(HasInternalType(pyast.Call)).Roles(MethodInvocation).Children(
			On(HasInternalRole("args")).Roles(MethodInvocationArgument),
			On(HasInternalRole("func")).Self(On(HasInternalRole("id"))).Roles(MethodInvocationName),
			On(HasInternalRole("func")).Self(On(HasInternalRole("attr"))).Roles(MethodInvocationName),
			On(HasInternalRole("func")).Self(On(HasInternalType(pyast.Attribute))).Children(
				On(HasInternalRole("id")).Roles(MethodInvocationObject),
			),
		),

Bootstrapping a driver

Some things that I found missing (it may just be a lack of documentation):

Driver

Inside the driver, after running bblfsh-sdk prepare-build:

  • I think the following should be autogenerated:
    • the native directory, if it does not exist,
    • a .gitignore with a .sdk rule; if .gitignore exists without that rule, the rule should be appended.
  • It would be great to have a hint in the root Makefile when running make if the SDK has not been installed yet, saying something like:
    "The bblfsh SDK must be installed first; this can be done with: bblfsh-sdk prepare-build"
  • It should be explained somewhere that, to start working with a driver, the following must be run, in order:
bblfsh-sdk prepare-build
bblfsh-sdk init

SDK

Inside the bblfsh/sdk project:

  • make test does not pass
> make test
--- FAIL: github.com/bblfsh/sdk/etc/skeleton/driver/normalizer :: TestNativeBinary
        Error Trace:    normalizer_test.go:18
	Error:		Expected nil, but got: &os.PathError{Op:"fork/exec", Path:"/opt/driver/src/build/native", Err:0x2}
  • Is it necessary to run make all before doing anything? Either way, the tests keep failing.

uast: define async/await

Some languages (Kotlin, Python, Nim, etc.) use async to qualify blocks (which are not always functions; sometimes they can be scoped blocks) that can run as interruptible coroutines. Usually those languages also have an await keyword for waiting for the completion of those blocks (similar to the join call/keyword for threads).

For functions this is usually done on the definition, with Go and Erlang instead having a similar keyword that is used on the call (we could have AsyncDefinition and AsyncCall for example).

We need to check several languages with built-in coroutine features and decide how to implement those roles in the AST.

Return the language when using "autodetect" feature

When the "autodetect" feature is used, passing the filename and content in the request, it would be great to get the detected language back as part of the response.

It could be something like:

message ParseResponse {
	option (gogoproto.goproto_getters) = false;
	option (gogoproto.goproto_stringer) = false;
	option (gogoproto.typedecl) = false;
	gopkg.in.bblfsh.sdk.v1.protocol.Status status = 1;
	repeated string errors = 2;
	google.protobuf.Duration elapsed = 3 [(gogoproto.nullable) = false, (gogoproto.stdduration) = true];
	gopkg.in.bblfsh.sdk.v1.uast.Node uast = 4 [(gogoproto.customname) = "UAST"];
	string language = 5;
}

instead of the current ParseResponse:

message ParseResponse {
	option (gogoproto.goproto_getters) = false;
	option (gogoproto.goproto_stringer) = false;
	option (gogoproto.typedecl) = false;
	gopkg.in.bblfsh.sdk.v1.protocol.Status status = 1;
	repeated string errors = 2;
	google.protobuf.Duration elapsed = 3 [(gogoproto.nullable) = false, (gogoproto.stdduration) = true];
	gopkg.in.bblfsh.sdk.v1.uast.Node uast = 4 [(gogoproto.customname) = "UAST"];
}

Running container fails on GOPATH with multiple directory entries

The default GOPATH on the GCE cloud console is "/home/alex/gopath:/google/gopath".

Running a build Docker container for any driver fails when mounting such a GOPATH:

+ docker run --rm -t -u bblfsh:1000 -v /home/alex/java-driver:/opt/driver/src/ -v /home/alex/gopath:/google/gopath:/go -e ENVIRONMENT=bblfsh/java-driver-build-with-go bblfsh/java-driver-build-with-go make test-driver-internal

docker: Error response from daemon: Invalid bind mount spec "/home/alex/gopath:/google/gopath:/go": invalid mode: /go.
See 'docker run --help'.
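
The bind mount breaks because the whole GOPATH list is used as a single host path. One possible fix, shown here only as a sketch, is to mount the first entry of the list. `firstGopath` and `gopathVolume` are hypothetical helpers, and choosing the first entry is an assumption about what the build needs, not a confirmed design decision.

```go
package main

import "path/filepath"

// firstGopath returns the first entry of a GOPATH list, or "" if the list
// is empty. filepath.SplitList splits on the OS-specific list separator
// (':' on Unix systems).
func firstGopath(gopath string) string {
	entries := filepath.SplitList(gopath)
	if len(entries) == 0 {
		return ""
	}
	return entries[0]
}

// gopathVolume builds the value for docker's -v flag, mounting a single
// GOPATH entry at /go. The boolean is false when GOPATH is empty.
func gopathVolume(gopath string) (string, bool) {
	first := firstGopath(gopath)
	if first == "" {
		return "", false
	}
	return first + ":/go", true
}
```

With the GOPATH from this report, the resulting bind mount spec has exactly one host path and one container path, which is the form docker expects.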

Static compilation of sdk doesn't work

> make install
CGO_ENABLED=0 go get -t -v -ldflags '-extldflags "-static"' ./...
net
go install net: open /usr/lib/go/pkg/linux_amd64/net.a: permission denied
make: *** [Makefile:32: install] Error 1

This prevents drivers from building:

> make build
+ docker build -q -t bblfsh/bash-driver-build -f /home/abeaumont/go/src/github.com/bblfsh/bash-driver/.sdk/tmp/tmp.1510160514-853298827 .
+ docker run --rm -t -u bblfsh:1000 -v /home/abeaumont/go/src/github.com/bblfsh/bash-driver:/opt/driver/src/ -e ENVIRONMENT=bblfsh/bash-driver-build -e HOST_PLATFORM=Linux bblfsh/bash-driver-build make build-native-internal
+ docker build -q -t bblfsh/bash-driver-build-with-go -f /home/abeaumont/go/src/github.com/bblfsh/bash-driver/.sdk/tmp/tmp.1510160520-277170896 .
+ docker run --rm -t -u bblfsh:1000 -v /home/abeaumont/go/src/github.com/bblfsh/bash-driver:/opt/driver/src/ -v /home/abeaumont/go:/go -e ENVIRONMENT=bblfsh/bash-driver-build-with-go -e HOST_PLATFORM=Linux bblfsh/bash-driver-build-with-go make build-driver-internal
/bin/sh: /go/bin/bblfsh-sdk-tools: not found

Implement the custom parser for Jupyter notebooks

This is to record the demand of data scientists to have *.ipynb parsed. There are quite a few already. I guess we should merge all the code cells and interpret them as a Python script.

@juanjux Do you think it should be added to the Python driver? It is going to reuse 100% of the code.
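
Merging the code cells could be sketched as below. The sketch assumes the nbformat v4 JSON layout, where each cell has a "cell_type" and a "source" that is either one string or a list of line strings; `notebook` and `codeFromNotebook` are hypothetical names, not existing driver code.

```go
package main

import (
	"encoding/json"
	"strings"
)

// notebook mirrors only the fields needed from an .ipynb file.
type notebook struct {
	Cells []struct {
		CellType string          `json:"cell_type"`
		Source   json.RawMessage `json:"source"`
	} `json:"cells"`
}

// codeFromNotebook concatenates the source of every code cell, in order,
// into a single script. Markdown and raw cells are skipped.
func codeFromNotebook(data []byte) (string, error) {
	var nb notebook
	if err := json.Unmarshal(data, &nb); err != nil {
		return "", err
	}
	var script strings.Builder
	for _, cell := range nb.Cells {
		if cell.CellType != "code" || len(cell.Source) == 0 {
			continue
		}
		// Try the list-of-lines form first, then the single-string form.
		var lines []string
		if err := json.Unmarshal(cell.Source, &lines); err != nil {
			var single string
			if err := json.Unmarshal(cell.Source, &single); err != nil {
				return "", err
			}
			lines = []string{single}
		}
		script.WriteString(strings.Join(lines, ""))
		script.WriteString("\n")
	}
	return script.String(), nil
}
```

The sketch does not handle IPython-specific syntax such as `%` magics, which is not valid Python and would need to be filtered before parsing.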

annotations don't handle multiple roles properly

When an annotation adds more than one role, only the first one is extracted in the annotation documentation. For example, the following annotations:

On(jdt.EnhancedForStatement).Roles(ForEach, Statement).Children(
	On(jdt.PropertyParameter).Roles(ForInit, ForUpdate),
	On(jdt.PropertyExpression).Roles(ForExpression),
	On(jdt.PropertyBody).Roles(ForBody),
),

generate the following annotation documentation:

| /self::\*\[@InternalType='CompilationUnit'\]//\*\[@InternalType='EnhancedForStatement'\] | ForEach |
| /self::\*\[@InternalType='CompilationUnit'\]//\*\[@InternalType='EnhancedForStatement'\]/\*\[@internalRole\]\[@internalRole='parameter'\] | ForInit |
| /self::\*\[@InternalType='CompilationUnit'\]//\*\[@InternalType='EnhancedForStatement'\]/\*\[@internalRole\]\[@internalRole='expression'\] | ForExpression |
| /self::\*\[@InternalType='CompilationUnit'\]//\*\[@InternalType='EnhancedForStatement'\]/\*\[@internalRole\]\[@internalRole='body'\] | ForBody |

while it should generate:

| /self::\*\[@InternalType='CompilationUnit'\]//\*\[@InternalType='EnhancedForStatement'\] | ForEach, Statement |
| /self::\*\[@InternalType='CompilationUnit'\]//\*\[@InternalType='EnhancedForStatement'\]/\*\[@internalRole\]\[@internalRole='parameter'\] | ForInit, ForUpdate |
| /self::\*\[@InternalType='CompilationUnit'\]//\*\[@InternalType='EnhancedForStatement'\]/\*\[@internalRole\]\[@internalRole='expression'\] | ForExpression |
| /self::\*\[@InternalType='CompilationUnit'\]//\*\[@InternalType='EnhancedForStatement'\]/\*\[@internalRole\]\[@internalRole='body'\] | ForBody |
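
The expected rows can be produced by joining every role of an annotation instead of taking only the first one. A minimal sketch; `docRow` is a hypothetical helper and not the SDK's documentation generator:

```go
package main

import (
	"fmt"
	"strings"
)

// docRow renders one row of the annotation table, listing all the roles
// attached by a rule, separated by commas.
func docRow(xpath string, roles []string) string {
	return fmt.Sprintf("| %s | %s |", xpath, strings.Join(roles, ", "))
}
```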

Fill Response.Language when Request.Language was set

Currently, the server fills the Response.Language, either with the Request.Language if it was specified, or with the autodetected language if it wasn't. But when running against a driver container directly, the driver must at least do the first part. This should be implemented in the generic (golang) part of the drivers.

Remove duplicated roles in the nodes

With the new agglutinative UAST it'll be hard to avoid some duplicated roles. They shouldn't have any effect, but they are un-classy and annoy some users.

It should be pretty easy to "uniq" the roles of a node in the SDK.
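
A sketch of such a "uniq" step, keeping the first occurrence of each role and preserving order. `Role` here is a stand-in for the SDK's role type and `uniqRoles` is a hypothetical helper name.

```go
package main

// Role stands in for the SDK's role type.
type Role int

// uniqRoles returns the roles with duplicates removed. The first occurrence
// of each role is kept, so the relative order of the remaining roles does
// not change.
func uniqRoles(roles []Role) []Role {
	seen := make(map[Role]struct{}, len(roles))
	unique := make([]Role, 0, len(roles))
	for _, role := range roles {
		if _, duplicate := seen[role]; duplicate {
			continue
		}
		seen[role] = struct{}{}
		unique = append(unique, role)
	}
	return unique
}
```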

Outdated drivers are treated as up-to-date on SDK updates

Currently, driver manifests do not mention the version of the SDK they were written for. This leads to a situation where a driver might become outdated and incompatible while still being listed as beta or even stable.

I propose adding a new (required) field to the manifest holding the semantic version of the SDK. This field will be overwritten by bblfsh-sdk update and will be considered when interpreting the driver status.

As always, minor version difference will not affect driver status, but major difference will automatically drop the status to inactive.
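
The proposed rule can be sketched as a comparison of major versions. `majorOf` and `effectiveStatus` are hypothetical helpers, and the status strings are taken from the text above; the actual manifest field and its handling are not defined yet.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// majorOf extracts the major component from a semantic version such as
// "1.4.0" or "v1.4.0".
func majorOf(version string) (int, error) {
	trimmed := strings.TrimPrefix(version, "v")
	major, err := strconv.Atoi(strings.SplitN(trimmed, ".", 2)[0])
	if err != nil {
		return 0, fmt.Errorf("invalid semantic version %q", version)
	}
	return major, nil
}

// effectiveStatus keeps the declared status when the driver was written for
// the same major SDK version, and drops it to "inactive" otherwise.
func effectiveStatus(declared, driverSDK, currentSDK string) (string, error) {
	driverMajor, err := majorOf(driverSDK)
	if err != nil {
		return "", err
	}
	currentMajor, err := majorOf(currentSDK)
	if err != nil {
		return "", err
	}
	if driverMajor != currentMajor {
		return "inactive", nil
	}
	return declared, nil
}
```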

Reconsider the expected precision of positioning

As @juanjux commented in Slack:

We probably should relax the specifications of the positions to just provide what the native AST provides or we won't make a new driver per year. Future versions could then add optionally tokenizing information in the nodes.

Port annotation DSL predicates (BIP-5)

Port the old annotation DSL predicates to the new transformation DSL.

Requires: #241

TODO:

  • Self
  • Children(HasInternalRole)
  • Roles
  • HasInternalType
  • HasProperty
  • HasToken
  • And
  • Not
  • drop Children
  • drop Descendants, DescendantsOrSelf
  • drop HasInternalRole
  • drop HasChild
  • drop Or

Expose the listing drivers endpoint for clients

As required by bblfsh/scala-client#68 and bblfsh/web#89, clients need to be able to fetch the list of installed drivers.

That API is part of bblfshd/daemon/protocol, and it is currently only exposed through a Unix socket for admin purposes, so it cannot be reached from the current clients.

Is there any plan to move the driver listing methods to the user network?

Here is the list of clients needing this:

uast: define nonlocal access *usage* modifiers

These are modifiers that in general give an enclosed scope access to a variable in an outside scope. This could mean read/write or just read access.

This concept is similar to the existing "VisibleFrom" rules already on the AST but at the point of usage instead of declaration.

For example, in Python we have "global" to allow an enclosed scope (usually a function definition) to modify a globally defined variable. There is also "nonlocal" for closures to be able to write variables of the enclosing scope. This concept (for closures) is usually called "captures", and it also exists in C++11 and other languages supporting closures (in C++11 the capture is needed even for read access; in Python the read access is automatic and the keyword is used for obtaining write access).

We should investigate what other languages have for these access usage modifiers before defining how we'll do this (for example, do languages also support module/namespace/other access modifiers?).

Odd issue with driver images fixed by reinstalling all docker images

Found while running the fixture regeneration with a freshly built python-driver image: it didn't find the python 2 binary.

After deleting the docker image, I rebuilt it, and then when trying to install it into bblfshd it said:

"Installing python language driver from "bblfsh/python-driver:dev-21b075d-dirty"... Error, manifest unknown: manifest unknown".

I deleted everything (docker rmi (docker images -q) -f; and docker rm (docker ps -a -q) -f), redownloaded bblfshd's image and then it worked.

We should investigate this if it happens again.

Use an agglutinative language for Roles

This would make the Roles language more expressive and flexible, remove complexity, make it more extensive and better suited for language analysis.

There are various tasks that need to be done for this:

More specific Parser.Tokenkeys or token-extracting annotations

Currently, all the fields added to Parser.Tokenkeys are evaluated for token extraction on all the nodes. The problem is that some nodes can have more than one of those fields, with only one of them being the token field, and this produces an error.

Two solutions could be used:

  1. Change the data structure so that every token field can optionally specify the "internal_type" on which the extraction will be done.

  2. Add a new annotation that could extract a field as the token, e.g.:

// Node of internal type "Something" as the token in the "value" field:
On(HasInternalType(pyast.Something)).Roles(SomeRoles).TokenFrom("value")
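
For comparison, solution 1 could be sketched as a map from internal type to token field. Everything here is hypothetical: `TokenKeys`, `tokenFor` and the `typeKey` parameter are illustrative names and do not exist in the SDK.

```go
package main

import "fmt"

// TokenKeys maps an internal type to the single field that holds its token.
// Hypothetical structure for solution 1.
type TokenKeys map[string]string

// tokenFor returns the token of a native node, looking only at the field
// registered for the node's internal type. typeKey names the field that
// stores the internal type (for example "ast_type" in the Python AST).
func tokenFor(keys TokenKeys, typeKey string, node map[string]interface{}) (string, bool) {
	internalType, ok := node[typeKey].(string)
	if !ok {
		return "", false
	}
	field, ok := keys[internalType]
	if !ok {
		return "", false
	}
	value, ok := node[field]
	if !ok {
		return "", false
	}
	return fmt.Sprint(value), true
}
```

Because only one field is consulted per internal type, a node that happens to contain several candidate fields no longer produces a conflict.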

Update Go version to 1.9

This update will allow us to use type aliases, that can be used to map driver.Status to protocol.Status (and other similar cases).
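
A small illustration of what a type alias provides. `protocolStatus` stands in for protocol.Status, which lives in another package; the constant names and `describe` are hypothetical and only serve the example. It requires Go 1.9 or later.

```go
package main

// protocolStatus stands in for protocol.Status.
type protocolStatus int

const (
	protocolOK protocolStatus = iota
	protocolError
	protocolFatal
)

// Status is an alias, not a new type: both names denote the same type, so
// values can be passed between them without any conversion.
type Status = protocolStatus

// describe accepts a Status, and therefore also accepts a protocolStatus.
func describe(s Status) string {
	switch s {
	case protocolOK:
		return "ok"
	case protocolError:
		return "error"
	case protocolFatal:
		return "fatal"
	}
	return "unknown"
}
```

With a defined type (`type Status protocolStatus`, without the `=`) a variable of one type could not be assigned to the other without an explicit conversion.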

Some children are randomly sorted on the pretty print

(I'll take a look at this, adding the issue to add a note on the sprint task for prefix/infix/postfix that depends on this being fixed).

How to reproduce:

  • On a driver with integration tests (for example Python or Java), delete tests/.uast and tests/.native.
  • make integration-test so these files are regenerated again.
  • git commit adding the new files.
  • Repeat the steps (delete, regenerate) without making any other changes.
  • git diff

Example commit created just for debugging this.
