scala-native / scala-native-bindgen
Scala Native Binding Generator
Home Page: https://scala-native.github.io/scala-native-bindgen/
License: BSD 3-Clause "New" or "Revised" License
Currently all bindings are stored in a couple of global std::string variables that are updated by TreeVisitor.
It would be better to store bindings in some intermediate representation, so it would be possible to rearrange the code, make it more consistent and possibly filter unused declarations.
struct b;
struct c;
struct a {
    struct b *bb;
};
struct b {
    struct c *cc;
};
struct c {
    struct a *aa;
};
This produces the following Scala code without any warnings:
type struct_b = native.CStruct1[native.Ptr[struct_c]]
type struct_a = native.CStruct1[native.Ptr[struct_b]]
type struct_c = native.CStruct1[native.Ptr[struct_a]]
The problem is in CycleDetection::isCyclic: it only checks whether a type references itself directly, so a cycle that runs through several types is not detected.
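A minimal sketch of a check that also catches indirect cycles is shown below. It is not the project's CycleDetection code: the TypeGraph alias and the reaches helper are hypothetical, and the graph is assumed to map each type name to the names of the types it refers to.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical dependency graph: type name -> names of the types it refers to.
using TypeGraph = std::map<std::string, std::vector<std::string>>;

// Depth-first search that reports whether `start` can be reached again
// by following references from `current`.
static bool reaches(const TypeGraph &graph, const std::string &current,
                    const std::string &start, std::set<std::string> &visited) {
    auto it = graph.find(current);
    if (it == graph.end()) return false;
    for (const std::string &next : it->second) {
        if (next == start) return true;
        // Only descend into types that have not been explored yet.
        if (visited.insert(next).second && reaches(graph, next, start, visited))
            return true;
    }
    return false;
}

// True if `name` takes part in a reference cycle of any length.
bool isCyclic(const TypeGraph &graph, const std::string &name) {
    std::set<std::string> visited;
    return reaches(graph, name, name, visited);
}
```

With the three structs above, all of struct_a, struct_b and struct_c would be reported as cyclic, while a type that merely points at one of them would not.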
C standard headers often allow selecting which specification to conform to via defines, for example _POSIX_C_SOURCE. These defines can be passed with -extra-arg:
./scalaBindgen /usr/include/ctype.h -name ctype \
-extra-arg=-D_POSIX_C_SOURCE \
-extra-arg=-D_DONT_USE_CTYPE_INLINE_
The goal of this ticket is to think about how we could provide a way to target a specific specification across different platforms.
Currently this struct:
struct MyStruct {
private:
int field1;
int filed2;
};
produces this implicit helper class:
implicit class struct_MyStruct_ops(val p: native.Ptr[struct_MyStruct]) extends AnyVal {
def `private`: native.CInt = !p._1 // incorrect name
def `private_`=(value: native.CInt):Unit = !p._1 = value
def filed2: native.CInt = !p._2
def filed2_=(value: native.CInt):Unit = !p._2 = value
}
The first field has the wrong name (private instead of field1).
I suppose that the implicit class should not contain getters and setters for private fields.
This is still a draft, will fill out with more details in the coming days.
Experience from sbt-graphql suggests that code generation is much more robust if restricted to a single root object.
Example:
@native.link("bindgentests")
@native.extern
object bzip2 {
  type enum_days = native.CUnsignedInt
  object enum_days {
    final val MONDAY: enum_days = 0.toUInt
    final val TUESDAY: enum_days = 200.toUInt
    final val WEDNESDAY: enum_days = 201.toUInt
    final val THURSDAY: enum_days = 4.toUInt
    final val FRIDAY: enum_days = 5.toUInt
    final val SATURDAY: enum_days = 3.toUInt
    final val SUNDAY: enum_days = 4.toUInt
  }
  type struct_point = native.CStruct2[native.CInt, native.CInt]
  object struct_point {
    def apply(x: native.CInt, y: native.CInt)(implicit z: native.Zone): native.Ptr[struct_point] = {
      val ptr = native.alloc[struct_point]
      if (x != 0)
        !ptr._1 = x
      ptr
    }
  }
  def getPoint(): native.Ptr[struct_point] = native.extern
  def get_WEDNESDAY(): enum_days = native.extern
  object implicits {
    implicit class struct_point_ops(val p: native.Ptr[struct_point]) extends AnyVal {
      def x: native.CInt = !p._1
      def x_=(value: native.CInt):Unit = !p._1 = value
      def y: native.CInt = !p._2
      def y_=(value: native.CInt):Unit = !p._2 = value
    }
  }
}
Usage:
val a: bzip2.enum_days = bzip2.enum_days.MONDAY
import bzip2.implicits._
Zone { implicit zone =>
  val point = bzip2.struct_point(0, 1)
  assert(point.x == 0)
  point.x = 42
  assert(point.x == 42)
}
I'm not sure whether the warning is intended as part of a test. If it is, we might want a way to capture it and ensure that it is reported.
[info] BindgenSpec:
[info] Bindgen
[info] - should exist
[info] - should generate bindings for native.h
[info] - should generate bindings for PrivateMembers.h
[info] - should generate bindings for Enum.h
[info] - should generate bindings for Struct.h
[info] - should generate bindings for NativeTypes.h
Warning: integer value does not fit into 8 bytes: 18446744073709551615
Warning: integer value does not fit into 8 bytes: 9223372036854775809
[info] - should generate bindings for LiteralDefine.h
[info] - should generate bindings for ReservedWords.h
[info] - should generate bindings for Function.h
[info] - should generate bindings for Union.h
[info] - should generate bindings for Typedef.h
This code:
struct myStruct {
    struct {
        int a;
    } innerStruct;
};
It generates the following type:
type struct_myStruct = native.CStruct1[native.CArray[Byte, native.Nat.Digit[native.Nat._3, native.Nat._2]]]
So the inner anonymous struct is represented as an array of bytes.
It should be possible to generate a type for the inner struct:
type struct_myStruct_anonymous0 = native.CStruct1[native.CInt]
type struct_myStruct = native.CStruct1[struct_myStruct_anonymous0]
Currently the value of each enumerator is calculated using a counter, but some enumerators may have an explicit value, so the counter may assign wrong numbers to them.
This should solve the problem:
en->getInitVal()->getLimitedValue()
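To illustrate why a plain counter goes wrong, here is a small model of how C assigns enumerator values. It does not use the clang API; the Enumerator struct and resolveValues function are hypothetical names made up for this sketch.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of an enumerator: a name plus an optional explicit value.
struct Enumerator {
    std::string name;
    bool hasExplicitValue;
    std::int64_t explicitValue;  // only meaningful if hasExplicitValue is true
};

// Assigns values the way C does: an explicit value is used as-is, otherwise
// the value is the previous enumerator's value plus one (zero for the first).
std::vector<std::pair<std::string, std::int64_t>>
resolveValues(const std::vector<Enumerator> &enumerators) {
    std::vector<std::pair<std::string, std::int64_t>> result;
    std::int64_t next = 0;
    for (const Enumerator &e : enumerators) {
        std::int64_t value = e.hasExplicitValue ? e.explicitValue : next;
        result.emplace_back(e.name, value);
        next = value + 1;  // the counter restarts from the explicit value
    }
    return result;
}
```

A counter that ignores explicit values would number the enumerators 0, 1, 2, and so on, whereas here an enumerator following one with the explicit value 200 gets 201. Reading the value from the AST, as suggested above, avoids reimplementing this rule at all.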
Currently bindgen outputs methods with the same names as in the native library.
It would be good to have a way to transform names according to Scala naming conventions:
// before
def my_function(arg: native.CInt): native.CInt = native.extern
// after
@name("my_function")
def myFunction(arg: native.CInt): native.CInt = native.extern
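One possible shape for the name transformation on the generator side is sketched below. The function name toCamelCase and the decision to keep leading underscores are assumptions for illustration, not existing bindgen behaviour.

```cpp
#include <cctype>
#include <string>

// Converts a C-style snake_case identifier to camelCase.
// Leading underscores are kept so that names such as __a stay recognisable.
std::string toCamelCase(const std::string &name) {
    std::string result;
    std::size_t i = 0;
    while (i < name.size() && name[i] == '_') result += name[i++];
    bool upperNext = false;
    for (; i < name.size(); ++i) {
        char c = name[i];
        if (c == '_') {
            upperNext = true;  // drop the underscore, capitalise what follows
        } else if (upperNext) {
            result += static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
            upperNext = false;
        } else {
            result += c;
        }
    }
    return result;
}
```

Because two distinct C names can map to the same Scala name (for example a_b and aB), a real implementation would also need to detect collisions and keep the original name in the @name annotation, as in the example above.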
Ideally we should also enable -Werror and fail the CI build if any warnings creep in.
The generator outputs lots of duplicate warnings:
Warning: Const qualifier not supported
Warning: Const qualifier not supported
Warning: Const qualifier not supported
....
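One way to avoid the repetition would be to route warnings through a small collector that remembers what has already been reported. The WarningCollector class below is a hypothetical sketch, not part of the generator.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical warning sink that keeps each distinct message only once,
// in the order in which it was first reported.
class WarningCollector {
  public:
    // Returns true if the message is new and was recorded.
    bool warn(const std::string &message) {
        if (!seen.insert(message).second) return false;  // duplicate
        ordered.push_back(message);
        return true;
    }

    const std::vector<std::string> &messages() const { return ordered; }

  private:
    std::set<std::string> seen;
    std::vector<std::string> ordered;
};
```

The collected messages could then be printed once at the end of the run, which would also give tests a single place to inspect which warnings were raised.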
Right now const is discarded; however, it should be possible to support its use in certain places, namely:
extern const int PI;
struct point { int x; int y; };
const struct point *get_cursor(void);
typedef bool (*visitor)(const struct point *point);
For variables (#70), the Scala definition should use val, and for structs, one idea is to generate a separate type alias for const structs with an ops implicit class that does not contain the field_= methods. It would be nice to come up with an encoding that allows non-const structs to be assigned to a const version and disallows the inverse.
It would be great to detect leaks and problematic memory access by running CI tests with valgrind.
Consider the following structure with a const array:
struct structWithArray {
    const int arr[5];
};
Bindgen generates the following type and helper methods:
type struct_structWithArray = native.CStruct1[native.CArray[native.CInt, native.Nat._5]]
// ...
implicit class struct_structWithArray_ops(val p: native.Ptr[struct_structWithArray]) extends AnyVal {
  def arr: native.CArray[native.CInt, native.Nat._5] = !p._1
  def arr_=(value: native.CArray[native.CInt, native.Nat._5]):Unit = !p._1 = value
}
def struct_structWithArray()(implicit z: native.Zone): native.Ptr[struct_structWithArray] = native.alloc[struct_structWithArray]
Let's create an instance of struct_structWithArray and an array of int:
package org.scalanative.bindgen.samples
import utest._
import scala.scalanative.native._
import org.scalanative.bindgen.samples.StructHelpers._
// ...
val structWithArray = struct_structWithArray()
val array = alloc[CArray[CInt, Nat._5]]
The following code works fine:
!structWithArray._1 = !array
But code that uses the helper method fails:
structWithArray.arr_=(!array) // fails
[error] scala.scalanative.util.UnsupportedException: can't cast from Array(Int,5) to Array(Class(Top(java.lang.Object)),0)
[error] at scala.scalanative.util.package$.unsupported(package.scala:20)
[error] at scala.scalanative.optimizer.pass.AsLowering.onInst(AsLowering.scala:15)
[error] at scala.scalanative.optimizer.Pass.$anonfun$onInsts$1(Pass.scala:44)
[error] at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:234)
[error] at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:59)
[error] at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:52)
[error] at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
[error] at scala.collection.TraversableLike.map(TraversableLike.scala:234)
[error] at scala.collection.TraversableLike.map$(TraversableLike.scala:227)
[error] at scala.collection.AbstractTraversable.map(Traversable.scala:104)
[error] at scala.scalanative.optimizer.Pass.onInsts(Pass.scala:44)
[error] at scala.scalanative.optimizer.Pass.onInsts$(Pass.scala:43)
[error] at scala.scalanative.optimizer.pass.AsLowering.onInsts(AsLowering.scala:9)
[error] at scala.scalanative.optimizer.Pass.onDefn(Pass.scala:32)
It seems to be related to scala-native/scala-native#555, because simply creating a function that accepts an array also fails with the same error:
def useArray(value: CArray[CInt, Nat._5]): Unit = {
// fails
}
I suggest changing the helper methods for CArray or CStruct fields so that they accept and return pointers.
extern int __a;
void __private();
import scala.scalanative._
import scala.scalanative.native._
@native.link("mylib")
@native.extern
object mylib {
val __a: native.CInt = native.extern
}
Such structs are represented as native.CArray.
Currently no helpers are generated for them.
Right now the generated code has these three imports:
import scala.scalanative._
import scala.scalanative.native._
import scala.scalanative.native.Nat._
This may lead to issues if a binding uses the name native or one of the natural number symbols like _1. In order to avoid such conflicts we need to be more explicit in the generated code, for example by prefixing everything, as in scalanative.native.CInt.
Alternatively we need to document which names are not allowed.
Certain symbols in the standard C library are exposed under the expected names using a #define rather than an extern global variable, e.g. stdin from stdio.h:
extern FILE *__stdinp;
// ...
#define stdin __stdinp
The way this is handled in Scala Native is to generate a C function which simply returns the expected symbol, e.g.:
void *scalanative_libc_stdin() { return stdin; }
which is then bound using:
@extern
object stdio {
  // ...
  @name("scalanative_libc_stdin")
  def stdin: Ptr[FILE] = extern
}
The question is whether the binding generator should handle this case by generating C code, or whether it should generate per-platform bindings that use the final symbol, e.g.:
@extern
object stdio {
  // ...
  @name("__stdinp")
  def stdin: Ptr[FILE] = extern
}
Currently, a recursive struct causes an error message and is not generated, although the rest of the code is still generated, which is good. An example error follows:
Error: struct___darwin_pthread_handler_rec is cyclic
struct___darwin_pthread_handler_rec
Unit
struct___darwin_pthread_handler_rec
void (void *)
1
Scala Native does not support this feature directly as far as I know, so I believe that the inner struct would have to be a Ptr[Byte], with the client code casting it to access the field.
Related: #77
Currently the test suite is failing. Not sure whether it is because it is expecting a specific OS version and LLVM version.
It would be good to fix the test suite to pass against one LLVM version and have a minimal CI build that runs the tests on each PR or pushed commit so we can start tracking that. I suggest we also remove any tests that depend on system headers until we fix pruning unused types from the generated code.
Given that we have invested in setting up this project, for example created issues, I propose that we ask GitHub support to turn it into a standalone repository. This would make a lot of things easier such as creating PRs, which sometimes accidentally end up in mrRosset/scala-native-bindgen.
To detach the fork and turn it into a standalone repository on GitHub, contact GitHub Support. If the fork has forks of its own, let support know if the forks should move with your repository into a new network or remain in the current network. For more information, see "About forks."
Usually enums are represented as unsigned int, but the underlying type might also be char, a signed integer type or an unsigned integer type. Currently all enum bindings have the native.CInt type; that should be changed.
Maybe this will help. At least it outputs unsigned long if an enum value is too big for int:
enumdecl->getIntegerType()
Currently, hand-written code uses tricks to make the code shorter and cleaner, such as imports. A good example is the following:
https://github.com/scala-native/scala-native/blob/master/nativelib/src/main/scala/scala/scalanative/posix/dirent.scala
The key points are the imports that eliminate the native. and Nat. prefixes. Also, in hand-written code the struct is just named dirent.
In the native types test, both the uchar.h and stddef.h header files are included, which, depending on the platform, will generate additional type aliases and struct definitions for types read from those headers even if those types are unused.
In order to make it easy to generate bindings, users should be able to automatically download the binary from the GitHub release page or via an sbt plugin.
The binding generator should be able to both integrate with bindings for the standard library as provided by Scala Native as well as bindings provided by 3rd party libraries.
For example, one package provides bindings for gtk and another package bindings for a new gtk component. The component package should ideally just declare a dependency on the gtk binding and not have to include bindings for all of gtk.
One idea would be to have the binding generator create a file with types and information similar to the stdHeader.config file which can be packaged together with a binding and then loaded from the JAR.
Bindgen generates types in the order they appear in a header file.
Consider the following example:
typedef struct points points;
typedef struct point point;
struct points {
    point *point1;
    point *point2;
};
struct point {
    int x;
    int y;
};
It will generate types in the following order:
type struct_points = native.CStruct2[native.Ptr[point], native.Ptr[point]]
type points = struct_points
type struct_point = native.CStruct2[native.CInt, native.CInt]
type point = struct_point
In the generated code points appears before point although points uses point.
To sort the types we need to know which types are used by other types and then do a topological sort.
There is also one problem that should be considered: types may have cyclic dependencies:
struct b;
struct c;
struct a {
    struct b *b;
};
struct b {
    struct c *c;
};
struct c {
    struct a *a;
};
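The ordering described above can be sketched as a depth-first topological sort that simply skips back edges, so cyclic input still terminates. The Dependencies alias and the function names are hypothetical; the input is assumed to map each type name to the names of the types it uses.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical input: type name -> names of the types it uses.
using Dependencies = std::map<std::string, std::vector<std::string>>;

static void visitType(const Dependencies &deps, const std::string &name,
                      std::set<std::string> &visited,
                      std::vector<std::string> &order) {
    // Marking the node before descending means a back edge in a cycle
    // is skipped instead of recursing forever.
    if (!visited.insert(name).second) return;
    auto it = deps.find(name);
    if (it != deps.end())
        for (const std::string &used : it->second)
            visitType(deps, used, visited, order);
    order.push_back(name);  // emitted only after the types it uses
}

// Returns the type names so that, cycles aside, every type comes
// after the types it uses.
std::vector<std::string> sortTypes(const Dependencies &deps) {
    std::set<std::string> visited;
    std::vector<std::string> order;
    for (const auto &entry : deps) visitType(deps, entry.first, visited, order);
    return order;
}
```

For types inside a cycle no valid order exists, so this sketch emits them in whatever order the traversal happens to reach them; whether that is acceptable depends on how the generated Scala handles forward references.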
Document the required tools (CMake, make, LLVM with CMake definitions) for building scala-native-bindgen.
Add a tools project to provide an API for build tools to easily use the generator. The API could be based on the builder pattern, e.g. https://docs.rs/bindgen/0.37.0/bindgen/struct.Builder.html
Use the tooling API to create a simple sbt project which can eventually be used by the tests and bindings (#59)
Make sure the following code works:
struct point;
struct point *move(struct point *point, int x, int y);
struct point { int x; int y; };
At some point we should adopt the configuration and scripts used by Scala Native for consistency:
And check via Travis CI.
In standard C headers, __ and _ are often prefixed to symbols and types which are considered internal. Examples:
unsigned long ___runetype(__darwin_ct_rune_t);
__darwin_ct_rune_t ___tolower(__darwin_ct_rune_t);
__darwin_ct_rune_t ___toupper(__darwin_ct_rune_t);
In order to generate platform-independent bindings we need to be able to filter out such symbols.
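A deliberately simple filter could treat every identifier that starts with an underscore as internal. The function names below are hypothetical, and the rule is a heuristic: it would also drop any public API that happens to use a leading underscore.

```cpp
#include <string>
#include <vector>

// Heuristic: identifiers beginning with an underscore are treated as internal.
bool isInternalName(const std::string &name) {
    return !name.empty() && name[0] == '_';
}

// Returns only the names that are not considered internal, in input order.
std::vector<std::string> withoutInternalNames(const std::vector<std::string> &names) {
    std::vector<std::string> kept;
    for (const std::string &name : names)
        if (!isInternalName(name)) kept.push_back(name);
    return kept;
}
```

Filtering a declaration is only safe if nothing that is kept still refers to it, so a check like this would probably need to be combined with usage information about the remaining types.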
Generating bindings for uv.h gives the following errors:
Error: type declaration for struct___fsid_t was not found.
Error: type declaration for struct___va_list_tag was not found.
Error: type declaration for struct__IO_FILE_plus was not found.
...
This happens when the declaration of an opaque type is not found.
It does not seem to be a problem, but such cases should be handled carefully.
I checked a couple of these types and they are used in one of the following situations:
Byte
.uv.h
these variables are private, so they should be filtered.Byte
Write proper documentation for bindgen using Paradox
A sensible default could either be hard-coded or based on the path of the file.
Static linking is great for packaging but not great for local installation from source. Also it causes valgrind to complain. The fix should be to only statically link the program when -DSTATIC_LINKING=1 is passed to CMake.
Generating bindings for /usr/include/uv.h gives lots of errors and does not output Scala code:
Failed to get declaration for struct __fsid_t
Failed to get declaration for struct _IO_FILE
Failed to get declaration for struct __va_list_tag
...
The problem is that the typedef for a struct can be located above the struct declaration:
typedef struct myStruct mystruct;
struct myStruct {
    int a;
};
So when TreeVisitor visits the typedef it is unable to translate struct myStruct because the type is not yet in TypeTranslator::aliasesMap.
While working on https://github.com/jonas/scala-native-bindgen/commits/shared-ptr I noticed at least one other use where the bit length returned by ctc->getTypeSize() was not converted to bytes.
The goal of this ticket is to review all call sites and fix them, ideally with tests and a helper to do the conversion in a single place.
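The single conversion helper could look roughly like this. The name bitsToBytes is hypothetical, and the sketch assumes 8-bit bytes and rounds up so that a size that is not a whole number of bytes is never truncated.

```cpp
#include <cstdint>

// Converts a size in bits to a size in bytes, rounding up.
// Assumes 8-bit bytes.
constexpr std::uint64_t bitsToBytes(std::uint64_t bits) {
    return bits / 8 + (bits % 8 != 0 ? 1 : 0);
}
```

Call sites would then wrap the value returned by getTypeSize() in this helper instead of dividing by 8 inline, which makes a missed conversion easier to spot in review.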
Add initial sbt project with generated bindings and tests for those bindings.
A possible candidate is bzip2 (https://rust-lang-nursery.github.io/rust-bindgen/tutorial-0.html) which should live under bindings/bzip2.
The goal is for the initial bindings to serve as a template for future contributions. As such, steps to contribute new bindings should also be documented.
The following declarations are not currently supported.
extern int forty_two;
extern enum { SYSTEM, USER } who;
extern const char version[];
extern struct {
    int major;
    int minor;
    int patch;
} semver;
AST
> clang -Xclang -ast-dump -fsyntax-only tests/samples/Extern.h
TranslationUnitDecl 0x7fcd6b004ae8 <<invalid sloc>> <invalid sloc>
|-TypedefDecl 0x7fcd6b005060 <<invalid sloc>> <invalid sloc> implicit __int128_t '__int128'
| `-BuiltinType 0x7fcd6b004d80 '__int128'
|-TypedefDecl 0x7fcd6b0050d0 <<invalid sloc>> <invalid sloc> implicit __uint128_t 'unsigned __int128'
| `-BuiltinType 0x7fcd6b004da0 'unsigned __int128'
|-TypedefDecl 0x7fcd6b0053a8 <<invalid sloc>> <invalid sloc> implicit __NSConstantString 'struct __NSConstantString_tag'
| `-RecordType 0x7fcd6b0051b0 'struct __NSConstantString_tag'
| `-Record 0x7fcd6b005128 '__NSConstantString_tag'
|-TypedefDecl 0x7fcd6b005440 <<invalid sloc>> <invalid sloc> implicit __builtin_ms_va_list 'char *'
| `-PointerType 0x7fcd6b005400 'char *'
| `-BuiltinType 0x7fcd6b004b80 'char'
|-TypedefDecl 0x7fcd6b005708 <<invalid sloc>> <invalid sloc> implicit __builtin_va_list 'struct __va_list_tag [1]'
| `-ConstantArrayType 0x7fcd6b0056b0 'struct __va_list_tag [1]' 1
| `-RecordType 0x7fcd6b005520 'struct __va_list_tag'
| `-Record 0x7fcd6b005498 '__va_list_tag'
|-VarDecl 0x7fcd6b005778 <tests/samples/Extern.h:1:1, col:12> col:12 forty_two 'int' extern
|-EnumDecl 0x7fcd6b849440 <line:2:8, col:28> col:8
| |-EnumConstantDecl 0x7fcd6b849500 <col:15> col:15 SYSTEM 'int'
| `-EnumConstantDecl 0x7fcd6b849548 <col:23> col:23 USER 'int'
|-VarDecl 0x7fcd6b8495e0 <col:1, col:30> col:30 who 'enum (anonymous enum at tests/samples/Extern.h:2:8)':'enum (anonymous at tests/samples/Extern.h:2:8)' extern
|-VarDecl 0x7fcd6b8496d8 <line:3:1, col:27> col:19 version 'const char []' extern
|-RecordDecl 0x7fcd6b849738 <line:4:8, line:8:1> line:4:8 struct definition
| |-FieldDecl 0x7fcd6b8497f8 <line:5:5, col:9> col:9 major 'int'
| |-FieldDecl 0x7fcd6b849858 <line:6:5, col:9> col:9 minor 'int'
| `-FieldDecl 0x7fcd6b8498b8 <line:7:5, col:9> col:9 patch 'int'
`-VarDecl 0x7fcd6b849950 <line:4:1, line:8:3> col:3 semver 'struct (anonymous struct at tests/samples/Extern.h:4:8)':'struct (anonymous at tests/samples/Extern.h:4:8)' extern
In preparation for including bindings we should probably reorganize the repository soon.
One option would be:
Based on stability and progress, it might also make sense to have a tooling and sbt-plugin to make it possible to integrate the binding generator into existing projects.
The typeUsesOtherType function checks whether a type is used in another type by simply comparing strings.
This approach will fail on more complex types.
It would be better not to convert a type directly to a string but to store it in an intermediate representation.
There will be a class Type that stores either a string (for example native.CInt) or a pointer to another IR instance: Struct, Union or Enum.
There will also be the following subclasses of Type:
PointerType
ArrayType
FunctionPointerType
All of the above classes will store pointer(s) to other types.
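A reduced sketch of such a hierarchy is given below. It is only an illustration of the idea: NamedType is a hypothetical stand-in for both plain strings and Struct/Union/Enum instances, only PointerType is shown, and usesType replaces the string comparison with a walk over the pointers.

```cpp
#include <memory>
#include <string>
#include <utility>

// Base class of the sketched IR.
struct Type {
    virtual ~Type() = default;
    // Scala spelling of the type.
    virtual std::string str() const = 0;
    // True if this type refers to `other`, directly or through nested types.
    virtual bool usesType(const Type *) const { return false; }
};

// A type that is fully described by its name, e.g. native.CInt.
struct NamedType : Type {
    explicit NamedType(std::string name) : name(std::move(name)) {}
    std::string str() const override { return name; }
    std::string name;
};

// A pointer to another IR type.
struct PointerType : Type {
    explicit PointerType(std::shared_ptr<Type> pointee)
        : pointee(std::move(pointee)) {}
    std::string str() const override {
        return "native.Ptr[" + pointee->str() + "]";
    }
    bool usesType(const Type *other) const override {
        return pointee.get() == other || pointee->usesType(other);
    }
    std::shared_ptr<Type> pointee;
};
```

Comparing object identity rather than text means that two unrelated types whose names merely overlap are no longer confused, and the string form is produced only when the code is finally printed.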
Scala code will not compile if there are structs that are passed by value.
Example:
struct MyStruct {
    int a;
};
struct OtherStruct {
    struct MyStruct s;
};
This Scala code will not compile:
@native.link("Lib")
@native.extern
object Lib {
  type struct_MyStruct = native.CStruct1[native.CInt]
  type struct_OtherStruct = native.CStruct1[struct_MyStruct]
}
import Lib._
object LibMembersHelpers {
  implicit class struct_OtherStruct_ops(val p: native.Ptr[struct_OtherStruct]) extends AnyVal {
    def s: struct_MyStruct = !p._1 // error
    def s_=(value: struct_MyStruct):Unit = !p._1 = value // error
  }
}
One possible solution is to remove declarations that use structs by value.
Another is to generate C code that converts the struct to a pointer.
A variadic function gets a parameter named varArgs, but this name may already be used by a previous parameter:
def variadic_args(varArgs: native.CInt, varArgs: native.CVararg*)
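One way to avoid the clash is to pick the variadic parameter's name only after looking at the existing parameter names. The helper below is a hypothetical sketch; the numeric suffix scheme is an arbitrary choice for illustration.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Picks a name for the variadic parameter that does not clash with any
// existing parameter name by appending an increasing numeric suffix.
std::string uniqueVarArgsName(const std::vector<std::string> &parameterNames) {
    const std::string base = "varArgs";
    std::string candidate = base;
    for (int suffix = 0;
         std::find(parameterNames.begin(), parameterNames.end(), candidate) !=
             parameterNames.end();
         ++suffix) {
        candidate = base + std::to_string(suffix);
    }
    return candidate;
}
```

For the function above, whose first parameter is already called varArgs, this would yield varArgs0 for the variadic parameter.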