google / autocxx

Tool for safe ergonomic Rust/C++ interop driven from existing C++ headers

Home Page: https://docs.rs/autocxx

License: Apache License 2.0

Rust 99.73% Shell 0.16% JavaScript 0.01% C++ 0.11%
bindgen cxx rust

autocxx's Introduction

Autocxx

This project is a tool for calling C++ from Rust in a heavily automated, but safe, fashion.

The intention is that it has all the fluent safety from cxx whilst generating interfaces automatically from existing C++ headers using a variant of bindgen. Think of autocxx as glue which plugs bindgen into cxx.

For full documentation, see the manual.

Overview

autocxx::include_cpp! {
    #include "url/origin.h"
    generate!("url::Origin")
    safety!(unsafe_ffi)
}

fn main() {
    let o = ffi::url::Origin::CreateFromNormalizedTuple("https",
        "google.com", 443);
    let uri = o.Serialize();
    println!("URI is {}", uri.to_str().unwrap());
}

License and usage notes

This is not an officially supported Google product.

Licensed under either of Apache License, Version 2.0 or MIT license at your option.

autocxx's People

Contributors

aaronrsiphone, adetaylor, badicsalex, benesch, bsilver8192, cad97, chbaker0, dependabot[bot], dtolnay, fzyzcjy, gabriel-viviani, htynkn, kitlith, lukesneeringer, martinboehme, milahu, minaminao, nak3, philipcraig, psmit, russelltg, scullionw, silensangelusnex, silvanshade, ssbr, tako8ki, thomaseizinger, virxec, yshui, yuxuan-xie

autocxx's Issues

We try to generate UniquePtr wrappers for forward declared classes

I think:

class Foo;

class Bar {
   void do_something(const Foo* foo);
};

will cause trouble. Whilst a Foo* is OK with a forward-declared type, a std::unique_ptr<Foo> is not OK, because unique_ptr wants to know if it has a destructor. That's actually legitimately important for our purposes because such a Foo may get dropped within Rust.

demo fail to build

Expected Behavior

build and run demo

Actual Behavior

error: linking with `cc` failed: exit code: 1
|
= note: "cc" "-Wl,--as-needed" "-Wl,-z,noexecstack" "-m64" "-Wl,--eh-frame-hdr" "-L" "/home/user1/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib" "/home/user1/autocxx-example/target/debug/deps/autocxx_example-ad7f19bfc3950f92.1c4ndla2a5ux5v3z.rcgu.o" "/home/user1/autocxx-example/target/debug/deps/autocxx_example-ad7f19bfc3950f92.2h2pdaobccn6zeof.rcgu.o" "/home/user1/autocxx-example/target/debug/deps/autocxx_example-ad7f19bfc3950f92.3g4mwhrnanfzo93l.rcgu.o" "/home/user1/autocxx-example/target/debug/deps/autocxx_example-ad7f19bfc3950f92.3r8mkb28nvaav05s.rcgu.o" "/home/user1/autocxx-example/target/debug/deps/autocxx_example-ad7f19bfc3950f92.4ap5xtkavab8p9x5.rcgu.o" "/home/user1/autocxx-example/target/debug/deps/autocxx_example-ad7f19bfc3950f92.5f07d58ku19x5yd8.rcgu.o" "/home/user1/autocxx-example/target/debug/deps/autocxx_example-ad7f19bfc3950f92.9ji2iqnqcygp0ue.rcgu.o" "/home/user1/autocxx-example/target/debug/deps/autocxx_example-ad7f19bfc3950f92.jv3t67c94wp01u8.rcgu.o" "-o" "/home/user1/autocxx-example/target/debug/deps/autocxx_example-ad7f19bfc3950f92" "/home/user1/autocxx-example/target/debug/deps/autocxx_example-ad7f19bfc3950f92.4og2nil6wslwmztf.rcgu.o" "-Wl,--gc-sections" "-pie" "-Wl,-zrelro" "-Wl,-znow" "-nodefaultlibs" "-L" "/home/user1/autocxx-example/target/debug/deps" "-L" "/home/user1/autocxx-example/target/debug/build/autocxx-example-a1229e1cdb508885/out" "-L" "/home/user1/autocxx-example/target/debug/build/cxx-921c9432ffb813bf/out" "-L" "/home/user1/autocxx-example/target/debug/build/link-cplusplus-c635e2f5a57104e6/out" "-L" "/home/user1/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib" "-Wl,-Bstatic" "-Wl,--whole-archive" "-lautocxx-example" "-Wl,--no-whole-archive" "-Wl,-Bdynamic" "-lstdc++" "-Wl,-Bstatic" "/home/user1/autocxx-example/target/debug/deps/libcxx-2d7be6e04495aa40.rlib" 
"/home/user1/autocxx-example/target/debug/deps/liblink_cplusplus-6f18b49319e43bc3.rlib" "-Wl,--start-group" "/home/user1/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd-cf0f33af3a901778.rlib" "/home/user1/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libpanic_unwind-daf8c2d692e6eca4.rlib" "/home/user1/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libhashbrown-24e8f97647425e48.rlib" "/home/user1/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc_std_workspace_alloc-85ed7d2b484c05a9.rlib" "/home/user1/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libbacktrace-89de2c581262ec09.rlib" "/home/user1/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libbacktrace_sys-3b0db98e62ed7d75.rlib" "/home/user1/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc_demangle-c60847f9a163de82.rlib" "/home/user1/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libunwind-0bb9b63424f4fc5d.rlib" "/home/user1/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcfg_if-3f74d829e37fa40e.rlib" "/home/user1/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/liblibc-0e9d83ff06f1a7ad.rlib" "/home/user1/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/liballoc-2c8c904efaf7c40b.rlib" "/home/user1/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc_std_workspace_core-cbfb51de52131460.rlib" "/home/user1/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcore-97497c26fddb7882.rlib" "-Wl,--end-group" 
"/home/user1/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcompiler_builtins-f1a9d8c443e20b5e.rlib" "-Wl,-Bdynamic" "-lstdc++" "-ldl" "-lrt" "-lpthread" "-lgcc_s" "-lc" "-lm" "-lrt" "-lpthread" "-lutil" "-ldl" "-lutil"
= note: /home/user1/autocxx-example/target/debug/deps/autocxx_example-ad7f19bfc3950f92.jv3t67c94wp01u8.rcgu.o: In function `autocxx_example::ffi::cxxbridge::DoMath':
/home/user1/autocxx-example/src/main.rs:3: undefined reference to `cxxbridge04$DoMath'
collect2: error: ld returned 1 exit status

Steps to Reproduce the Problem

  1. cargo build

Specifications

  • Version:
    autocxx v0.2.0

  • Platform:
    Ubuntu 18.04 LTS

Output .d files

We should figure out a way to output a .d file such that build systems can be educated about the .h files on which we depend, and their transitive dependencies. We should look into whether/what bindgen does here already.
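
As a rough illustration of the output side only, here is a minimal sketch that renders a Makefile-style dependency rule (`target: dep1 dep2`), which is the syntax most build systems accept for .d files. The helper name `render_depfile` and the example paths are hypothetical; how autocxx would collect the header list (from bindgen or otherwise) is left open.

```rust
/// Render a Makefile-style dependency rule: `target: dep1 dep2 ...`.
/// Spaces in paths are escaped with a backslash, as make expects.
/// (Hypothetical helper for illustration; not part of autocxx.)
pub fn render_depfile(target: &str, headers: &[&str]) -> String {
    fn escape(path: &str) -> String {
        path.replace(' ', "\\ ")
    }

    let mut out = escape(target);
    out.push(':');
    for header in headers {
        out.push(' ');
        out.push_str(&escape(header));
    }
    out.push('\n');
    out
}
```

The resulting string could be written next to the generated bindings so that a build system re-runs the generator whenever any listed header changes.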

`inline` functions defined in C++ header can't be found.

Defining a function in a header file with inline causes the following error:

error[E0425]: cannot find function `calculate` in module `ffi::cxxbridge`
 --> $DIR/input.rs:1:137
  |
1 | ... fn main () { assert_eq ! (ffi :: cxxbridge :: calculate () , 42) ; } # [link (name = "autocxx-demo")] extern { }
  |                                                   ^^^^^^^^^ not found in `ffi::cxxbridge`
┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈

Removing inline from the declaration and moving the definition of the function to the cxx file fixes the error, but some C++ libraries define functions in the header using inline as an optimization, so I would expect autocxx to be able to find them.

// integration.rs
let hdr = indoc! {"
    #include <cstdint>
    inline uint32_t calculate() { return 42; }
"};
let cxx = indoc! {"
"};
let rs = quote! {
    assert_eq!(ffi::cxxbridge::calculate(), 42);
};
run_test_with_full_pipeline(cxx, hdr, rs, &[
    "calculate",
], &[]);

LLVM requirement link in README points to print page

If you follow this link in README:

It requires llvm to be installed due to bindgen

It'll lead to a print page (print.html in the url). I think the print.html component should be removed.

I would have filed a PR but I'm not signing any CLA unless Google pays me money.

Wrong offset of CxxString field when preceded by padding

For this input header, autocxx is compiling the following generated Rust code, which puts the s field at offset 4 in S while the C++ string field is actually at offset 8 on my machine.

#pragma once
#include <cstdint>
#include <string>
struct S {
  explicit S(uint32_t i);
  uint32_t i;
  std::string s;
};

mod ffi {
    pub type __uint32_t = ::std::os::raw::c_uint;
    unsafe impl cxx::ExternType for bindgen::S {
        type Id = cxx::type_id!("S");
        type Kind = cxx::kind::Opaque;
    }
    mod bindgen {
        #[repr(C)]
        pub struct S {
            pub i: u32,
            pub s: ::cxx::CxxString,
            pub __bindgen_padding_0: [u64; 3usize],
        }
        impl S {
            pub fn make_unique(i: u32) -> cxx::UniquePtr<Self> {
                super::cxxbridge::S_make_unique(i)
            }
        }
    }
    #[cxx::bridge]
    pub mod cxxbridge {
        impl UniquePtr<S> {}
        extern "C" {
            pub fn S_make_unique(arg0: u32) -> UniquePtr<S>;
            include!("input.h");
            include!("autocxxgen.h");
            type S = super::bindgen::S;
        }
    }
}

Handle pointers

At the moment, we're not handling pointers which may be null. We're just converting any pointers that we discover into references.

We probably need to convert Foo* into Option<&Foo> rather than &Foo. This will probably require cxx support first, and/or wrapper functions.

It's also not clear whether the information output from bindgen even distinguishes references from pointers enough for us to make this distinction. TBD.
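
For reference, the Rust side of such a conversion is small. The following is only a sketch of the idea using the standard library's `as_ref` on raw pointers; the function name `ptr_to_option` is made up for illustration, and it says nothing about how cxx or bindgen would surface the pointer in the first place.

```rust
/// Convert a possibly-null C++-style pointer into an `Option<&T>`.
///
/// # Safety
/// If non-null, `ptr` must point to a valid, properly aligned `T`
/// that lives at least as long as `'a`.
pub unsafe fn ptr_to_option<'a, T>(ptr: *const T) -> Option<&'a T> {
    // `as_ref` yields `None` for a null pointer and `Some(&T)` otherwise.
    unsafe { ptr.as_ref() }
}
```

A wrapper generated around a C++ function returning `Foo*` could apply this at the boundary, so callers see `None` instead of a dangling reference.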

#ifdef support

It's my intention that we can support Rust compile-time feature enablement based on C++ macros.

Where users are using our build.rs support in https://github.com/google/autocxx/tree/main/gen/build this should really be pretty trivial by printing cargo:rustc-cfg lines to standard out as we're generating the C++ bindings.

I haven't thought about the exact syntax but obviously the goal is to allow something like #[cfg(cxx_ifdef = ENABLE_FEATURE)] which can disable a block of code in Rust.

I also haven't thought about #if but that should be achievable in a similar fashion. Probably.

This should be simple for the build.rs case but we need to figure out:

  • How to do this for the non-Cargo case (https://github.com/google/autocxx/tree/main/gen/cmd) in a way that's easy to integrate into third party build systems. We probably need to write all the extra rustc arguments to a file which build systems can use as input to a subsequent rustc command line.
  • How on earth to do this for the integration tests where we just don't launch another rustc instance at all. I don't have a plan there.

I'm also concerned that the sheer number of #define symbols is likely to overwhelm the rustc command line so we may need to be selective.
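
A minimal sketch of the build.rs side, assuming the `cxx_ifdef` key floated above and an explicit allowlist to keep the number of cfg flags down. The function name `cfg_lines` is hypothetical. Note that cfg values must be string literals, so the Rust side would read `#[cfg(cxx_ifdef = "ENABLE_FEATURE")]`.

```rust
/// Build the lines a build script would print so that Rust code can use
/// `#[cfg(cxx_ifdef = "NAME")]`. Only macros named in `allowlist` are
/// emitted, to avoid flooding the rustc command line.
/// (Hypothetical helper; the `cxx_ifdef` key is the one proposed above.)
pub fn cfg_lines(defined: &[&str], allowlist: &[&str]) -> Vec<String> {
    defined
        .iter()
        .filter(|name| allowlist.contains(*name))
        .map(|name| format!("cargo:rustc-cfg=cxx_ifdef=\"{}\"", name))
        .collect()
}
```

In a build script each returned line would simply be passed to `println!`. For the non-Cargo case the same strings, minus the `cargo:rustc-cfg=` prefix, could be written to a file of `--cfg` arguments.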

Namespaces: final work to hierarchize cxxbridge internal mod

After #76 we will still have a bit more namespace work to do. The final output structure should be correct, but within our internal cxxbridge mod we'll still have a flat structure. That means we won't allow input where there are identically-named symbols in different namespaces.

To finally fix that, we need to do the mod parts of dtolnay/cxx#353 and then reflect in autocxx.

Add tests for two classes with methods

We should add an integration test (to integration_tests.rs) which has two structs or classes, each with methods.

If autocxx is behaving right, it will generate the receivers as self: &StructName and cxx will handle it right. If autocxx gets it wrong, they will both be generated as simply self and cxx will barf.

Related comment: dtolnay/cxx#370 (comment)

Nested types don't work

class A {
   class B {
       ...
   };
};

is generated by bindgen as

pub struct A {
   ...
}

pub struct A_B {
   ...
}

This then gives errors when we refer to A_B in generated C++.

Types without constructors do not gain a make_unique

If a struct/class has a constructor, we generate a make_unique associated function. We don't do that for types which have no constructors. We (probably?) should at least for non-POD types, calling the default C++ constructor.

This may be difficult depending on whether we can find out that the type has a private non-default constructor.

Assuming too high alignment of opaque types

The last commit of #25 missed the #[repr(C, packed)] attribute from my link. Without that, autocxx is producing opaque types that are something like:

struct Opaque {
    do_not_attempt_to_allocate_nonpod_types: [*const u8; 0],
}

which has alignment of 8 (or 4).

If the real alignment of the C++ type is smaller and a reference is returned from C++ to Rust, mere existence of an insufficiently aligned reference in Rust causes UB even if never dereferenced by Rust code (see https://doc.rust-lang.org/1.47.0/reference/behavior-considered-undefined.html). Rustc can use least-significant bits of the reference for other storage.
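
The difference is easy to check with `std::mem::align_of`. This sketch defines the marker struct both ways (the struct names are mine, not autocxx's) and reports size and alignment for each:

```rust
use std::mem::{align_of, size_of};

/// Zero-sized marker as described above: it inherits pointer alignment.
#[repr(C)]
pub struct OpaqueAligned {
    _do_not_attempt_to_allocate_nonpod_types: [*const u8; 0],
}

/// The same marker with `packed`: the alignment requirement drops to 1.
#[repr(C, packed)]
pub struct OpaquePacked {
    _do_not_attempt_to_allocate_nonpod_types: [*const u8; 0],
}

/// Returns (size, alignment) for both variants so they can be compared.
pub fn layouts() -> [(usize, usize); 2] {
    [
        (size_of::<OpaqueAligned>(), align_of::<OpaqueAligned>()),
        (size_of::<OpaquePacked>(), align_of::<OpaquePacked>()),
    ]
}
```

Both variants are zero-sized; only the packed one has alignment 1, so a reference to it is valid whatever the real alignment of the C++ type is.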

Figure out a plan for functions that return references but take more than one reference

cxx allows C++ functions which return references, but only if they take exactly one reference parameter (since then the lifetimes are obvious).

As our hope is to handle existing C++ APIs, we will encounter functions which return references and take 0 or 2+ reference parameters.

If they're methods, autocxx may try to generate them even if they're not in an explicit generate list.

So, we either need to silently skip them, or find some way that their behavior can be made unambiguous in a directive within the include_cpp! macro.
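
To show why the single-reference case is special, here is a plain Rust sketch (no cxx involved, all names invented). With one reference parameter, lifetime elision ties the result to it. With two, the signature has to say which input the result borrows from, which is the information a directive would need to supply.

```rust
pub struct Thing(pub u32);

/// One reference in, one reference out: lifetime elision ties the result
/// to the only input. This is the shape cxx accepts.
pub fn first_field(t: &Thing) -> &u32 {
    &t.0
}

/// Two references in: Rust must be told which input the result borrows
/// from. Here the signature declares that it borrows from `a` only.
pub fn pick_first<'a>(a: &'a Thing, _b: &Thing) -> &'a Thing {
    a
}
```

For a C++ function nothing in the header says whether the returned reference points into the first argument, the second, or something static, so any annotation would be a promise made by the user.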

Field access to opaque types

There are some C++ types where autocxx just says "nope, we can't pass it by value" - autocxx calls these non-POD types; cxx calls them Opaque (see #17 for the terminology difference).

For those, autocxx offers no means to access the fields.

Rust cannot and should not attempt to access those fields, because we can't know the correct offsets. It's my intention that we should autogenerate accessor methods. Specifically, here's what I think we need to do.

  1. For any non-POD type, when we discover the fields as output by bindgen, add a new type of AdditionalNeed to add an accessor method - https://github.com/google/autocxx/blob/main/engine/src/additional_cpp_generator.rs#L101. (Possibly we just want to do this for public fields or similar; I haven't checked if that information is exposed by bindgen or if it can be configured using bindgen options).
  2. Generate a getter method with some plausible name within additional_cpp_generator.rs.
  3. This will automatically be picked up during the second invocation of bindgen within autocxx, and therefore we'll get Rust bindings to this method within our ffi::cxxbridge generated code.
  4. Now, make it work for fields which we can't safely pass by value because those fields themselves are non-POD types, e.g. std::string. One option here is to do a third pass of bindgen, which would replace fn thingy_getter(thingy: &Thingy) -> CxxString automatically with fn thingy_getter(thingy: &Thingy) -> UniquePtr<CxxString> and all would be well. But we don't really want to run bindgen three times... it's slow. So it would probably be better to add knowledge of these cases into additional_cpp_generator.rs itself.
  5. Do setters as well.
  6. Add syntactic sugar so it's less obvious that we're calling a method to get or set the field. I'm not sure how best to do this. #37 is in the area, as is #31.
  7. Replicate this getting/setting syntax for POD types, where it will boil down to a direct field access in pure Rust, but with identical syntax.
  8. Out of curiosity, see whether the generated assembly code is actually the same when cross-language LTO is used (see #52).
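
Step 2 might look something like the sketch below: a function that renders the C++ source of a getter. The `<Type>_get_<field>` naming scheme and the choice to return a const reference (which sidesteps the by-value problem from step 4 for the getter case) are assumptions for illustration, not what additional_cpp_generator.rs does.

```rust
/// Produce C++ source for a getter over `field` of `type_name`.
/// Returning `const T&` avoids passing non-POD fields by value.
/// (The `<type>_get_<field>` naming is a hypothetical convention.)
pub fn getter_cpp(type_name: &str, field: &str, field_type: &str) -> String {
    format!(
        "inline const {ft}& {ty}_get_{f}(const {ty}& obj) {{ return obj.{f}; }}",
        ft = field_type,
        ty = type_name,
        f = field,
    )
}
```

For example, `getter_cpp("Thingy", "name", "std::string")` renders a one-line inline function that bindgen could then pick up like any other free function.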

Possible way to support field accesses with offset known only by C++

Follow-up to #19 (comment). Here is a proof of concept (playground).

// Suppose we have no idea what the true size/alignment of std::string is but
// want a Rust struct which behaves like:
//
//     struct S {
//         std::string i;
//         std::string j;
//         uint32_t k;
//     };

use std::fmt::{self, Debug};

#[repr(C)]
pub struct CxxString([u8; 0]);

#[repr(C)]
pub struct S {
    pub i: CxxString,
    _rest: (),
}

impl Debug for S {
    fn fmt(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
        formatter
            .debug_struct("S")
            .field("i", &self.i)
            .field("j", &self.j)
            .field("k", &self.k)
            .finish()
    }
}

#[repr(C)]
pub struct _Has_j {
    pub j: CxxString,
    _rest: (),
}

#[repr(C)]
pub struct _Has_k {
    pub k: u32,
    _rest: (),
}

impl std::ops::Deref for S {
    type Target = _Has_j;
    fn deref(&self) -> &Self::Target {
        unsafe {
            &*(self as *const S)
                .cast::<u8>()
                .offset(foreign::_S_i_to_j())
                .cast::<_Has_j>()
        }
    }
}

impl std::ops::Deref for _Has_j {
    type Target = _Has_k;
    fn deref(&self) -> &Self::Target {
        unsafe {
            &*(self as *const _Has_j)
                .cast::<u8>()
                .offset(foreign::_S_j_to_k())
                .cast::<_Has_k>()
        }
    }
}

impl Debug for CxxString {
    fn fmt(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
        formatter.write_str("\"TODO\"")
    }
}

// Implemented in C++.
mod foreign {
    pub extern "C" fn _S_i_to_j() -> isize {
        // return offsetof(S, j) - offsetof(S, i);
        32
    }
    pub extern "C" fn _S_j_to_k() -> isize {
        // return offsetof(S, k) - offsetof(S, j);
        32
    }
}

pub fn print_k_get_j(s: &S) -> &CxxString {
    println!("{}", s.k);
    return &s.j;
}

Speed improvement: run bindgen only once

At present here's how it all works.

  1. Run bindgen
  2. Examine the generated Rust bindings. See if any additional C++ code is required (beyond that which cxx already generates).
  3. If so, generate more C++ code.
  4. Run bindgen again.
  5. Examine the generated Rust bindings. Convert them to a #[cxx::bridge] mod { ... }
  6. Pass them to cxx for final Rust and C++ code gen.

Step 4 needs to be zapped, as in a realistic project it's going to make things too slow. At step 3, https://github.com/google/autocxx/blob/main/engine/src/additional_cpp_generator.rs needs to generate the C++ code but also generate more instructions for cxx which can be passed directly into cxx at step 5/6.

Figure out better diagnostics for cargo build script case

At the moment, autocxx_engine uses log::info and similar for logging. This is especially valuable to see the bindgen output Rust code, and then the converted Rust code which we pass into cxx.

This emerges successfully when we run the test suite (if we give RUST_LOG=autocxx_engine=info), but there's no (known) way to see that output when we run a simple cargo build, for example in the demo directory.

This will present a larger problem when we start to apply cxx to larger codebases.

It would be nice to see if there's a way we can get the log output to emerge via cargo:warning=MESSAGE (per instructions here), though obviously not by default.
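
A sketch of the formatting half of that idea, under the assumption that the build script decides separately whether verbose output is enabled (for instance from an environment variable; none is defined today). Cargo reads build script output line by line, so a multi-line log message needs the prefix on every line. The function name is hypothetical.

```rust
/// Turn a (possibly multi-line) log message into `cargo:warning=` lines.
/// Every line gets its own prefix because cargo parses output per line.
/// Returns nothing unless `enabled` is true, so the output stays opt-in.
pub fn to_cargo_warnings(message: &str, enabled: bool) -> Vec<String> {
    if !enabled {
        return Vec::new();
    }
    message
        .lines()
        .map(|line| format!("cargo:warning={}", line))
        .collect()
}
```

A custom `log` backend in the build script could route each record through this and print the result.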

Static methods don't work

Expected Behavior

Enable the test_static_func integration test kindly added by @SilensAngelusNex. Should pass.

Actual Behavior

Fails.

Diagnosis

At the moment there's no way we can add an associated method to a type in cxx as far as I know. We probably want to wait to see how dtolnay/cxx#464 pans out, then add support. (See also #88).

Error messages are dreadful

See #33 for an example of where we should make an effort here. I need to figure out the way cxx does this and follow that as best practice.

Consts in namespaces

Pretty sure that a constant in a namespace currently ends up not being in the namespace in the output Rust code.

I'd mark this as "good first issue" but I'm currently doing quite major surgery in this area.

Asynchronous operation

At the moment this project generates all its Rust code in a procedural macro, invoking bindgen (and thus llvm) during Rust compilation to parse C++ headers and convert to a Rust token stream.

Whilst it's rather awesome that it works, this has the following downsides:

  • We parse the C++ headers at least twice (once during Rust compilation and once during generating C++ bindings)
  • We need to do the work all over again for each time the Rust is compiled
  • Some say it's not the way procedural macros are supposed to behave

Since we have to run a tool to generate the C++ side of the bindings anyway, I intend to switch to a mode where:

  • That tool generates both Rust and C++ side of the bindings
  • On compilation, the include_cpp! macro (or future equivalent per #42) simply expands to an include! macro invocation which pulls in the generated code.

This should be straightforward, although the filename of the generated code should probably depend on the full parameters passed to the include_cpp! macro, to lower the risk of pulling in outdated generated code. That means this may still need to be a small procedural macro.
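
One way to derive such a filename is to hash the macro's parameters. The sketch below uses the standard library's `DefaultHasher` purely for brevity; its output is not guaranteed to be stable across Rust releases, so a real implementation would likely want a fixed hash algorithm. The `autocxx-gen-` prefix and the function name are made up.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Derive a file name for generated bindings from the macro's parameters,
/// so that changing the parameters changes which file gets included.
/// (Hypothetical naming scheme; `DefaultHasher` is a stand-in.)
pub fn generated_file_name(macro_params: &str) -> String {
    let mut hasher = DefaultHasher::new();
    macro_params.hash(&mut hasher);
    format!("autocxx-gen-{:016x}.rs", hasher.finish())
}
```

Both the generator tool and the macro would compute the same name from the same token text, so a stale file from an earlier set of parameters is simply never referenced.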

Struct with non-POD fields can be improperly constructed from Rust

// src/input.h

#pragma once
#include <string>
struct T {
  std::string s;
};
void f(const T& t);

// src/input.cc

#include "input.h"
#include <iostream>

void f(const T& t) {
  std::cout << t.s << std::endl;
}

// src/main.rs

use autocxx::include_cxx;

include_cxx!(Header("input.h"), Allow("T"), Allow("f"));

fn main() {
    ffi::cxxbridge::f(&ffi::cxxbridge::T {});
}

$ cargo run
Segmentation fault (core dumped)

Typedefs

Need to support typedefs in C++ code. These are correctly output by bindgen but not absorbed by the byvalue_checker so result in errors.

Method calls passing non-POD parameters have icky syntax

Expected Behavior

struct Bob {
public:
    Anna get_anna() const; // Anna is a non-POD type containing a std::string
};

let b = ffi::cxxbridge::Bob { a: 12, b: 13 };
let a = b.get_anna(); // returns non-POD type by value, converted to UniquePtr<Anna>

Actual Behavior

let b = ffi::cxxbridge::Bob { a: 12, b: 13 };
let a = ffi::cxxbridge::get_anna(&b); // syntax currently necessary

Notes

This is a known limitation of the work in #28. We are generating extra wrapper functions for get_anna to do the UniquePtr conversion (get_anna_up_wrapper), and we currently have no means to attach them as methods to Bob.

POD structure constants

We should support C++ string constants. We already support int constants.

Later today I will commit an #[ignore]d test for this, integration_tests::test_pod_constant.

bindgen does output the constant, like this:

        extern "C" {
            #[link_name = "\u{1}__ZL3BOB"]
            pub static BOB: root::Bob;
        }

so it should just be a matter of educating bridge_converter.rs that this is OK, and adding a use in the final output ffi mod.

Information discarded by bindgen which we need (tracker bug)

bindgen doesn't pass on quite all the information we need in order to be able to generate autocxx bindings. Options in the future might be to fork bindgen, ask very nicely if they mind us upstreaming patches to add metadata, or switch away from bindgen and use llvm directly.

In any case this bug will be a live tracker of all known cases where we don't have all the information that we require about the underlying C++.

  • Classes vs structs. Both present as structs in the bindgen output. When we then generate extra C++ we have to plump for one or the other. In some ABIs this will result in a binary incompatibility or failure to compile. Per #54.
  • Overloaded methods vs methods ending in digits. Two methods a(uint32_t) and a(uint8_t) will be output as a and a1. There is no way for us to distinguish that from methods really called a and a1. Worse, this is across all the types in a namespace, so MyStruct::a and MyOtherStruct::a become MyStruct::a and MyOtherStruct::a1.
  • Nested types. A struct nested inside another struct is not noted in the bindgen output, so we can't generate compatible C++ code - #115.
  • Private constructors. Per #122.
  • Pointers vs references. Per #102.
  • Virtual functions. Per #195 and #305.
  • When template params are unused, per #414 and #416.
  • Deleted functions, per #426.

Non-POD structure constants

A follow up from #93, much harder.

We need to figure out how to expose C++ constants which are types that cannot be represented in Rust by value.

We probably need to generate some C++ code which will return a UniquePtr to a copy of the constant. This assumes that the constant can be copied, and obviously has performance implications.

I'm adding an #[ignore]d test - test_non_pod_constant.

Duplicated repr attributes

cargo expand shows that the structs in the bindgen sub-mod now have two repr attributes.

        #[repr(C)]
        #[repr(C, packed)]

Not causing any problems, but should fix.

Crate fails to build on Windows

Unfortunately, autocxx fails to build on Windows, due to compile errors in the dependency osstrtools.

Steps to reproduce:

cargo new autocxx-example
cd autocxx-example
cargo add autocxx
cargo build

Errors:

   ...
   Compiling humantime v1.3.0
   Compiling indoc v1.0.3
   Compiling osstrtools v0.2.2
error[E0432]: unresolved import `os_str_bytes`
   --> ~\.cargo\registry\src\github.com-1ecc6299db9ec823\osstrtools-0.2.2\src\lib.rs:679:13
    |
679 |         use os_str_bytes::OsStringBytes;
    |             ^^^^^^^^^^^^ use of undeclared type or module `os_str_bytes`

error[E0432]: unresolved import `os_str_bytes`
   --> ~\.cargo\registry\src\github.com-1ecc6299db9ec823\osstrtools-0.2.2\src\lib.rs:686:13
    |
686 |         use os_str_bytes::OsStringBytes;
    |             ^^^^^^^^^^^^ use of undeclared type or module `os_str_bytes`

error[E0432]: unresolved import `os_str_bytes`
   --> ~\.cargo\registry\src\github.com-1ecc6299db9ec823\osstrtools-0.2.2\src\lib.rs:742:13
    |
742 |         use os_str_bytes::OsStrBytes;
    |             ^^^^^^^^^^^^ use of undeclared type or module `os_str_bytes`

error[E0405]: cannot find trait `WinOsStr` in this scope
   --> ~\.cargo\registry\src\github.com-1ecc6299db9ec823\osstrtools-0.2.2\src\lib.rs:740:6
    |
740 | impl WinOsStr for OsStr {
    |      ^^^^^^^^ not found in this scope

error[E0425]: cannot find value `bytes` in this scope
   --> ~\.cargo\registry\src\github.com-1ecc6299db9ec823\osstrtools-0.2.2\src\lib.rs:754:19
    |
754 |         unsafe { (bytes as *const _).cast:() }
    |                   ^^^^^                  - help: maybe you meant to write a path separator here: `::`
    |                   |
    |                   not found in this scope

error: aborting due to 5 previous errors

I reported the issue at the osstrtools's repo, however the crate seems a bit abandoned.

At a first look, I could only find split() from that crate being used:

let inc_dirs = inc_dirs.split(&splitter[0..1]);

Is it maybe an option to get that functionality without that crate, or use an alternative?
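
If the splitter is the platform's path-list separator (an assumption on my part, based on the variable names), the standard library already covers this through `std::env::split_paths`, which uses `:` on Unix and `;` on Windows. A minimal sketch, with a hypothetical function name:

```rust
use std::env;
use std::ffi::OsStr;
use std::path::PathBuf;

/// Split a platform-style path list (`:` on Unix, `;` on Windows) into
/// individual include directories using only the standard library.
pub fn split_include_dirs(inc_dirs: &OsStr) -> Vec<PathBuf> {
    env::split_paths(inc_dirs).collect()
}
```

This works on `OsStr` directly, so no byte-level conversion crate is needed.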

Nicer syntax for `include_cpp`

The current syntax:

include_cpp!(Header("bob.h"), Allow("get_foo"), Allow("Bar"))

has four problems:

  • It's ugly.
  • It doesn't allow the name of the ffi mod to be specified.
  • It doesn't allow specification of whether the ffi mod is pub, etc.
  • There's no good way of attaching documentation to directives like Allow, etc.

I am thinking of switching to a syntax a bit more like cxx:

#[autocxx::bridge]
pub mod ffi {
   include!("bob.h") // or maybe include_cpp!
   allow!("get_foo")
   allow!("Bar")
}

Hopefully I can persuade rustdoc to treat all the inner macros as document-able things. If not, they may become functions (which do nothing when called as actual functions, but can have docs attached). The problem there is that a single crate can either export a macro or functions. We'll see.

I should do this at the same time as #37 to have one major breaking change, though I don't think anyone's relying on the existing syntax yet.

The new syntax isn't perfect, since I wanted the include_cpp! line to act as much as possible like #include in C++. So thinking still in progress.

make_unique should be an associated function

Right now, for a type called (e.g.) Bob, we generate a function called fn Bob_make_unique() -> UniquePtr<Bob>. It would be nicer if we can make that an associated function on the Bob type. This probably requires dtolnay/cxx#464 (or, of course, we do dtolnay/cxx#280 to add constructor support directly into cxx and remove it from autocxx).

Experiment with generating all

At the moment, it's mandatory to provide one or more allow directives. They build the allowlist which we feed to bindgen. However, if no allowlist is fed to bindgen it attempts to generate bindings for everything which it encounters in the header files.

It might be worth adding an allow_all directive which bypasses the creation of the NoAllowlist error but has no other effect: this should be sufficient to ask bindgen to generate bindings for everything.
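
A sketch of how the three states could be modelled. The type and function names here are hypothetical and only `NoAllowlist` comes from the text above. An empty list stands for "pass no allowlist to bindgen".

```rust
/// What the user asked to generate. (Hypothetical model of the directives.)
#[derive(Debug, PartialEq)]
pub enum Allowlist {
    /// No directive seen at all.
    Unspecified,
    /// `allow_all`: pass no allowlist to bindgen.
    All,
    /// One or more `allow` directives.
    Specific(Vec<String>),
}

#[derive(Debug, PartialEq)]
pub enum ConfigError {
    NoAllowlist,
}

/// Returns the item names to hand to bindgen. An empty list means
/// "no allowlist", i.e. let bindgen generate everything it finds.
pub fn bindgen_allowlist(config: &Allowlist) -> Result<Vec<String>, ConfigError> {
    match config {
        Allowlist::Unspecified => Err(ConfigError::NoAllowlist),
        Allowlist::All => Ok(Vec::new()),
        Allowlist::Specific(items) => Ok(items.clone()),
    }
}
```

Keeping `Unspecified` distinct from `All` preserves the existing error for users who simply forgot to list anything.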

In practice, with realistically complex headers (i.e. using STL), I expect problems correctly interpreting some of the items in the headers. But that's probably a good reason to try.

Namespaces

A big intentional omission from the autocxx work so far has been C++ namespaces.

I'm now tackling this.

Static functions don't work

#[test]
fn test_static_function() {
    let cxx = indoc! {"
        Bob Bob::create() { Bob a; return a; }
    "};
    let hdr = indoc! {"
        #include <cstdint>
        struct Bob {
            uint32_t a;
            static Bob create();
        };
    "};
    let rs = quote! {
        ffi::Bob::create();
    };
    run_test(cxx, hdr, rs, &[], &["Bob"]);
}

We are not turning create into an associated function as we should.

Alias all functions and symbols to a single namespace

Right now, to use autocxx-generated bindings from Rust, you need to refer to:

  • ffi::cxxbridge::W for struct W
  • ffi::cxxbridge::X for struct X
  • ffi::Y for constant Y
  • ffi::defs::Z for preprocessor symbol Z

(I think).

This all needs tidying up; at least the first two should be aliased so you can refer to them using just ffi::W and ffi::X.
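
In plain Rust the aliasing is just a set of `pub use` re-exports at the top of the generated mod. The sketch below uses stand-in unit structs and constants with the same names as the list above; the values are arbitrary.

```rust
pub mod ffi {
    pub mod cxxbridge {
        #[derive(Debug, Default, PartialEq)]
        pub struct W;
        #[derive(Debug, Default, PartialEq)]
        pub struct X;
    }
    pub mod defs {
        pub const Z: u32 = 3;
    }
    pub const Y: u32 = 2;

    // Re-export everything at the top level so callers can write
    // `ffi::W`, `ffi::X` and `ffi::Z` without knowing the inner layout.
    pub use self::cxxbridge::{W, X};
    pub use self::defs::Z;
}
```

The inner paths keep working, so existing code that spells out `ffi::cxxbridge::W` would not break.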

It might be that ffi also needs to be made changeable in case there are multiple include_cpp macros in a file.

Classes result in build warnings; build failure on Windows

A struct with methods gets processed nicely by bindgen, autocxx and cxx.

A class with methods results in compile warnings because:

  • bindgen outputs a Rust struct
  • autocxx passes struct-like information to cxx
  • cxx generates C++-side bindings which include a forward declaration for this class... but calls it a struct. Therefore the compiler whines. Apparently on Windows this would actually be a bug due to ABI mismatch between structs and classes. (In the title of the issue I call this a 'build failure' but I'm not sure).

Currently the information output from bindgen doesn't tell us whether something is a class or a struct. We will need a small bindgen change to add a bit more information to the outputted code. I have a hacky branch for this; if anyone wants to clean it up let me know and I'll upload the hacks somewhere.

Dependent qualified types don't work

Given this code:

/**
* <div rustbindgen="true" replaces="std::string">
*/
class FakeString {
    char* ptr;
};

#include <string>
#include <cstdint>

template <typename STRING_TYPE>
class BasicStringPiece;

typedef BasicStringPiece<std::string> StringPiece;

template <typename STRING_TYPE> class BasicStringPiece {
public:
    typedef size_t size_type;
    typedef typename STRING_TYPE::value_type value_type;
    const value_type* ptr_;
    size_type length_;
};

struct Origin {
    // void SetHost(StringPiece host);
    StringPiece host;
};

bindgen emits this:

pub type StringPiece = BasicStringPiece;
#[repr(C)]
#[derive(Debug, Copy, Clone)]
pub struct BasicStringPiece {
    pub ptr_: *const BasicStringPiece_value_type,
    pub length_: BasicStringPiece_size_type,
}
pub type BasicStringPiece_size_type = size_t;
pub type BasicStringPiece_value_type = [u8; 0usize];
#[repr(C)]
pub struct Origin {
    pub host: StringPiece,
}

which is fine from a Rust point of view, but no good for us because the SetHost function would get C++ bindings generated as std::unique_ptr<BasicStringPiece> instead of std::unique_ptr<StringPiece> or std::unique_ptr<BasicStringPiece<std::string>>.

Mismatch in size of std::string

On my machine this input header results in the following generated Rust struct, which looks reasonable, except the Rust side and C++ side disagree about the size of std::string.

#pragma once
#include <cstdint>
#include <string>
struct S {
  explicit S(uint32_t i);
  std::string s;
  uint32_t i;
};

// generated by autocxx:

#[repr(C)]
pub struct S {
    pub s: ::cxx::CxxString,
    pub __bindgen_padding_0: [u32; 6usize],
    pub i: u32,
}

I'll share a PR to help reproduce. Possibly bindgen run by autocxx is finding a different set of system headers than the C++ build?

Here is the debug printing I used to confirm the issue:

S::S(uint32_t i) : i(i) {
  std::cout << "C++:" << (size_t(&this->i) - size_t(this)) << std::endl;
}

let s = ffi::cxxbridge::S_make_unique(1);
println!("Rust:{}", (&s.i as *const _ as usize) - (&*s as *const _ as usize));
println!("{}", s.i);

Expected output is that C++ and Rust print the same offset for i and that the third line is 1, but the actual output is:

C++:32
Rust:24
0
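
The two offsets can be reproduced in isolation with plain #[repr(C)] structs. This is only a sketch under stated assumptions: RustSideS mirrors the generated struct with a zero-sized array standing in for the opaque s field, and CppSideS assumes a 32-byte std::string, which matches the offset the C++ side printed. Both struct names and the helper functions are hypothetical.

```rust
// Mirrors the generated struct; assumption: the opaque string field is
// zero-sized, so the 24 bytes of padding alone determine where i lives.
#[repr(C)]
pub struct RustSideS {
    pub s: [u8; 0],
    pub padding: [u32; 6],
    pub i: u32,
}

// Assumption: the C++ build sees a 32-byte std::string.
#[repr(C)]
pub struct CppSideS {
    pub s: [u64; 4],
    pub i: u32,
}

// Same pointer arithmetic as the debug printing above: address of the
// field minus address of the struct.
pub fn rust_side_offset() -> usize {
    let v = RustSideS { s: [], padding: [0; 6], i: 1 };
    (&v.i as *const u32 as usize) - (&v as *const RustSideS as usize)
}

pub fn cpp_side_offset() -> usize {
    let v = CppSideS { s: [0; 4], i: 1 };
    (&v.i as *const u32 as usize) - (&v as *const CppSideS as usize)
}
```

Under these assumptions the helpers return 24 and 32 respectively, the same pair of numbers as in the output above. Reading i at offset 24 of an object laid out with i at offset 32 would land inside the string's storage, which is consistent with the unexpected 0.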
