GithubHelp home page GithubHelp logo

Comments (48)

sdroege avatar sdroege commented on June 13, 2024 2

Related rust-lang/rust-memory-model#41

from gir.

EPashkin avatar EPashkin commented on June 13, 2024

Note: there was long discussion about enums in gtk-rs/gdk#154

from gir.

EPashkin avatar EPashkin commented on June 13, 2024

Problem:
While enum copy with FromGlib implementation was intended to fix this issue,
current implementation
actually don't fix it as wrong ffi-side enum values actually count as one of cases.
Also copying enum introduced enums with different layout from FFI-side enum and even different integer values for same name.

from gir.

EPashkin avatar EPashkin commented on June 13, 2024

Solution 1.
Just rexport enums from FFI-side.
Proposed by @johncf. He even added PR for it #354.

from gir.

EPashkin avatar EPashkin commented on June 13, 2024

Solution 2.
Stay with copy but change implementation to match integers. This way default branch with panic can be added.
Update: @sdroege proposed also add UNKNOWN case to all ffi enums too.

from gir.

EPashkin avatar EPashkin commented on June 13, 2024

Solution 3.
gtk-rs/gdk#154 (comment)
Proposed by @GuillaumeGomez. I don't understand it 😭

from gir.

GuillaumeGomez avatar GuillaumeGomez commented on June 13, 2024

🤣

from gir.

EPashkin avatar EPashkin commented on June 13, 2024

@GuillaumeGomez, @sdroege as #354 appeared again, we need decide what to do.

from gir.

GuillaumeGomez avatar GuillaumeGomez commented on June 13, 2024

So to put it simply: we have the choice between safety and performance. So I propose we add both (the performance will be _unchecked).

from gir.

sdroege avatar sdroege commented on June 13, 2024

What exactly is the proposal to do here, the API to be added?

from gir.

johncf avatar johncf commented on June 13, 2024

So to put it simply: we have the choice between safety and performance.

If I remember correctly, the status quo right now is no more safer than what remains after the changes in #354.

from gir.

GuillaumeGomez avatar GuillaumeGomez commented on June 13, 2024

@johncf: I agree on this, that's why I propose to add both. The "safe" one will use match and the "performant" one will just cast.

from gir.

johncf avatar johncf commented on June 13, 2024

@GuillaumeGomez That change can come later after we land these changes.

Basically, this was what you had proposed to do while converting from integers to enums. That would be a pervasive change too with a more profound effect overall.

from gir.

johncf avatar johncf commented on June 13, 2024

@GuillaumeGomez Actually it can be done a bit more efficiently like this:

#[repr(C)]
pub enum Test {
   A = 0,
   B = 1,
   C = 2,
   D = 3,
}

fn toTest(v: u32) -> Test {
    match v {
        0..4 => unsafe { std::mem::transmute(v) },
        _ => panic!("Bad!"),
    }
}

Then again, I should remind you that what this would prevent are only rare upstream bugs. Rust users will never need to use this function (even indirectly).

from gir.

sdroege avatar sdroege commented on June 13, 2024

The 0..4 match assumes that enum values are incremented by 1, which is not necessarily the case. You could have an enum that has all fibonacci numbers as values ;) And the code generator would have to check the enum values to detect cases where matches can be compressed (also I would assume that the compiler could do exactly that for us).

from gir.

GuillaumeGomez avatar GuillaumeGomez commented on June 13, 2024

@johncf: Wo, you just solved both issues! However it's not for sure that every values follow in the enums unfortunately. :-/

from gir.

johncf avatar johncf commented on June 13, 2024

@sdroege @GuillaumeGomez Yes, true. But most of the time, they are... But that was not the point I was trying to make!! 😭 It's that plain u32 will not come from the Rust side. So whether that change is worth the cost itself should be considered carefully.

from gir.

johncf avatar johncf commented on June 13, 2024

Anyways, as far the current changes are concerned, if everyone's on-board, I'll proceed to create a Gtk PR (which will also be huge ~4000 lines reduction).

from gir.

GuillaumeGomez avatar GuillaumeGomez commented on June 13, 2024

Fine by me.

from gir.

sdroege avatar sdroege commented on June 13, 2024

I think it's all fine as-is, let's worry about that another time. Also for bitflags you probably want to use the from_bits_truncate then for the same reason.

In the case of enums (and flags), this can btw happen without a bug anywhere if the bindings were generated for an older version of the C library, and a new enum value or flag was added.

from gir.

johncf avatar johncf commented on June 13, 2024

Alright, then I'll PR Gtk and Pango repositories soon.

from gir.

sdroege avatar sdroege commented on June 13, 2024

from_bits_truncate seems to be used for flags already, so that point is solved. For enums I think we should really check them for validity instead of transmuting (as that will cause undefined behaviour), and it can happen without any bugs anywhere (API additions in the C library, older Rust library).

Panicking seems a bit extreme though, maybe we can match all other values to the __Nonexhaustive variant... and possibly print a warning. Any newly added enum value to the C library must be safe to be ignored, otherwise the C library changed API (i.e. is not backwards compatible anymore to the older version) and must've been reflected in the C library name already (at least!): libglib-2.0.so.0 would become libglib-2.0.so.1, or even better because this is an API change and not just an ABI change libglib-3.0.so.0.

from gir.

sdroege avatar sdroege commented on June 13, 2024

Update: @sdroege proposed also add UNKNOWN case to all ffi enums too.

@EPashkin We can't add an unknown case to the FFI enums, as it might conflict with the C enum at a later time (if you use a plain value), or is not possible with repr(C) (if you do some kind of Unknown(u32) or a ZST).

from gir.

EPashkin avatar EPashkin commented on June 13, 2024

@sdroege Yes. This why we redefine enums without repr(C) and added __Nonexhaustive(()) too prevent int conversion. So library user mostly don't see ffi enums.

from gir.

sdroege avatar sdroege commented on June 13, 2024

@sdroege Yes. This why we redefine enums without repr(C) and added __Nonexhaustive(()) too prevent int conversion. So library user mostly don't see ffi enums.

Yes that's all good :) The problem is that on the FFI side there might be values returned that are not part of the FFI enum. But as those are repr(C) and with integers, maybe it's ok if those are transmuted to an integer later, and then passed through a conversion trait for getting the non-FFI enum.

from gir.

sdroege avatar sdroege commented on June 13, 2024

So it seems to do anything safely here we could do:

  1. need some changes to how repr(C) enums work in Rust (so it's safe to pass out-of-enum values around)

  2. still duplicate the enums (sys and non-sys), so that the latter can be non-repr(C) and thus contain __Nonexhaustive (ZST can't be in repr(C)), or even better something to carry "unknown" enum values around (__Unknown(u32))

  3. have conversion traits implemented that do the mapping between the repr(C) enum and the proper Rust enum, handling unknown values in a meaningful way without panicking

  4. and 3) can be done on our side already, 1) is already broken in the current situation but likely does not cause any immediate problems as long as we just pass around the FFI enums, and convert them to some kind of integer for matching/conversion to the non-repr(C) enum

Comments?

Edit: Instead of not doing anything about 1), we could also use plain integers in the sys crates for functions, etc and require explicit conversion there already. But that seems suboptimal API-wise.

from gir.

johncf avatar johncf commented on June 13, 2024

@sdroege Sounds okay...

I understand that __Nonexhaustive may be required for some enums, but is it really required for all of them? Perhaps I'm just being naive.

And regarding __Unknown(u32), why be so forgiving as to work with broken code? Why not just panic (or an ffi-safe abort mechanism)?

from gir.

sdroege avatar sdroege commented on June 13, 2024

I understand that __Nonexhaustive may be required for some enums, but is it really required for all of them? Perhaps I'm just being naive.

A configuration could be added to that, but generally the safe thing to do is to assume that the C library can add new values without breaking compatibility. Also this is how things are currently done already.

And regarding __Unknown(u32), why be so forgiving as to work with broken code? Why not just panic (or an ffi-safe abort mechanism)?

It's not necessarily broken code. There are completely valid scenarios when the C library starts returning values that are unknown at the Rust side, like when a new version of the C library adds a new value but the Rust side was not updated yet (and does not have to, the C library could do this change in a backwards compatible way).

from gir.

EPashkin avatar EPashkin commented on June 13, 2024

@sdroege, 1) IMHO if we really need _Unknown(u32) better make FromGlib Result-like or add separate trait for it. Currently I don't see reason to do this.
2) Currently I prefer panic over returning __Nonexhaustive for bad values, as nobody will remember to check this value and it not rustish way, but agree that we can use __Nonexhaustive if panic don't acceptable.

from gir.

sdroege avatar sdroege commented on June 13, 2024

IMHO if we really need _Unknown(u32) better make FromGlib Result-like or add separate trait for it. Currently I don't see reason to do this.

It's not really needed, no, but then you would need to map to some other value (__Nonexhaustive?) or panic

Currently I prefer panic over returning __Nonexhaustive for bad values, as nobody will remember to check this value and it not rustish way, but agree that we can use __Nonexhaustive if panic don't acceptable.

Panic is a problem because it makes backwards compatible upgrades of the C library a breaking upgrade at the Rust level. And while it's true that nobody will check __Nonexhaustive directly, by just having it there you must have a _ case in your match statements and somehow handle "unknown" cases. (Which is how things are nowadays already).

Using __Nonexhaustive for unknown values is mostly so that no panic is needed. You need to use some value, and either some new one (__Unknown, __Unknown(u32)) has to be added or we just reuse/abuse that one.

from gir.

sdroege avatar sdroege commented on June 13, 2024

better make FromGlib Result-like or add separate trait for it

Also the existing FromGlib seems sufficient for that, you don't need (and don't want) to return a Result<,> from there.

In the end this is all no different from e.g. std::io::ErrorKind. New values can be added at any time, and your code needs to handle that in one way or another.

from gir.

johncf avatar johncf commented on June 13, 2024

I think it would be good if we could find any examples of such a "backwards-compatible" change occurring in the past. If no such examples could be found, it might be good to get the perspective from a core Gtk developer about their stance on the subject, before adding __Unknown(u32) variant.

And, ideally, we should not add __Nonexhaustive to enums that are known to not change in the future in a "backwards-compatible" way. But if that information is hard to gather, and there are instances of this happening, then this cannot be avoided.

from gir.

EPashkin avatar EPashkin commented on June 13, 2024

So alternative of direct reexport is this? (I use values instead ffi as converting in match cases don't works).

#[doc(hidden)]
impl FromGlib<ffi::GtkButtonRole> for ButtonRole {
    fn from_glib(value: ffi::GtkButtonRole) -> Self {
        skip_assert_initialized!();
        match value as i32 {
            0 => ButtonRole::Normal,
            1 => ButtonRole::Check,
            2 => ButtonRole::Radio,
            _ => ButtonRole::__Nonexhaustive(()),
        }
    }
}

from gir.

sdroege avatar sdroege commented on June 13, 2024

I think it would be good if we could find any examples of such a "backwards-compatible" change occurring in the past. If no such examples could be found, it might be good to get the perspective from a core Gtk developer about their stance on the subject, before adding __Unknown(u32) variant.

(Not core Gtk developer here, but GStreamer core developer and 10+ years GLib/GIO/GObject contributor)
It happens quite often (also see again my comment about std::io::ErrorKind, it also happens in Rust). One case I remember right now is https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gst-plugins-base-libs/html/gst-plugins-base-libs-GstVideoAlignment.html#GstVideoFormat (see the Since: 1.x markers at the bottom). I'm sure if you look close enough, you'll also find cases in GIO and GTK.

And, ideally, we should not add __Nonexhaustive to enums that are known to not change in the future in a "backwards-compatible" way. But if that information is hard to gather, and there are instances of this happening, then this cannot be avoided.

The problem is that you can't detect these cases automatically, so you have to assume by default that enums can be extended (as that is the more general case). Configuration could be added to override that on a per-enum basis. Note also that nowadays we add __Nonexhaustive to every enum on the Rust side, so keeping it that way wouldn't be a change.

So alternative of direct reexport is this? (I use values instead ffi as converting in match cases don't works).

The direct re-export gives you undefined behaviour. The way how it is done now should be safe at least (ignoring for now that repr(C) enums currently do not allow arbitrary values) and panics. What you suggest would be an improvement on that (note that you can just do __Nonexhaustive without the () unit/empty tuple). For debugging purposes (at least) I would add a __Unknown(u32) so that this information is not completely lost on translation.

from gir.

EPashkin avatar EPashkin commented on June 13, 2024
  1. __Nonexhaustive without () allow cast to int that we want disallow (added in #251)
  2. Adding u32 or i32 to enum will increased it size, I prefer don't do it, but don't completely disagree

from gir.

sdroege avatar sdroege commented on June 13, 2024

__Nonexhaustive without () allow cast to int that we want disallow (added in #251)

Ah! Good to know :)

Adding u32 or i32 to enum will increased it size, I prefer don't do it, but don't completely disagree

Maybe behind some kind of "debug" feature?

from gir.

johncf avatar johncf commented on June 13, 2024

@sdroege Thanks for the example. I only now found out that all Gtk 3 versions ship shared objects with their major version set to 0 (e.g. libgtk-3.so.0.2200.15 for 3.22). Sorry I was unaware of this; I always assumed that the major version would be 22 in that case, which we can specify to the linker (which, when I thought about it, is not a very good idea). Since all Gtk3 versions are considered to have "fully backwards-compatible API," I realized that this is indeed unavoidable. (I guess things cannot always be as clean as you want it to be...)

from gir.

johncf avatar johncf commented on June 13, 2024

I was thinking about the implementation part of this, and it seems like it might be easier to define the sys enum types as u32 and the variants as simple u32 constants. Otherwise the generated conversion code in the safe side would have to use raw u32 constants in the match block (which matches a sys enum as u32-ed? This should probably be done only after Rust defines a behavior for invalid transmuted repr(C) enum values).

from gir.

johncf avatar johncf commented on June 13, 2024

A cleaner solution might be: rust-lang/rfcs#2008

But it might be a long wait.

from gir.

EPashkin avatar EPashkin commented on June 13, 2024

Part of enums has negative values (ex. GtkTextBufferTargetInfo).
All enums in gtk-rs is convertible to i32.
So instead u32 we better use i32 and maybe add check for size it we found that it wrong.

from gir.

EPashkin avatar EPashkin commented on June 13, 2024

@johncf You agreed that the re-export would not go?

@sdroege So fixed version of enum, will look like this?

#[derive(Clone, Copy, Debug, Eq, PartialEq, Hash)]
pub enum ButtonRole {
    Normal,
    Check,
    Radio,
    #[doc(hidden)]
    __Unknown(i32),
}

#[doc(hidden)]
impl ToGlib for ButtonRole {
    type GlibType = ffi::GtkButtonRole;

    fn to_glib(&self) -> ffi::GtkButtonRole {
        match *self {
            ButtonRole::Normal => ffi::GTK_BUTTON_ROLE_NORMAL,
            ButtonRole::Check => ffi::GTK_BUTTON_ROLE_CHECK,
            ButtonRole::Radio => ffi::GTK_BUTTON_ROLE_RADIO,
            ButtonRole::__Unknown(value) => unsafe{std::mem::transmute(value)},
        }
    }
}

#[doc(hidden)]
impl FromGlib<ffi::GtkButtonRole> for ButtonRole {
    fn from_glib(value: ffi::GtkButtonRole) -> Self {
        skip_assert_initialized!();
        match value as i32 {
            0 => ButtonRole::Normal,
            1 => ButtonRole::Check,
            2 => ButtonRole::Radio,
            value => ButtonRole::__Unknown(value),
        }
    }
}

from gir.

EPashkin avatar EPashkin commented on June 13, 2024

Note: size of enum changed from 1 to 8. But maybe in GUI application it really don't matter.

from gir.

sdroege avatar sdroege commented on June 13, 2024

@EPashkin That would be one option, yes. I think ideally we would not use enums in the sys crates, but just an integer of the appropriate size plus constants (see discussion in the nikomatsakis/rust-memory-model repository).

I'm not sure if i32 is always correct though, the rules for the underlying integer type of C enums is not trivial IIRC. And yours would fail for values > 2**31

from gir.

johncf avatar johncf commented on June 13, 2024

@johncf You agreed that the re-export would not go?

That's right... Unless rfc#2008 is implemented in a way that helps with this situation.

from gir.

EPashkin avatar EPashkin commented on June 13, 2024

@sdroege Problem with integers that it will be allow to group in function calls. So #[repr(C)]enum will need anyway. We can change definition GTK_BUTTON_ROLE_NORMAL to be integer, but then it can't be used as function parameter.
We can add check on generating enum in normal mode to check it range,
but IMHO enums always used as gint (https://github.com/GNOME/glib/blob/master/gobject/gparamspecs.c#L2046)

from gir.

sdroege avatar sdroege commented on June 13, 2024

Enums in C are unfortunately not always "int", quoting the standard:

Each enumerated type shall be compatible with char, a signed integer type, or an unsigned integer type. The choice of type is implementation-defined,108) but shall be capable of representing the values of all the members of the enumeration.
[...]
An implementation may delay the choice of which integer type until all enumeration constants have been seen.

In practice, enums are "int" or "unsigned int" usually, depending on whether they contain signed values or values >= 2**31. Also I believe GI writes down enum values as signed integers, and bitfields (aka flags) as unsigned integers (but please check that before depending on it).

from gir.

EPashkin avatar EPashkin commented on June 13, 2024

Note: this maybe need reworking when rust-lang/rust-memory-model#41 fixed on rust side.

from gir.

sdroege avatar sdroege commented on June 13, 2024

Also note that this is still not 100% correct, see discussion in there. It should be fine in practice though.

from gir.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.