Design for Rust object by-value passing to C++ and back #1185
Comments
Any news on this? This seems promising.
Sorry, still waiting on @dtolnay to review the two PRs I posted regarding C++ exception handling (#1180) and type aliases (#1180). I cannot move forward on this until at least C++ exception handling is merged, since our implementation has dependencies and thus I can't publish further PRs for review. Also, I'm not willing to put any more work into this crate upfront until I get feedback on the direction, both on exception handling on the one side and on this design on the other side. Similarly, I won't publish a "fork" of this crate, since I don't want to create divergence - I strongly prefer we put everything together upstream. @dtolnay If you are unable or unwilling to review these changes and designs, please designate someone else who is able and willing to do so, so we can move forward here. I have several colleagues who are willing to review (and in fact, already did the review on our private fork). Unfortunately, you currently seem to be the sole maintainer of this crate. If you'd be willing to allow others to contribute reviews and merge changes, it would be very welcome.
This sounds interesting. Can you expand on what is generated on the C++ side? What do you mean by "Reserve aligned space in the C++ object for the type"?
On the C++ side, an opaque object with a character array of the required size and alignment is generated, which can be moved/copied/destroyed as needed based on Current requirement for non-
I'm not sure I understand the need for the
Yes. That was much simpler to implement. You need to take into consideration that we also need to support references to Passing opaque objects as-is is not sufficient due to C++ move semantics. A destructor is namely called on moved objects in C++ as well (i.e., you get two destructor calls). So the only way to get this right was to somehow mark the object as moved. This is done by using Similarly, it's not sufficient to mark the object by extending it by a flag on the C++ side, since we can also pass a reference to a Rust object to C++, which doesn't have this flag. So the size on the C++ side must be the same. Again,
I understand the need to use an I'm wondering why you can't first move the
That is possible. However, then, you'd need to have two distinct types on the C++ side - one for the opaque objects sent to C++ by value (which are wrapped in So the easiest way was to ensure they have the same size. Regarding layout guarantees, it is pretty much guaranteed, since an
This seems reasonable (to both points).
@schreter I still don't get why you need to do the For supporting references you just need to ensure that the backing data field is stored at the offset 0 of the class, which will happen if you put the data as the first field (Zngur doesn't do that, for some reason it uses a dedicated
@HKalbasi Interesting idea with forcing all C++ references to use a However, the I think I'll take a look at your crate. Since we are now completely revamping our C++ API wrapping the Rust code anyway (we expose a Rust component to C++), maybe it's a better option to use an actively developed crate instead of the one where the maintainer doesn't react.
In our project, which requires perfect C++/Rust integration (we actually consume a Rust library in a large C++ project), we extended the `cxx` crate to handle Rust by-value types, so we can safely give an instance of a Rust object to C++ (and back from C++ to Rust). Now we'd like to upstream the change, so everyone can profit.

The major issue is the unknown type layout on the C++ side. The parser in `cxx::bridge` simply cannot know the size and alignment of the data type, nor the traits the type implements. Our solution is pretty simple:

- `#[repr(layout(<size>, <alignment>))]` on the `extern "Rust"` type, where the size and alignment are checked at compile time to be at least those of the type (at least, because the bridge can target multiple platforms with different pointer sizes; one can then pick the maximum size/alignment for now).
- `#[derive(Copy)]` and/or `#[derive(Clone)]` traits on the type (which are then checked at compile time to ensure that the original object indeed has them). Deriving `Default` would also be easily possible.

The data stored in the reserved space in the C++ object is then either
`T` or `Option<T>` on the Rust side, depending on whether the Rust object implements `Copy` or not.

- If `Copy` is implemented, then there is no issue whatsoever, since Rust's `Copy` objects can be copied/moved freely, so the data is just memcpy'ed as needed on the C++ side.
- If `Copy` is not implemented, then the data type is `Option<T>` and `drop`, `forget` and optionally `clone` (for types implementing `Clone`) callbacks are generated.

The reasoning behind
`Option<T>` is the following: C++ doesn't have a good notion of object ownership. If an object is moved to another location via move constructor or move assignment, the original object will still be destroyed by its destructor. Calling `drop` on this moved-out object would be fatal. Therefore, the object is represented by `Option<T>`, and moving out of the object calls the `forget` callback on it, which writes the `Option<T>::None` pattern into it. The subsequent destruction of the object via the `drop` callback will still call drop, but on the `None` pattern, which is a no-op, so it's safe.

When passing objects by-value from Rust to C++, they are wrapped in
`Option<T>::Some`. When returning them back to Rust, the `Option` is unwrapped, so even if someone tries to return a moved-out object to Rust (which is UB), we'll detect it. Similarly, passing references or pointers to `T` from C++ to Rust will effectively entail passing references or pointers to `Option<T>`, which can also be checked for moved-out objects (which is still UB, but better to report it). We didn't implement that yet.

Another limitation in our implementation is that
`T` and `Option<T>` are required to have the same size (i.e., `Option<T>` must use some niche or some invalid pattern to represent `None`). The reasoning is that this is typically the case anyway for all practical purposes (since we often want to pass handle types, which contain some `Arc` or the like) and it's fairly easy to add a member with a niche if needed. On the plus side, the binary representation/layout is then exactly the same for `T` and `Option<T>::Some`, so passing references to C++ is also well-defined - simply pass the reference as-is.

There is one danger, though. Passing a mutable reference to C++ would allow moving out of the object on the C++ side. Again, this would be UB from our PoV, but C++ doesn't care. We can check this, however, after the C++ call returns. The binary pattern of the object passed by reference must not correspond to
the `None` pattern. With that, we can also catch UB for this (i.e., the C++ side reinterpreting the mutable reference as an rvalue and moving out of the object). Similarly, if rvalue references were allowed, then this could be implemented by using `&mut Option<T>` as a parameter on the Rust side, which would then correspond to an rvalue reference on the C++ side. We didn't implement it, but it would be possible. This would also help address #561 trivially.

Another issue which could be addressed fairly easily is #251. Maybe it would also help a bit with issue #171 (by providing `Option` support).

Any comments/ideas on the aforementioned design?
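
To make the moved-out handling concrete, here is a minimal Rust-only sketch of the `Option<T>` slot idea. `Handle`, `forget`, `drop_slot` and `simulate_cpp_move` are hypothetical names chosen for illustration (this is not the generated code), and the C++ move constructor is only simulated by a bitwise copy followed by `forget` on the source:

```rust
use std::mem::{align_of, size_of, MaybeUninit};
use std::sync::Arc;

// Hypothetical handle type. The `Arc` pointer is never null, which gives
// `Option<Handle>` a niche to encode `None` without growing the type.
#[allow(dead_code)]
struct Handle {
    inner: Arc<String>,
}

// The same-size requirement, checked at compile time.
const _: () = assert!(size_of::<Option<Handle>>() == size_of::<Handle>());
const _: () = assert!(align_of::<Option<Handle>>() == align_of::<Handle>());

/// Hypothetical `forget` callback: marks the slot as moved-out by writing
/// `None` over it WITHOUT dropping the previous contents.
unsafe fn forget(slot: *mut Option<Handle>) {
    unsafe { std::ptr::write(slot, None) }
}

/// Hypothetical `drop` callback: runs the destructor of whatever is in the
/// slot. For `None` this is a no-op.
unsafe fn drop_slot(slot: *mut Option<Handle>) {
    unsafe { std::ptr::drop_in_place(slot) }
}

/// Simulates what the C++ side would do: move-construct `dst` from `src`,
/// then destroy both objects. Returns the `Arc` strong count before the move,
/// after destroying the moved-from object, and after destroying the target.
fn simulate_cpp_move() -> (usize, usize, usize) {
    let shared = Arc::new(String::from("payload"));

    // Rust -> C++ by value: the object is wrapped in `Some`.
    let mut src = MaybeUninit::new(Some(Handle { inner: Arc::clone(&shared) }));
    let mut dst = MaybeUninit::<Option<Handle>>::uninit();
    let before = Arc::strong_count(&shared);

    unsafe {
        // "Move constructor": bitwise copy, then mark the source as moved-out.
        std::ptr::copy_nonoverlapping(src.as_ptr(), dst.as_mut_ptr(), 1);
        forget(src.as_mut_ptr());

        // "Destructor" of the moved-from object: drops `None`, nothing happens.
        drop_slot(src.as_mut_ptr());
        let after_src = Arc::strong_count(&shared);

        // "Destructor" of the move target: drops the real `Handle`.
        drop_slot(dst.as_mut_ptr());
        let after_dst = Arc::strong_count(&shared);

        (before, after_src, after_dst)
    }
}
```

Calling `simulate_cpp_move()` returns `(2, 2, 1)`: the destructor call on the moved-from slot leaves the reference count untouched, and only the destructor of the move target releases the `Arc`.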
As mentioned, we'd like to upstream the changes, which already exist, but since picking the right subset is not trivial, I'd like to clarify at least the minimal interface and minimal useful feature set.
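
As a rough illustration of the compile-time checks such a minimal interface implies, the sketch below shows what a bridge declaration using `#[repr(layout(8, 4))]` together with `#[derive(Copy, Clone)]` could expand to on the Rust side. `layout_fits`, `assert_copy`, `assert_clone` and `Point` are hypothetical names, not part of `cxx`, and the real expansion may look quite different:

```rust
use std::mem::{align_of, size_of};

/// Hypothetical check: the declared size/alignment must be at least what the
/// real type needs ("at least", so one declaration can cover several targets).
const fn layout_fits<T>(declared_size: usize, declared_align: usize) -> bool {
    size_of::<T>() <= declared_size && align_of::<T>() <= declared_align
}

/// Hypothetical trait checks: these only compile if `T` has the trait that
/// was listed in the bridge's `#[derive(...)]`.
const fn assert_copy<T: Copy>() {}
const fn assert_clone<T: Clone>() {}

// Example Rust type that would be exposed by value.
#[allow(dead_code)]
#[derive(Clone, Copy)]
struct Point {
    x: u32,
    y: u32,
}

// Assumed expansion of `#[repr(layout(8, 4))] #[derive(Copy, Clone)] type Point;`.
// Each line turns a wrong declaration into a compile error.
const _: () = assert!(layout_fits::<Point>(8, 4));
const _: () = assert_copy::<Point>();
const _: () = assert_clone::<Point>();
```

Declaring a layout that is too small, for example `layout_fits::<Point>(4, 4)`, evaluates to `false` and would make the corresponding `assert!` fail at compile time.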
Thanks.