Skip to content

Support ownership transfer for unique_ptr<T> (move-only holders) from Python to C++ (and back) #1132

@EricCousineau-TRI

Description

@EricCousineau-TRI

I had reviewed some of the prior questions on GitHub, StackOverflow, etc, and found that accepting a unique_ptr<T> from Python is not presently possible (ref: one of the GitHub issues).

I think that this can be done, and have made a (rough, but not too rough) prototype of doing so that I believe is generally comprehensive for most of the simple edge cases.

branch | head commit (as of writing)

Demo Usage

Code sampled from this pseudo-test case.

class Base {
 public:
  Base(int value) : value_(value) {}
  virtual ~Base() {}
  virtual int value() const { return value_; }
 private:
  int value_{};
};

// Trampoline class.
class PyBase : public py::trampoline<Base> {
 public:
  typedef py::trampoline<Base> TBase;
  using TBase::TBase;
  int value() const override { PYBIND11_OVERLOAD(int, Base, value); }
};

// Make a terminal function.
void terminal_func(unique_ptr<Base> ptr) {
  cout << "Value: " << ptr->value() << endl;
  ptr.reset();  // This will destroy the instance.
  cout << "Destroyed in C++" << endl;
}

// Declare class.
py::class_<Base, PyBase>(m, "Base")
  .def(py::init<int>())
  .def("value", &Base::value);
m.def("terminal_func", &terminal_func);

// Extend and use it in some Python code.
py::exec(R"(
class PyExtBase(m.Base):
    def __init__(self, value):
        m.Base.__init__(self, value)
    def __del__(self):
        print("PyExtBase.__del__")
    def value(self):
        return 10 * m.Base.value(self)

m.terminal_func([m.Base(20)])
)");

Should print something like:

Value: 200
PyExtBase.__del__
Destroyed in C++

Functionality Overview

  • This extends move_only_holder_caster to load values. If a Python value is loaded into a unique_ptr<T>, the object is "released" to C++ (see here):
    • The existing reference must be unique (similar to pybind11::move<>).
    • The existing reference must be wrapped in a Python "move container".
      • If an object is returned from a pure Python function written in C++, then we can cast directly from this object.
      • However, if we wish to load / cast input arguments for bound C++ objects, things get hairy, as Jason alluded to here.
      • Using the container permits the desired instance to be wrapped into a referencable container, so that pybind11 can yank the instance out without disturbing the arguments (possible implementation).
        • A custom "move" container could be used (example), but it'd be easier API-wise to just use a single-item list (e.g. a built-in type).
      • While the casting interface could support both direct object references and container references, that inconsitency could be frustrating (at least for me).
        • Also, the implentation got ugly. There has to be an explicit callout in pybind11::move<>() for it to work (see commented-out code).
    • Given the "load type" (the circumstances used in type_caster_generic::load_impl<>()), use the type's "release" mechanism):
      • If the object is a pure C++ object, we can simply transfer ownership to the compatible holder, and turn the existing pybind instance into a non-owned instance, and let it destruct once its reference count depletes at the end of the caster.
      • If the object is a Python-derived C++ object, we check if the instance is derived from a "trampoline wrapper" type (pybind11::trampoline in the prototype).
        • If it can be wrapped, we first transfer the C++ base object to the external holder. Then, we take the Python bits, and place a pybind11::object reference to it in the trampoline wrapper.
        • This is done by finding the lowest-level type, and attempting to use the wrapper at this level.
          • This should be fine, as users shouldn't be extending the trampoline classes themselves (e.g., they should use the CRTP example from the docs if need be).
      • If the object has multiple inheritance... I didn't really care because I don't have a use case myself. Willing to think about it if the need arises.
    • When the object is "released", it will leave behind a function pointer to the corresponding "reclaim" method.
  • For "reclaiming" the object back (from C++ to Python), type_caster_generic::cast() could have a new clause:
    • If the following are true:
      • The instance already exists (which should only be the case if it was a derived Python object)
      • The return value policy is set up to take ownership
      • There is only one reference to this instance in Python (just for security)
    • Then the object is released back to Python by reclaiming owernship of the object that the trampoline has.
      • (As mentioned above, this shouldn't be called for pure C++ objects.)

Caveats / Limitations

  • Kinda ugly to have to wrap stuff in the "move containers".
    • Thoughts: Eh, fine by me if it works.
  • Using the trampoline wrapper affects destructor order if the object is destroyed in C++. See note here with a workaround.
    • Thoughts: Once again, eh. It works, and people shouldn't really be doing this anywho, but there is a simple workaround if they want to run with scissors.
  • Weak references will have an inconsistent workflow:
    • If the type is Python-derived, then weak references will still work if transfer ownership to C++.
    • However, if the type is pure C++, then weak references will be invalidated after transferring ownership.
    • Thoughts: Also eh. If there is a list of registered weak references accessible, the non-owned reference lifetime can be tied via keep_alive to the weak reference (and this tie can be broken if ownership is transferred back).
  • Cyclic references across language barriers will clearly pose an issue.
    • The user should be careful. This may be tied with desired weak_ref workflows.
    • Thoughts: Meh as well. User beware.
  • Present PR is only supported in the default_holder case. Unless someone actually has their own custom move-only holder, I'd prefer to keep it this way for simplicity's sake.
  • Has some modifications to internals and instance. These could be reduced some, if need be, but there'd need to be at least one field.
  • (Since this is not on CI as it's not a PR: Unittests compile, but some fail at run-time. Will check on those issues.)

Followup

I would really like to have this feature, so please do let me know if this seems like viable feature option. I am more than happy to submit a PR and go through the review process if so desired (which would be a much more polished, properly unit-tested version of the prototype branch).

\cc @jagerman

Metadata

Metadata

Assignees

No one assigned

    Labels

    holdersIssues and PRs regarding holders

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions