Skip to content

refactor matches to be returned as shared pointers#298

Merged
hendrikmuhs merged 23 commits intoKeyviDev:masterfrom
hendrikmuhs:match-shared-ptr
Jun 3, 2024
Merged

refactor matches to be returned as shared pointers#298
hendrikmuhs merged 23 commits intoKeyviDev:masterfrom
hendrikmuhs:match-shared-ptr

Conversation

@hendrikmuhs
Copy link
Contributor

@hendrikmuhs hendrikmuhs commented Apr 20, 2024

refactor matches to be returned as shared pointers with the intention to further modernize the code, remove some quirks with copy/assignment and avoid copying match objects in various places.

This avoids 1 deep-copy for every match in the python extension as well as copies when creating the match iterator.

@hendrikmuhs hendrikmuhs marked this pull request as ready for review May 25, 2024 06:12

// de-duplicate
while (data->last_result.GetMatchedString() == data->results.back().GetMatchedString()) {
while (data->last_result && data->last_result->GetMatchedString() == data->results.back()->GetMatchedString()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you have slightly different conventions: here you explicitly check for == nullptr but now you're just using operator bool for last_result. Since obj in the first case and last_result in this case are both of type match_t, I'm not sure whether there's a reason you're doing it different ways.

Copy link
Member

@narekgharibyan narekgharibyan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had a quick pass, and I see that in most of cases std::move is used, which makes me think that almost always we have single ownership instead of shared ownership.

So maybe we better use unique_ptr<Match> to enforce this semantics ?

Or we really need the shared_ptr<> ?

fwiw, unique_ptr<> is more lightweight compared to shared_ptr<>

@hendrikmuhs
Copy link
Contributor Author

Had a quick pass, and I see that in most of cases std::move is used, which makes me think that almost always we have single ownership instead of shared ownership.

So maybe we better use unique_ptr<Match> to enforce this semantics ?

We have single ownership within keyvi everywhere. However I think this is more a question for the client that is using it:

  • C++: not aware of anyone using it directly
  • rust/C: wrapped in another struct, so doesn't matter
  • python: wrapped in an object using a shared_ptr (autowrap)

Or we really need the shared_ptr<> ?

Conceptually no, unique_ptr would do and you can turn(move) a unique_ptr into a shared_ptr pointer any time if you need it.

However autowrap(python) heavily uses shared_ptr and does not support unique_ptr. That means extra work is required to make autowrap work again. It doesn't seem like a big problem. I will give it a try to see if there are more obstacles than I see at the moment.

@hendrikmuhs
Copy link
Contributor Author

I tried unique_ptr, it does not work:

.../keyvi/include/keyvi/dictionary/match_iterator.h:56:7: note: ‘keyvi::dictionary::MatchIterator::MatchIterator(const keyvi::dictionary::MatchIterator&)’ is implicitly deleted because the default definition would be ill-formed:
   56 | class MatchIterator : public boost::iterator_facade<MatchIterator, match_t const, boost::single_pass_traversal_tag> {
      |       ^~~~~~~~~~~~~
.../keyvi/include/keyvi/dictionary/match_iterator.h:56:7: error: use of deleted function ‘std::unique_ptr<_Tp, _Dp>::unique_ptr(const std::unique_ptr<_Tp, _Dp>&) [with _Tp = keyvi::dictionary::Match; _Dp = std::default_delete<keyvi::dictionary::Match>]’
In file included from /usr/include/c++/11/memory:76,
                 from .../keyvi/include/keyvi/dictionary/completion/multiword_completion.h:28,
                 from .../keyvi/bin/keyvi_c/c_api.cpp:30:

With other words: the match iterator requires:

unique_ptr( const unique_ptr& ) - which is by design deleted for unique_ptr, but not for shared_ptr

The above is needed when accessing the current object in an iterator.

@narekgharibyan
Copy link
Member

Okie-dokie, thanks @hendrikmuhs for giving it a try ...

With other words: the match iterator requires:
unique_ptr( const unique_ptr& ) - which is by design deleted for unique_ptr, but not for shared_ptr

So we are "sharing" the match_t object as far as the MatchIterator is concerned ?

Not sure if we have to, prob we can overcome this obstacle later on if we want to.

Another alternative idea I had is to not use pointers at all, and just delete copy constructor/assignment for Match and just let it be moved ... but again this is something for the future.

Again, thanks for giving it a try.

@hendrikmuhs
Copy link
Contributor Author

Another alternative idea I had is to not use pointers at all, and just delete copy constructor/assignment for Match and just let it be moved ...

That's what I tried 1st and I couldn't overcome some compile errors. As far as I remember similar issues regarding the match iterator.

@narekgharibyan
Copy link
Member

Does it stem from boost::iterator_facade<> implementation ?

As if I think about it, theoretically at least we should not need the copy, we should be able to move it ... as it is just a one off data transfer object, or I'm missing something ? 🤔

@hendrikmuhs
Copy link
Contributor Author

hendrikmuhs commented Jun 3, 2024

I don't remember all the details, boost::iterator_facade<> was definitely part of it.

@narekgharibyan
Copy link
Member

Okie, thanks. ... will keep this in mind.

@hendrikmuhs hendrikmuhs merged commit fa79e10 into KeyviDev:master Jun 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants