[Controller] POC: overall speedup #573
base: master
Conversation
alxbilger left a comment
Is the cache system also relevant for other trampoline classes?
To my meager knowledge, I would say yes. But in this case it is especially useful because there were (and are) many lookups needed to implicitly call all the *event() methods at every step.
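For context, here is a rough reconstruction of the pre-cache hot path, based on the description in this PR rather than the actual old code (the names, signatures and event-to-Python conversion shown are assumptions): every event repeated the cast and the attribute lookups across the C++/Python boundary, for every Controller, on every simulation step.

```cpp
// Rough reconstruction (sketch, not the exact code): each incoming event pays
// for a py::cast and up to two Python attribute lookups.
namespace py = pybind11;

void Controller_Trampoline::handleEvent(sofa::core::objectmodel::Event* event)
{
    py::gil_scoped_acquire gil;

    py::object self = py::cast(this);                        // repeated on every event
    const std::string methodName = std::string("on") + event->getClassName();

    if (py::hasattr(self, methodName.c_str()))               // attribute lookup #1
        self.attr(methodName.c_str())(py::cast(event));      // attribute lookup #2 + call
    else if (py::hasattr(self, "onEvent"))                   // generic fallback
        self.attr("onEvent")(py::cast(event));               // event conversion simplified here
}
```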
Summary of Modifications
The changes in this branch (speedup_controller) optimize the Controller_Trampoline class in the Python bindings by adding a caching mechanism for Python method lookups:
Key Changes:
1. New caching infrastructure (in Binding_Controller.h):
- Added member variables to cache:
- m_pySelf - cached Python self reference (avoids repeated py::cast(this))
- m_methodCache - unordered_map storing Python method objects by name
- m_onEventMethod - cached fallback "onEvent" method
- m_hasOnEvent / m_cacheInitialized - state flags
2. New methods (in Binding_Controller.cpp):
- initializePythonCache() - initializes the cache on first use
- getCachedMethod() - retrieves methods from cache (or looks them up once and caches)
- callCachedMethod() - calls a cached Python method with an event
- A constructor and destructor to properly manage the cached Python objects while holding the GIL
3. Optimized handleEvent():
- Previously: every event caused py::cast(this), py::hasattr(), and attr() lookups
- Now: uses cached method references, avoiding repeated Python attribute lookups (see the sketch after the Purpose section below)
4. Optimized getClassName():
- Uses the cached m_pySelf when available instead of casting each time
Purpose:
This is a performance optimization that reduces overhead when handling frequent events (like AnimateBeginEvent and AnimateEndEvent), which are dispatched many times per simulation step. The caching eliminates repeated Python/C++ boundary crossings for method lookups.
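To make the summary concrete, here is a condensed sketch of how these pieces might fit together. It is reconstructed from the bullet points above, not copied from the diff: the member and method names follow the PR description, but the exact signatures, base-class details and the event-to-Python conversion are assumptions.

```cpp
#include <pybind11/pybind11.h>
#include <string>
#include <unordered_map>

namespace py = pybind11;

// Sketch only: the real trampoline derives from the bound Controller class and
// contains more than what is shown here.
class Controller_Trampoline : public Controller
{
public:
    void handleEvent(sofa::core::objectmodel::Event* event) override
    {
        py::gil_scoped_acquire gil;

        if (!m_cacheInitialized)
            initializePythonCache();

        // e.g. "onAnimateBeginEvent" for an AnimateBeginEvent
        const std::string methodName = std::string("on") + event->getClassName();

        if (py::object method = getCachedMethod(methodName))
            callCachedMethod(method, event);
        else if (m_hasOnEvent)
            callCachedMethod(m_onEventMethod, event);   // generic fallback
    }

private:
    void initializePythonCache()
    {
        m_pySelf = py::cast(this);                      // done once, not per event
        m_hasOnEvent = py::hasattr(m_pySelf, "onEvent");
        if (m_hasOnEvent)
            m_onEventMethod = m_pySelf.attr("onEvent");
        m_cacheInitialized = true;
    }

    py::object getCachedMethod(const std::string& name)
    {
        const auto it = m_methodCache.find(name);
        if (it != m_methodCache.end())
            return it->second;                          // cache hit: no Python lookup

        py::object method;                              // stays null if the method is absent
        if (py::hasattr(m_pySelf, name.c_str()))
            method = m_pySelf.attr(name.c_str());
        m_methodCache[name] = method;                   // cache hits and misses alike
        return method;
    }

    void callCachedMethod(const py::object& method, sofa::core::objectmodel::Event* event)
    {
        method(py::cast(event));                        // event conversion simplified here
    }

    py::object m_pySelf;                                           // cached py::cast(this)
    std::unordered_map<std::string, py::object> m_methodCache;     // per-name method cache
    py::object m_onEventMethod;                                    // cached "onEvent" fallback
    bool m_hasOnEvent = false;
    bool m_cacheInitialized = false;
};
```

A side effect worth noting: the trampoline now holds strong references to Python objects (m_pySelf and the cached methods), which is why the PR also adds a constructor and destructor that manage them under the GIL, and it is what makes the dynamic-rebinding question raised below relevant.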
Nice cache mechanism. Now I wonder what would happen if someone dynamically changed a method after it has already been cached:
MyController.onAnimateBeginEvent()  # Cache is now active for this method
MyController.onAnimateBeginEvent = randomMethod  # Cache is now invalid
MyController.onAnimateBeginEvent()  # now, what gets called?
Maybe we could add a setattr overload invalidating the cache when it is called on a method that is already cached?
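For illustration, one possible way to implement that suggestion on the pybind11 side (a sketch only, not part of this PR): define a __setattr__ on the Controller binding that drops the corresponding cache entry before delegating to the default attribute assignment. Here `cls` stands for the existing py::class_ object for Controller, and `invalidateCachedMethod()` is a hypothetical helper that would erase the entry from m_methodCache.

```cpp
// Hypothetical sketch (not in this PR): invalidate the method cache whenever an
// attribute is rebound on a Controller instance.
cls.def("__setattr__",
    [](py::object self, const std::string& name, py::object value)
    {
        // If the underlying C++ object is the caching trampoline, drop the
        // (possibly stale) cache entry for this attribute name.
        if (auto* trampoline = dynamic_cast<Controller_Trampoline*>(self.cast<Controller*>()))
            trampoline->invalidateCachedMethod(name);   // hypothetical: m_methodCache.erase(name)

        // Delegate to the default attribute assignment.
        py::module::import("builtins").attr("object").attr("__setattr__")(self, name, value);
    });
```

Note that this would only catch assignments made on instances; rebinding the method on the class itself would need a similar hook on the metaclass, or an explicit cache-invalidation API on the controller.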
I did detect this issue some time ago: having a controller in a scene was slowing down the simulation so much, especially on macOS, even if the controller does nothing. And it gets slower and slower the more Controllers there are.
DISCLAIMER: this was mostly the work of Claude, which detected/suggested the issue (the lookups were slowing down the simulation) and generated the solution.
I just did the benches/tests to make sure it works, but as with everything involving sofapython3 and/or pybind11, I cannot prove that everything is okay/well done. So a deep review from experts would be appreciated 🫠
In any case, the modifications lead to a dramatic speed-up
(refer to the scene included with this PR, which creates an empty scene with a certain number of Controllers doing nothing):
Ubuntu 22.04 (gcc12, i7 13700k)
--> with 10 controllers, 150x faster... 😮
Windows (MSVC2026, i7 11800h)
--> with 10 controllers, 200x faster... 😲
macOS (xcode26, M3 max)
--> with 10 controllers, 837x faster... 🤪