GH-112906: Fix performance regression in pathlib path initialisation#112907
GH-112906: Fix performance regression in pathlib path initialisation#112907barneygale merged 4 commits intopython:mainfrom
Conversation
…ation This was caused by 76929fd, specifically its use of `super()` and its packing/unpacking `*args`.
$ ./python -m timeit -n 1000000 -s "from pathlib import PurePath" "PurePath('a', 'b')"
1000000 loops, best of 5: 636 nsec per loop # before regression (baseline)
1000000 loops, best of 5: 827 nsec per loop # after regression (30% slower)
1000000 loops, best of 5: 662 nsec per loop # after this fix (4% slower)The last few % are hard to recoup unfortunately |
|
Skipping news because this only regressed yesterday. |
terryjreedy
left a comment
There was a problem hiding this comment.
Given that PurePath may be instantiated multiple times in an app, the speed recovery looks worthwhile. The net 5% loss is the cost of a new class.
The logic looks correct (and I presume testing verifies it); just add docstrings with minimal explanation. Without the explanation in the issue and PR, this could puzzle future readers.
|
When you're done making the requested changes, leave the comment: |
|
I have made the requested changes; please review again |
AlexWaygood
left a comment
There was a problem hiding this comment.
Have you considered either of these two alternatives? (Both diffs are relative to main, not this PR branch)
diff --git a/Lib/pathlib/__init__.py b/Lib/pathlib/__init__.py
index f4668ab327..626f817733 100644
--- a/Lib/pathlib/__init__.py
+++ b/Lib/pathlib/__init__.py
@@ -90,7 +90,9 @@ def __init__(self, *args):
"object where __fspath__ returns a str, "
f"not {type(path).__name__!r}")
paths.append(path)
- super().__init__(*paths)
+ # As an optimisation, avoid calling super().__init__() here
+ self._raw_paths = paths
+ self._resolving = Falseor
diff --git a/Lib/pathlib/__init__.py b/Lib/pathlib/__init__.py
index f4668ab327..5c605547c0 100644
--- a/Lib/pathlib/__init__.py
+++ b/Lib/pathlib/__init__.py
@@ -90,7 +90,7 @@ def __init__(self, *args):
"object where __fspath__ returns a str, "
f"not {type(path).__name__!r}")
paths.append(path)
- super().__init__(*paths)
+ super().__init__(paths)
def __reduce__(self):
# Using the parts tuple helps share interned path parts
diff --git a/Lib/pathlib/_abc.py b/Lib/pathlib/_abc.py
index 4808d0e61f..2d7fba392a 100644
--- a/Lib/pathlib/_abc.py
+++ b/Lib/pathlib/_abc.py
@@ -206,7 +206,7 @@ class PurePathBase:
)
pathmod = os.path
- def __init__(self, *paths):
+ def __init__(self, paths):
self._raw_paths = paths
self._resolving = FalseThe first is slightly faster than your PR here; the second is slightly slower.
These both feel like slightly simpler patches. The second might also provide speedups to third-party subclasses of PurePathBase as well as subclasses in the Lib/pathlib/ directory
|
Thanks Alex First alternative: it's been a while since I've dealt with this, but I think not calling Second alternative: would make the initialiser signature of |
|
I think I was wrong about the first alternative: it's fine to omit the |
|
Yeah, if you're looking to do cooperative multiple inheritance, it's always advisable to always call Outside of that situation, though, I don't think there's any reason to say you have to always call The cooperative multiple inheritance concern doesn't apply here either, I don't think, since
Yeah, good point! |
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
|
Thanks for the reviews, both! |
…ation (python#112907) This was caused by 76929fd, specifically its use of `super()` and its packing/unpacking `*args`. Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
…ation (python#112907) This was caused by 76929fd, specifically its use of `super()` and its packing/unpacking `*args`. Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
This was caused by 76929fd, specifically its use of
super()and its packing/unpacking of*args.