bpo-37884: Optimize Fraction() and statistics.mean() #15329
serhiy-storchaka wants to merge 5 commits into python:master
    self._numerator, self._denominator = math._as_integer_ratio(numerator)
    return self
except TypeError:
    raise TypeError("argument should be a string or a number, "
Since the name of the encompassing type is "numeric" rather than "number", can we adjust the error message?
- raise TypeError("argument should be a string or a number, "
+ raise TypeError("argument type should be str or numeric, "
Source: https://docs.python.org/3.9/library/stdtypes.html#numeric-types-int-float-complex
But an instance of a numeric type is a number, is it not?
And if we use "argument type", it should be "str", not "string".
> But an instance of a numeric type is a number, is it not?
It is, but I think it's more useful to users to name the actual types in this case, since it's a TypeError. Also, searching the docs for "number" does not suggest the relevant documentation page ("Built-in Types"); instead, users will likely land on the page for the "numbers" module, which is not relevant to the error. Searching for "numeric" returns more relevant results, including "Built-in Types".
> And if we use "argument type", it should be "str", not "string".
I'll update the suggestion accordingly.
This is a standard message used in many places. For example:
>>> float([])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: float() argument must be a string or a number, not 'list'
If you want to change it, please open a separate issue.
> If you want to change it, please open a separate issue.
Alright, I'll take a look at what other areas use this error message and consider whether or not it should be addressed. That would definitely go outside of the scope of this PR.
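To see how the wording under discussion surfaces in practice, the following minimal sketch captures the `TypeError` text produced by a couple of constructors. The helper name `type_error_message` is made up for illustration, and the exact wording differs between Python versions, so no specific output is shown.

```python
from fractions import Fraction


def type_error_message(constructor, value):
    """Return the TypeError message raised by constructor(value), or None."""
    try:
        constructor(value)
    except TypeError as exc:
        return str(exc)
    return None


# Compare the wording used by a built-in type and by Fraction.
for constructor in (float, Fraction):
    print(constructor.__name__, "->", type_error_message(constructor, []))
```

Running this on the interpreter you care about is a quick way to check which messages would be affected by a wording change.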
Return integer ratio.

Return a pair of integers, whose ratio is exactly equal to the original
Can we specify that the pair of integers returned are contained within a tuple?
- Return a pair of integers, whose ratio is exactly equal to the original
+ Return a tuple containing a pair of integers, whose ratio is exactly equal to the original
This was copied from the docstring of float.as_integer_ratio. Other as_integer_ratio methods have similar wording.
This suggestion was based on the docstring Raymond recently created for Fraction.as_integer_ratio(); I made a similar suggestion there, which he added to the PR.
If it would be helpful, I could create a separate PR to make a similar change to float.as_integer_ratio().
The tuple is the pair, so I find "tuple containing a pair" a bit confusing. That almost sounds like ((numerator, denominator),) or something. Why not be explicit and write "a tuple (numerator, denominator)"?
> Why not be explicit and write "a tuple (numerator, denominator)"?
That would be an improvement. I mostly just wanted to specify that the function returned a tuple.
"Pair" is a synonym of 2-element tuple. If you think that this term is incorrect, please open a separate issue and analyze all uses of it in the code and in the documentation (there are a lot of occurrences).
> "Pair" is a synonym of 2-element tuple.
Either is probably fine, I just figured it was worth trying to be more technically descriptive. Thanks.
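For reference, the existing `as_integer_ratio()` methods already return a plain 2-tuple that can be unpacked as `(numerator, denominator)`. The short sketch below illustrates that with a few standard types; it assumes Python 3.8 or later, where `int` and `Fraction` also provide the method.

```python
from decimal import Decimal
from fractions import Fraction

values = [123, 0.5, Fraction(22, 7), Decimal("1.23")]

for value in values:
    ratio = value.as_integer_ratio()
    numerator, denominator = ratio  # unpacks like any 2-tuple
    # The ratio reproduces the original value exactly.
    assert Fraction(numerator, denominator) == Fraction(value)
    print(type(value).__name__, ratio)
```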
Am I right that this is the same as #15210 but with a private function instead?
I ran the same tests mentioned in the bpo issue to verify the results, comparing the latest commit on master (master:24fe46081b) against the PR branch (math-as_integer_ratio2:e658d19083).
OS: Arch Linux 5.2.8
CPU: Intel i5-4460
./python -m timeit -s "from fractions import Fraction as F" "F(123)"
200000 loops, best of 5
Master: 1.42 usec
PR: 1.55 usec
./python -m timeit -s "from fractions import Fraction as F" "F(1.23)"
100000 loops, best of 5
Master: 2.92 usec
PR: 2.14 usec
./python -m timeit -s "from fractions import Fraction as F; f = F(22, 7)" "F(f)"
100000 loops, best of 5
Master: 2.47 usec
PR: 1.93 usec
./python -m timeit -s "from statistics import mean; a = [1]*1000" "mean(a)"
500 loops, best of 5
Master: 930 usec
PR: 640 usec
./python -m timeit -s "from statistics import mean; a = [1.23]*1000" "mean(a)"
200 loops, best of 5
Master: 1.31 msec
PR: 1.34 msec
./python -m timeit -s "from statistics import mean; from fractions import Fraction as F; a = [F(22, 7)]*1000" "mean(a)"
200 loops, best of 5
Master: 1.31 msec
PR: 1.09 msec
./python -m timeit -s "from statistics import mean; from decimal import Decimal as D; a = [D('1.23')]*1000" "mean(a)"
100 loops, best of 5
Master: 2.08 msec
PR: 2 msec
It also looks like the Travis doctest and docbuild were failing on the other PR. As far as functionality goes, the changes to the code in
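The command lines above can also be driven from a script, which makes it easier to rerun the same cases on two builds. This is only a sketch: the helper `best_usec` and the `CASES` list are illustrative names, the loop count is fixed rather than auto-calibrated as the `timeit` CLI does, and the numbers it prints will depend on the machine and build.

```python
import timeit

# (setup, statement) pairs mirroring a few of the command lines above.
CASES = [
    ("from fractions import Fraction as F", "F(123)"),
    ("from fractions import Fraction as F", "F(1.23)"),
    ("from statistics import mean; a = [1]*1000", "mean(a)"),
]


def best_usec(setup, stmt, number=200, repeat=5):
    """Best per-loop time in microseconds over `repeat` runs."""
    timings = timeit.repeat(stmt, setup=setup, number=number, repeat=repeat)
    return min(timings) / number * 1e6


if __name__ == "__main__":
    for setup, stmt in CASES:
        print(f"{stmt!r}: {best_usec(setup, stmt):.2f} usec per loop")
```

Taking the minimum over the repeats follows the "best of 5" convention used in the reported results.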
jdemeyer left a comment
Supporting arbitrary objects with an as_integer_ratio method is a non-trivial change and should be documented in a NEWS entry.
Yes, it is.
Agreed.
    return PyTuple_Pack(2, x, _PyLong_One);
}

if (_PyObject_LookupAttrId(x, &PyId_as_integer_ratio, &as_integer_ratio) < 0) {
Based on the context, I can roughly tell what this conditional is doing. From my understanding, _PyObject_LookupAttrId() is assessing whether or not the PyObject x contains an as_integer_ratio attribute. If a value less than zero is returned (usually -1), it does not contain that attribute.
However, I'm not certain that I understand where PyId_as_integer_ratio is coming from or how PyId actually works. I was unable to find any documentation on PyId or _Py_IDENTIFIER(), so I'm guessing it's an internal part of the C-API (since it's prefixed with an underscore).
My best guess is that a reference to PyId_as_integer_ratio was created when _Py_IDENTIFIER(as_integer_ratio) was used.
I'm fairly new to the C-API, so I'm trying to learn more about it so that I can be more helpful in PR reviews that involve it. Particularly the internal implementation details that aren't in the documentation.
It is described in the header: Include/cpython/object.h.
Ah, thanks for letting me know where to look. The code comments there addressed my question:
> PyId_foo is a static variable, either on block level or file level. On first usage, the string "foo" is interned, and the structures are linked.
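For readers less familiar with the C-API, the control flow visible in the C fragment under review can be approximated in pure Python. This is only a sketch under stated assumptions: `as_integer_ratio_sketch` is a hypothetical name, the error message is invented, the exact-int fast path is inferred from the `PyTuple_Pack(2, x, _PyLong_One)` line, and the private C helper in the PR may differ in detail (for example, in how it validates the returned value).

```python
from fractions import Fraction


def as_integer_ratio_sketch(x):
    """Return (numerator, denominator) for x, or raise TypeError."""
    # Assumed fast path: an exact int is its own numerator over 1.
    if type(x) is int:
        return (x, 1)
    # Look the method up without letting AttributeError escape.
    method = getattr(x, "as_integer_ratio", None)
    if method is None:
        # Hypothetical message, not taken from the PR.
        raise TypeError(
            f"{type(x).__name__!r} object has no as_integer_ratio() method"
        )
    return method()


print(as_integer_ratio_sketch(3))
print(as_integer_ratio_sketch(0.25))
print(as_integer_ratio_sketch(Fraction(22, 7)))
```

Any object that exposes an `as_integer_ratio()` method is accepted by this sketch, which is the behavior change the review asks to have documented in a NEWS entry.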
https://bugs.python.org/issue37884