Bug report
Bug description:
I've noticed a regression when adding 3.12 support to xdoctest.
The following MWE has different behavior on 3.11 and 3.12.
import tokenize
lines = ['3, 4]', 'print(len(x))']
iterable = (line for line in lines if line)
def _readline():
    return next(iterable)
for t in tokenize.generate_tokens(_readline):
    print(t)
On 3.11 and earlier versions this will result in a tokenize.TokenError being raised:
TokenInfo(type=2 (NUMBER), string='3', start=(1, 0), end=(1, 1), line='3, 4]')
TokenInfo(type=54 (OP), string=',', start=(1, 1), end=(1, 2), line='3, 4]')
TokenInfo(type=2 (NUMBER), string='4', start=(1, 3), end=(1, 4), line='3, 4]')
TokenInfo(type=54 (OP), string=']', start=(1, 4), end=(1, 5), line='3, 4]')
TokenInfo(type=1 (NAME), string='print', start=(2, 0), end=(2, 5), line='print(len(x))')
TokenInfo(type=54 (OP), string='(', start=(2, 5), end=(2, 6), line='print(len(x))')
TokenInfo(type=1 (NAME), string='len', start=(2, 6), end=(2, 9), line='print(len(x))')
TokenInfo(type=54 (OP), string='(', start=(2, 9), end=(2, 10), line='print(len(x))')
TokenInfo(type=1 (NAME), string='x', start=(2, 10), end=(2, 11), line='print(len(x))')
TokenInfo(type=54 (OP), string=')', start=(2, 11), end=(2, 12), line='print(len(x))')
TokenInfo(type=54 (OP), string=')', start=(2, 12), end=(2, 13), line='print(len(x))')
Traceback (most recent call last):
  File "/home/joncrall/code/xdoctest/dev/tokenize_mwe.py", line 10, in <module>
    for t in tokenize.generate_tokens(_readline):
  File "/home/joncrall/.pyenv/versions/3.11.2/lib/python3.11/tokenize.py", line 525, in _tokenize
    raise TokenError("EOF in multi-line statement", (lnum, 0))
tokenize.TokenError: ('EOF in multi-line statement', (3, 0))
However, on 3.12 this no longer raises an error. Instead I get:
TokenInfo(type=2 (NUMBER), string='3', start=(1, 0), end=(1, 1), line='3, 4]')
TokenInfo(type=55 (OP), string=',', start=(1, 1), end=(1, 2), line='3, 4]')
TokenInfo(type=2 (NUMBER), string='4', start=(1, 3), end=(1, 4), line='3, 4]')
TokenInfo(type=55 (OP), string=']', start=(1, 4), end=(1, 5), line='3, 4]')
TokenInfo(type=4 (NEWLINE), string='', start=(1, 5), end=(1, 6), line='3, 4]')
TokenInfo(type=1 (NAME), string='print', start=(2, 0), end=(2, 5), line='print(len(x))')
TokenInfo(type=55 (OP), string='(', start=(2, 5), end=(2, 6), line='print(len(x))')
TokenInfo(type=1 (NAME), string='len', start=(2, 6), end=(2, 9), line='print(len(x))')
TokenInfo(type=55 (OP), string='(', start=(2, 9), end=(2, 10), line='print(len(x))')
TokenInfo(type=1 (NAME), string='x', start=(2, 10), end=(2, 11), line='print(len(x))')
TokenInfo(type=55 (OP), string=')', start=(2, 11), end=(2, 12), line='print(len(x))')
TokenInfo(type=55 (OP), string=')', start=(2, 12), end=(2, 13), line='print(len(x))')
TokenInfo(type=4 (NEWLINE), string='', start=(2, 13), end=(2, 14), line='print(len(x))')
TokenInfo(type=0 (ENDMARKER), string='', start=(3, 0), end=(3, 0), line='')
This is a problem for xdoctest because it uses tokenize to determine if a statement is "balanced" (i.e. if it is part of a line continuation or not). This is the magic I use to autodetect PS1 vs PS2 lines and prevent users from needing to manually specify if a line is a continuation or not.
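For context, the kind of balance check described above can be sketched roughly as follows. This is a minimal illustration, not xdoctest's actual implementation: the helper name `is_balanced` is hypothetical, and it assumes tokenizer failures surface as `tokenize.TokenError` or as a `SyntaxError` subclass.

```python
import io
import tokenize


def is_balanced(source):
    """Hypothetical helper: True if `source` tokenizes to completion,
    i.e. EOF is not reached inside an open bracket or string."""
    readline = io.StringIO(source).readline
    try:
        # Errors are raised lazily, so the generator must be exhausted.
        for _ in tokenize.generate_tokens(readline):
            pass
    except (tokenize.TokenError, SyntaxError):
        # Assumption: any tokenizer error means "not a complete statement".
        return False
    return True


print(is_balanced('x = [1, 2]\n'))  # complete statement
print(is_balanced('x = [1, 2'))     # bracket left open
```

A check like this relies entirely on the tokenizer raising at EOF, so it gives different answers on 3.11 and 3.12 for input with an unmatched closing bracket such as the `'3, 4]'` line in the MWE.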
Looking through the release and migration notes, I don't see anything indicating that this new behavior was introduced intentionally, so I suspect it is a bug. I'm sorry I didn't catch this before the 3.12 release; I've been busy.
If this is an intended change rather than a bug, then it should be documented (please link to the relevant section if I missed it). If there is a way to work around this so xdoctest works on 3.12.0, that would be helpful. (It's probably time some of the parsing code got a rewrite anyway.)
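One possible stopgap, sketched here under assumptions rather than proposed as the fix, is to track bracket depth explicitly instead of relying on the tokenizer to raise. The name `brackets_balanced` is hypothetical, and the sketch only counts depth; it does not check that each closer matches the kind of its opener.

```python
import io
import tokenize

_OPENERS = {'(', '[', '{'}
_CLOSERS = {')', ']', '}'}


def brackets_balanced(source):
    """Hypothetical helper: False if a closing bracket appears with no
    matching opener, or if any bracket is still open at EOF."""
    depth = 0
    readline = io.StringIO(source).readline
    try:
        for tok in tokenize.generate_tokens(readline):
            if tok.type != tokenize.OP:
                continue
            if tok.string in _OPENERS:
                depth += 1
            elif tok.string in _CLOSERS:
                depth -= 1
                if depth < 0:
                    # Unmatched closer, e.g. the ']' in '3, 4]'.
                    return False
    except (tokenize.TokenError, SyntaxError):
        # Assumption: treat any tokenizer error as unbalanced input.
        return False
    return depth == 0


print(brackets_balanced('print(len(x))\n'))
print(brackets_balanced('3, 4]\nprint(len(x))\n'))
```

Because the unmatched closer is detected from the token stream itself, the result for the MWE input does not depend on whether the tokenizer raises at EOF.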
CPython versions tested on:
3.11, 3.12
Operating systems tested on:
Linux