ARROW-7663: [Python] Raise better error message when passing mixed-type (int/string) Pandas dataframe to pyarrow Table #8044
Changes from all commits
15a22d5
e59eed6
5985d49
80e0078
8f28160
70ae0a9
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -25,7 +25,6 @@ | |
| import decimal | ||
| import itertools | ||
| import math | ||
| - import traceback | ||
|
|
||
| import numpy as np | ||
| import pytz | ||
|
|
@@ -382,11 +381,8 @@ def test_sequence_custom_integers(seq): | |
| @parametrize_with_iterable_types | ||
| def test_broken_integers(seq): | ||
|
Contributor
Author
We lose the more specific traceback and the original ZeroDivisionError:
In [11]: class MyBrokenInt:
    ...:     def __int__(self):
    ...:         1/0
In [12]: pa.array([MyBrokenInt()], type=pa.int64())
---------------------------------------------------------------------------
ArrowInvalid Traceback (most recent call last)
<ipython-input-12-1cf156b165b3> in <module>
----> 1 pa.array([MyBrokenInt()], type=pa.int64())
~/git_repo/arrow/python/pyarrow/array.pxi in pyarrow.lib.array()
269 else:
270 # ConvertPySequence does strict conversion if type is explicitly passed
--> 271 return _sequence_to_array(obj, mask, size, type, pool, c_from_pandas)
272
273
~/git_repo/arrow/python/pyarrow/array.pxi in pyarrow.lib._sequence_to_array()
38
39 with nogil:
---> 40 check_status(ConvertPySequence(sequence, mask, options, &out))
41
42 if out.get().num_chunks() == 1:
~/git_repo/arrow/python/pyarrow/error.pxi in pyarrow.lib.check_status()
82
83 if status.IsInvalid():
---> 84 raise ArrowInvalid(message)
85 elif status.IsIOError():
86 # Note: OSError constructor is
ArrowInvalid: Could not convert <__main__.MyBrokenInt object at 0x7fc331394290> with type MyBrokenInt: tried to convert to int
but this is the same message as what we get on master for
In [11]: class MyBrokenInt:
    ...:     def __int__(self):
    ...:         1/1
so maybe it's ok?
Member
I think that is fine, personally.
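For reference, the trade-off discussed in this thread can be sketched in plain Python without pyarrow. This is only an illustration under assumptions: `ConversionError`, `convert_to_int`, and `failing_frames` are hypothetical names standing in for `ArrowInvalid` and the conversion path, and they do not reflect how pyarrow is implemented. The sketch shows that a generic wrapping error hides the `ZeroDivisionError` unless the original exception is chained explicitly.

```python
import traceback


class ConversionError(Exception):
    """Hypothetical stand-in for pa.ArrowInvalid."""


class MyBrokenInt:
    def __int__(self):
        1/0  # MARKER


def convert_to_int(obj, keep_cause):
    """Hypothetical converter; not pyarrow's actual code path."""
    try:
        return int(obj)
    except Exception as exc:
        message = (f"Could not convert {obj!r} with type "
                   f"{type(obj).__name__}: tried to convert to int")
        if keep_cause:
            # Chain the original exception so its traceback stays reachable.
            raise ConversionError(message) from exc
        # Drop the cause, leaving only the generic message.
        raise ConversionError(message) from None


def failing_frames(keep_cause):
    """Return the formatted traceback frames of the underlying cause, if any."""
    try:
        convert_to_int(MyBrokenInt(), keep_cause)
    except ConversionError as err:
        cause = err.__cause__
        if cause is None:
            return []
        return traceback.format_tb(cause.__traceback__)
```

With `keep_cause=True`, the last frame points into `__int__`, which is close to what the `# MARKER` assertion in the test inspects. With `keep_cause=False`, only the generic "tried to convert to int" message remains.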
||
| data = [MyBrokenInt()] | ||
| - with pytest.raises(ZeroDivisionError) as exc_info: | ||
| + with pytest.raises(pa.ArrowInvalid): | ||
| pa.array(seq(data), type=pa.int64()) | ||
| - # Original traceback is kept | ||
| - tb_lines = traceback.format_tb(exc_info.tb) | ||
| - assert "# MARKER" in tb_lines[-1] | ||
|
|
||
|
|
||
| def test_numpy_scalars_mixed_type(): | ||
|
|
@@ -1643,7 +1639,7 @@ def test_map_from_dicts(): | |
|
|
||
| # Invalid dictionary types | ||
| for entry in [[{'key': '1', 'value': 5}], [{'key': {'value': 2}}]]: | ||
| - with pytest.raises(TypeError, match="integer is required"): | ||
| + with pytest.raises(pa.ArrowInvalid, match="tried to convert to int"): | ||
| pa.array([entry], type=pa.map_('i4', 'i4')) | ||
|
|
||
|
|
||
|
|
||
What are the cases where this couldn't be converted, but `obj` is an integer? When the integer is too big to fit in a C int?
Yes, and also when converting a negative integer to a `uint`:
No other tests are touched if I recompile without this check.
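The two cases mentioned in this thread (a Python integer too large for the target C type, and a negative integer converted to an unsigned type) can be illustrated with a small range check. This is a sketch in plain Python, not pyarrow's implementation; `fits_c_int` is a hypothetical helper and the bit widths are chosen only for illustration.

```python
def fits_c_int(value, bits, signed):
    """Return True if a Python int fits in a C integer of the given width.

    Hypothetical helper for illustration only; not part of pyarrow.
    """
    if signed:
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    else:
        lo, hi = 0, (1 << bits) - 1
    return lo <= value <= hi


# The object is a genuine int in each case, yet the first two cannot be stored.
too_big = fits_c_int(2**63, bits=64, signed=True)   # one past the int64 maximum
negative = fits_c_int(-1, bits=8, signed=False)     # negative value, unsigned target
in_range = fits_c_int(255, bits=8, signed=False)    # largest uint8 value
```

In both failing cases the value passes an "is this an integer?" check, so a type-based test alone cannot predict whether the conversion will succeed.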