Describe the bug
When trying to create a DataFrame from a pyarrow.Table object with a nonzero number of columns, but zero rows, I encounter a panic in src/context.rs:294.
To Reproduce
>>> import datafusion as df
>>> import pyarrow as pa
>>> ctx = df.SessionContext()
>>> import pandas as pd
>>> pdf = pd.DataFrame({'col': []})
>>> emptyTable = pa.Table.from_pandas(pdf)
>>> emptyTable
pyarrow.Table
col: double
----
col: [[]]
>>> ctx.from_arrow_table(emptyTable)
thread '<unnamed>' panicked at src/context.rs:294:37:
index out of bounds: the len is 0 but the index is 0
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
pyo3_runtime.PanicException: index out of bounds: the len is 0 but the index is 0
Expected behavior
I expect this to create a DataFrame with zero rows, such as the following (created via .limit(0) from a non-empty DataFrame):
>>> empty
DataFrame()
++
++
>>> empty.describe()
DataFrame()
+------------+-----+
| describe   | col |
+------------+-----+
| count      | 0.0 |
| null_count | 0.0 |
| mean       |     |
| std        |     |
| min        |     |
| max        |     |
| median     |     |
+------------+-----+
Additional context
- Operating system: Rocky 8
- Python version: 3.10.4
- Python module versions used:
>>> df.__version__
'34.0.0'
>>> pa.__version__
'15.0.0'
>>> pd.__version__
'2.2.0'