fix: display Polars Duration columns (#622) #628

Open
paddymul wants to merge 12 commits into main from fix/duration-column-display

@paddymul (Collaborator) commented Mar 16, 2026

Summary

  • Polars Duration columns were showing blank cells because they were misclassified as datetime, causing the JS datetime formatter to fail
  • Same issue affected pl.Time (blank cells) and pl.Decimal (misclassified as integer, losing precision)
  • Added dedicated type detection and classification for: categorical, decimal, binary, time, period, interval, duration
  • Added a new JS duration displayer that renders ISO 8601 durations as human-readable strings (e.g. P1DT2H3M4S → 1d 2h 3m 4s, P0DT0H0M0.0001S → 100µs)
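
To make the duration rendering concrete, here is a minimal Python sketch of the conversion described in the last bullet. It is not the PR's code: the actual implementation is the TypeScript formatIsoDuration in Displayer.ts, which is not shown here. The function name `format_iso_duration`, the regex, the fallback to the raw string, and the rule for choosing between ms and µs are all assumptions made for illustration.

```python
import re

# Day/time subset of ISO 8601 durations, e.g. "P1DT2H3M4S" (assumed input shape).
_ISO_DURATION = re.compile(
    r"^P(?:(\d+)D)?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?)?$"
)


def format_iso_duration(value: str) -> str:
    """Render an ISO 8601 duration as a compact human-readable string."""
    match = _ISO_DURATION.match(value)
    if match is None:
        return value  # assumption: unparseable input is shown unchanged

    days, hours, minutes, seconds = match.groups()
    parts = []
    # Zero-valued components are skipped so "P0DT0H0M4S" becomes "4s".
    for amount, unit in ((days, "d"), (hours, "h"), (minutes, "m")):
        if amount is not None and int(amount) != 0:
            parts.append(f"{int(amount)}{unit}")

    if seconds is not None:
        whole, _, frac = seconds.partition(".")
        # Pad/truncate the fraction to 6 digits to get integer microseconds.
        micros = int((frac + "000000")[:6]) if frac else 0
        if int(whole) != 0:
            parts.append(f"{int(whole)}s")
        if micros:
            if micros % 1000 == 0:
                parts.append(f"{micros // 1000}ms")
            else:
                parts.append(f"{micros}µs")

    return " ".join(parts) if parts else "0s"


print(format_iso_duration("P1DT2H3M4S"))       # 1d 2h 3m 4s
print(format_iso_duration("P0DT0H0M0.0001S"))  # 100µs
```

The sketch parses the fractional seconds as integer microseconds rather than as a float, which avoids rounding artifacts for values such as 0.0001 s.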

Changes

| Type | Before | After |
| --- | --- | --- |
| pl.Duration | datetime → blank cells | duration → 1d 2h 3m 4s |
| pl.Time | datetime → blank cells | time → 14:30:00 |
| pl.Decimal | integer → 100 (loses .50) | decimal → 100.500 |
| pl.Categorical / pl.Enum | obj | categorical → red |
| pl.Binary | obj (unchanged) | binary → explicit type |
| pd.CategoricalDtype | string/obj | categorical → explicit type |
| pd.PeriodDtype | obj | period → 2021-01 |
| pd.IntervalDtype | obj | interval → (0, 1] |

Files changed

  • Python: pd_stats_v2.py, pl_stats_v2.py (type detection), styling.py (displayer mapping)
  • JavaScript: DFWhole.ts (DurationDisplayerA type), Displayer.ts (formatIsoDuration + getDurationFormatter)
  • Tests: test_pd_stats_v2.py, test_pl_stats_v2.py, gridUtils.test.ts

Test plan

  • 101 Python stat tests pass (75 original + 26 new)
  • 83 JS tests pass (81 original + 2 new duration formatter tests)
  • Full Python unit suite: 699 passed
  • JS build succeeds
  • test_all_polars_types_classified verifies zero types fall to obj for a 10-column mixed-type DataFrame

Closes #622

🤖 Generated with Claude Code

paddymul and others added 2 commits March 15, 2026 20:53
Duration columns should be classified as 'duration' type, not 'datetime'.
These tests verify the correct type detection, classification, and styling.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…622)

Duration columns were misclassified as 'datetime' because
pl_typing_stats used is_temporal() which includes Duration.
The datetime formatter then failed to parse duration values,
returning empty strings.

- Add is_timedelta flag to both pd and pl typing stats
- Exclude pl.Duration from is_datetime in polars stats
- Add 'duration' type in _type() classification
- Style duration columns with string displayer

Closes #622

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
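
The misclassification described above can be illustrated without Polars installed. The following is a hypothetical sketch, not the code in pl_stats_v2.py: dtypes are represented by their names, `TEMPORAL_DTYPES` stands in for the set accepted by a broad temporal check (in Polars, is_temporal() covers Date, Datetime, Duration, and Time), and the helper names `typing_flags` and `classify_dtype` are invented for the example.

```python
# Dtype names accepted by a broad "temporal" check (stand-in for is_temporal()).
TEMPORAL_DTYPES = {"Date", "Datetime", "Duration", "Time"}


def typing_flags(dtype_name: str) -> dict:
    """Compute typing flags, keeping Duration out of is_datetime."""
    is_timedelta = dtype_name == "Duration"
    return {
        "is_timedelta": is_timedelta,
        # Before the fix this would simply be `dtype_name in TEMPORAL_DTYPES`,
        # which is why Duration columns reached the datetime formatter.
        "is_datetime": dtype_name in TEMPORAL_DTYPES and not is_timedelta,
    }


def classify_dtype(dtype_name: str) -> str:
    """Map flags to a column type; the duration check comes first."""
    flags = typing_flags(dtype_name)
    if flags["is_timedelta"]:
        return "duration"
    if flags["is_datetime"]:
        return "datetime"
    return "obj"  # assumption: anything unrecognised falls back to obj


print(classify_dtype("Duration"))  # duration
print(classify_dtype("Datetime"))  # datetime
```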

github-actions bot commented Mar 16, 2026

📦 TestPyPI package published

pip install --index-strategy unsafe-best-match --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ buckaroo==0.12.12.dev23143269690

or with uv:

uv pip install --index-strategy unsafe-best-match --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ buckaroo==0.12.12.dev23143269690

MCP server for Claude Code

claude mcp add buckaroo-table -- uvx --from "buckaroo[mcp]==0.12.12.dev23143269690" --index-strategy unsafe-best-match --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ buckaroo-table

paddymul and others added 2 commits March 15, 2026 21:09
…val types

Tests verify correct type detection, classification, styling, and
full pipeline integration for all new type categories.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…od, interval

- Polars: Categorical, Enum, Decimal, Binary, Time get dedicated types
  instead of falling to 'obj' or being misclassified
- Pandas: CategoricalDtype, PeriodDtype, IntervalDtype detected
- pl.Decimal no longer misclassified as 'integer' (is_numeric excluded)
- pl.Time no longer misclassified as 'datetime' (was showing blank cells)
- Styling: decimal→float displayer, categorical/time/period/interval→string,
  binary→obj

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
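
The styling rules listed in this commit message amount to a lookup from column type to displayer name. A minimal sketch follows; the dictionary name, the helper `displayer_for`, the shape of the returned config, and the fallback to obj are assumptions and may not match styling.py.

```python
# Column type -> displayer name, taken from the commit message above.
DISPLAYER_BY_TYPE = {
    "decimal": "float",
    "categorical": "string",
    "time": "string",
    "period": "string",
    "interval": "string",
    "binary": "obj",
}


def displayer_for(column_type: str) -> dict:
    """Return a displayer config for a column type (hypothetical shape)."""
    # Assumption: unknown types fall back to the generic obj displayer.
    return {"displayer": DISPLAYER_BY_TYPE.get(column_type, "obj")}


print(displayer_for("decimal"))  # {'displayer': 'float'}
print(displayer_for("period"))   # {'displayer': 'string'}
```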
ISO 8601 duration strings like 'P1DT2H3M4S' are now rendered as
'1d 2h 3m 4s' instead of the raw ISO format. Handles microseconds,
milliseconds, seconds, and mixed day/hour/minute/second values.

- Add DurationDisplayerA interface and getDurationFormatter in JS
- Wire 'duration' displayer into getFormatter switch
- Update Python styling to use 'duration' displayer for duration type
- Add JS unit tests for formatIsoDuration

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The v1 TypingStats (used by BuckarooWidget/BuckarooInfiniteWidget) was
missing is_timedelta, is_categorical, is_period, is_interval detection,
so the new type classifications only worked in v2 pipelines.

Also adds df_with_weird_types() and pl_df_with_weird_types() to
ddd_library, plus a Jupyter notebook and marimo entries for testing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…lization

fastparquet can't handle PeriodDtype, IntervalDtype, or timedelta64
columns directly, causing the infinite widget's parquet data transfer
to fail silently (all values showed as None).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
paddymul and others added 3 commits March 15, 2026 23:30
11 Playwright tests verify correct rendering of all new type displayers:
- Duration: ISO 8601 → human-readable (1d 2h 3m 4s, 100µs, etc.)
- Categorical, Period, Interval: string values display correctly
- Time: HH:MM:SS format preserved
- Decimal: float formatting with correct precision
- Binary: string representation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
_get_summary_sd was returning {} when any stat errored (e.g. histogram
for Decimal), discarding all stats and producing empty column configs.
Now returns partial results so the widget renders even with some errors.

Also fixes output_full_reproduce crash when kls is None (v2 stat funcs
have no v1 class), and removes non-UTF-8 bytes from DDD binary column.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
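
The change from "return {} on any error" to "return partial results" can be sketched as below. This is an illustration of the pattern only, not the body of _get_summary_sd; the function `summarize_column` and the example stat functions are hypothetical.

```python
def summarize_column(values, stat_funcs):
    """Run every stat function; keep successes and record failures separately."""
    results, errors = {}, {}
    for name, func in stat_funcs.items():
        try:
            results[name] = func(values)
        except Exception as exc:  # one failing stat must not discard the rest
            errors[name] = exc
    return results, errors


def failing_histogram(values):
    # Stand-in for a stat that cannot handle a given dtype.
    raise TypeError("histogram not supported for this dtype")


stats, errs = summarize_column(
    [1, 2, 3],
    {
        "length": len,
        "mean": lambda v: sum(v) / len(v),
        "histogram": failing_histogram,
    },
)
print(stats)       # {'length': 3, 'mean': 2.0}
print(list(errs))  # ['histogram']
```

Returning the errors alongside the partial results keeps the failure visible to the caller while still letting the widget build column configs from the stats that succeeded.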
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Binary columns (e.g. pl.Binary) with non-ASCII bytes crash pd.to_json
with UnicodeDecodeError. Now bytes values are converted to hex strings
in both the JSON path (pd_to_obj) and parquet path (to_parquet).

Restores non-ASCII test data in DDD pl_df_with_weird_types.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
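
A stdlib-only sketch of the bytes-to-hex conversion described above. The helper name `hex_encode_bytes` and the row layout are assumptions; the real conversion lives in pd_to_obj and to_parquet and operates on DataFrame columns rather than plain dicts.

```python
import json


def hex_encode_bytes(value):
    """Convert bytes-like values to hex strings; pass everything else through."""
    if isinstance(value, (bytes, bytearray)):
        return bytes(value).hex()
    return value


rows = [
    {"id": 1, "payload": b"hello"},
    {"id": 2, "payload": b"\xff\xfe\x00"},  # not valid UTF-8
]

# Decoding the second payload as UTF-8 fails, which mirrors the reported crash.
try:
    rows[1]["payload"].decode("utf-8")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

safe_rows = [{key: hex_encode_bytes(val) for key, val in row.items()} for row in rows]
encoded = json.dumps(safe_rows)
print(encoded)
```

Hex encoding is lossless and always yields ASCII, so the same representation is safe for both the JSON and the parquet transfer paths.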
Stories now include summary_stats_data with histogram data and pinned
rows config (dtype + histogram). Playwright tests verify:
- histogram-component divs render in pinned area (>=3 for pandas, >=4 for polars)
- dtype row shows correct type names (category, Int64, Categorical, etc.)
- 15 tests total, all passing

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

duration column not shown in PolarsBuckarooWidget
