Duration columns should be classified as 'duration' type, not 'datetime'. These tests verify the correct type detection, classification, and styling. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…622)

Duration columns were misclassified as 'datetime' because pl_typing_stats used is_temporal() which includes Duration. The datetime formatter then failed to parse duration values, returning empty strings.

- Add is_timedelta flag to both pd and pl typing stats
- Exclude pl.Duration from is_datetime in polars stats
- Add 'duration' type in _type() classification
- Style duration columns with string displayer

Closes #622

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
📦 TestPyPI package published

pip install --index-strategy unsafe-best-match --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ buckaroo==0.12.12.dev23143269690

or with uv:

uv pip install --index-strategy unsafe-best-match --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ buckaroo==0.12.12.dev23143269690

MCP server for Claude Code

claude mcp add buckaroo-table -- uvx --from "buckaroo[mcp]==0.12.12.dev23143269690" --index-strategy unsafe-best-match --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ buckaroo-table
…val types

Tests verify correct type detection, classification, styling, and full pipeline integration for all new type categories.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…od, interval

- Polars: Categorical, Enum, Decimal, Binary, Time get dedicated types instead of falling to 'obj' or being misclassified
- Pandas: CategoricalDtype, PeriodDtype, IntervalDtype detected
- pl.Decimal no longer misclassified as 'integer' (is_numeric excluded)
- pl.Time no longer misclassified as 'datetime' (was showing blank cells)
- Styling: decimal→float displayer, categorical/time/period/interval→string, binary→obj

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
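The classification and styling rules listed in this commit can be pictured with a small sketch. This is not buckaroo's code: `is_timedelta`, `is_datetime`, `is_categorical`, `is_period`, and `is_interval` are flag names mentioned in the commit messages, while `is_time`, `is_decimal`, `is_integer`, `is_float`, `is_binary`, the function names, and the `obj` fallback for unmapped types are assumptions made for illustration. The point is only that specific checks run before broad ones.

```python
# Hypothetical sketch, not buckaroo's implementation.
# Order matters: specific checks come before the broad ones they overlap with.
TYPE_CHECKS = [
    ("is_timedelta", "duration"),   # before datetime: temporal, but not a timestamp
    ("is_time", "time"),            # assumed flag name
    ("is_datetime", "datetime"),
    ("is_decimal", "decimal"),      # assumed flag name; before integer/float
    ("is_integer", "integer"),      # assumed flag name
    ("is_float", "float"),          # assumed flag name
    ("is_categorical", "categorical"),
    ("is_period", "period"),
    ("is_interval", "interval"),
    ("is_binary", "binary"),        # assumed flag name
]

# Displayer mapping as described in the commit message above.
DISPLAYERS = {
    "decimal": "float",
    "categorical": "string",
    "time": "string",
    "period": "string",
    "interval": "string",
    "binary": "obj",
}


def classify(flags):
    """Return the first matching display type, or 'obj' if nothing matches."""
    for flag, type_name in TYPE_CHECKS:
        if flags.get(flag, False):
            return type_name
    return "obj"


def displayer_for(type_name):
    """Look up the displayer; falling back to 'obj' is an assumption."""
    return DISPLAYERS.get(type_name, "obj")


# A column flagged as both timedelta and datetime resolves to 'duration'.
duration_type = classify({"is_timedelta": True, "is_datetime": True})
```

The actual fix excludes pl.Duration from is_datetime rather than relying on check order alone; the ordered list here is just a compact way to show the same precedence.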
ISO 8601 duration strings like 'P1DT2H3M4S' are now rendered as '1d 2h 3m 4s' instead of the raw ISO format. Handles microseconds, milliseconds, seconds, and mixed day/hour/minute/second values.

- Add DurationDisplayerA interface and getDurationFormatter in JS
- Wire 'duration' displayer into getFormatter switch
- Update Python styling to use 'duration' displayer for duration type
- Add JS unit tests for formatIsoDuration

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
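The JS formatIsoDuration source is not shown in this conversation, so the following is only a Python sketch of the same conversion under stated assumptions: it accepts the day/hour/minute/second form seen in the examples, returns unparseable input unchanged, and renders an all-zero duration as `0s`. The function name `format_iso_duration` and those fallback choices are hypothetical.

```python
import re
from decimal import Decimal

# Matches strings such as 'P1DT2H3M4S' or 'P0DT0H0M0.0001S'.
_ISO_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?"
    r"(?:(?P<seconds>\d+(?:\.\d+)?)S)?)?$"
)


def format_iso_duration(value):
    """Render an ISO 8601 duration as a short human-readable string."""
    match = _ISO_DURATION.match(value)
    if match is None:
        return value  # assumption: leave unrecognised input untouched

    days = int(match["days"] or 0)
    hours = int(match["hours"] or 0)
    minutes = int(match["minutes"] or 0)
    seconds = Decimal(match["seconds"] or "0")  # Decimal avoids float rounding

    whole_seconds = int(seconds)
    micros = int((seconds - whole_seconds) * 1_000_000)

    parts = []
    if days:
        parts.append(f"{days}d")
    if hours:
        parts.append(f"{hours}h")
    if minutes:
        parts.append(f"{minutes}m")
    if whole_seconds:
        parts.append(f"{whole_seconds}s")
    if micros:
        # Whole milliseconds print as ms, anything finer as µs.
        if micros % 1000 == 0:
            parts.append(f"{micros // 1000}ms")
        else:
            parts.append(f"{micros}µs")
    return " ".join(parts) if parts else "0s"


print(format_iso_duration("P1DT2H3M4S"))       # 1d 2h 3m 4s
print(format_iso_duration("P0DT0H0M0.0001S"))  # 100µs
```

Parsing the seconds field with `Decimal` keeps `0.0001` exact, so the microsecond count does not drift the way a binary float multiplication could.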
The v1 TypingStats (used by BuckarooWidget/BuckarooInfiniteWidget) was missing is_timedelta, is_categorical, is_period, is_interval detection, so the new type classifications only worked in v2 pipelines. Also adds df_with_weird_types() and pl_df_with_weird_types() to ddd_library, plus a Jupyter notebook and marimo entries for testing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…lization fastparquet can't handle PeriodDtype, IntervalDtype, or timedelta64 columns directly, causing the infinite widget's parquet data transfer to fail silently (all values showed as None). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
11 Playwright tests verify correct rendering of all new type displayers:

- Duration: ISO 8601 → human-readable (1d 2h 3m 4s, 100µs, etc.)
- Categorical, Period, Interval: string values display correctly
- Time: HH:MM:SS format preserved
- Decimal: float formatting with correct precision
- Binary: string representation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
_get_summary_sd was returning {} when any stat errored (e.g. histogram
for Decimal), discarding all stats and producing empty column configs.
Now returns partial results so the widget renders even with some errors.
Also fixes output_full_reproduce crash when kls is None (v2 stat funcs
have no v1 class), and removes non-UTF-8 bytes from DDD binary column.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
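The partial-results behaviour described in this commit follows a common pattern: run each stat independently and keep whatever succeeded. A minimal sketch, assuming stats are plain callables keyed by name; `summarize_column` and its return shape are hypothetical and are not the signature of _get_summary_sd.

```python
def summarize_column(values, stat_funcs):
    """Run every stat; keep successes and record failures separately."""
    results = {}
    errors = {}
    for name, func in stat_funcs.items():
        try:
            results[name] = func(values)
        except Exception as exc:  # broad on purpose: one bad stat must not sink the rest
            errors[name] = exc
    return results, errors


stats = {
    "length": len,
    "mean": lambda v: sum(v) / len(v),
    "histogram": lambda v: 1 / 0,  # stands in for a stat that fails on this dtype
}
results, errors = summarize_column([1, 2, 3], stats)
print(results)         # {'length': 3, 'mean': 2.0}
print(sorted(errors))  # ['histogram']
```

Returning the errors alongside the results lets the caller still report what failed, instead of either crashing or silently dropping everything.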
Binary columns (e.g. pl.Binary) with non-ASCII bytes crash pd.to_json with UnicodeDecodeError. Now bytes values are converted to hex strings in both the JSON path (pd_to_obj) and parquet path (to_parquet). Restores non-ASCII test data in DDD pl_df_with_weird_types. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
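The bytes-to-hex conversion can be illustrated without pandas. This is a sketch over plain row dictionaries, not the code in pd_to_obj or to_parquet; the helper name `hex_encode_bytes` is hypothetical.

```python
import json


def hex_encode_bytes(records):
    """Return copies of the rows with bytes values replaced by hex strings."""
    return [
        {
            key: value.hex() if isinstance(value, (bytes, bytearray)) else value
            for key, value in row.items()
        }
        for row in records
    ]


rows = [{"id": 1, "blob": b"\xff\x00"}, {"id": 2, "blob": b"abc"}]
safe_rows = hex_encode_bytes(rows)
print(json.dumps(safe_rows))
# [{"id": 1, "blob": "ff00"}, {"id": 2, "blob": "616263"}]
```

Hex is lossless and always ASCII, so the serializer never has to guess an encoding for bytes such as `0xff` that are not valid UTF-8.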
Stories now include summary_stats_data with histogram data and pinned rows config (dtype + histogram). Playwright tests verify:

- histogram-component divs render in pinned area (>=3 for pandas, >=4 for polars)
- dtype row shows correct type names (category, Int64, Categorical, etc.)
- 15 tests total, all passing

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
- Duration columns were showing blank cells because they were misclassified as datetime, causing the JS datetime formatter to fail
- Similar misclassifications affected pl.Time (blank cells) and pl.Decimal (misclassified as integer, losing precision)
- A duration displayer renders ISO 8601 durations as human-readable strings (e.g. P1DT2H3M4S → 1d 2h 3m 4s, P0DT0H0M0.0001S → 100µs)

Changes
| Type | Before | After |
| --- | --- | --- |
| pl.Duration | datetime → blank cells | duration → 1d 2h 3m 4s |
| pl.Time | datetime → blank cells | time → 14:30:00 |
| pl.Decimal | integer → 100 (loses .50) | decimal → 100.500 |
| pl.Categorical / pl.Enum | obj | categorical → red |
| pl.Binary | obj (unchanged) | binary → explicit type |
| pd.CategoricalDtype | string/obj | categorical → explicit type |
| pd.PeriodDtype | obj | period → 2021-01 |
| pd.IntervalDtype | obj | interval → (0, 1] |

Files changed
- pd_stats_v2.py, pl_stats_v2.py (type detection), styling.py (displayer mapping)
- DFWhole.ts (DurationDisplayerA type), Displayer.ts (formatIsoDuration + getDurationFormatter)
- test_pd_stats_v2.py, test_pl_stats_v2.py, gridUtils.test.ts

Test plan
- test_all_polars_types_classified verifies zero types fall to obj for a 10-column mixed-type DataFrame

Closes #622
🤖 Generated with Claude Code