S2 ROI ingest can produce a non-monotonic time axis when a dropped/older date is backfilled #60

Description

@eschechter

Summary

write_dataset is append-only: when the store already exists it always uses {"mode": "a", "append_dim": "time"} (storage/zarr_store.py:480). There is no insertion logic — a new date is concatenated onto the end of the time axis regardless of its chronological position. The doy attr is likewise appended positionally (zarr_store.py:840), tracking the same physical order.

This means any out-of-order arrival leaves the store's time axis non-monotonic.

How it bites

Concrete path (the one that surfaced this):

  1. ingest_s2_roi_reflectance processes a broad date range. A date is skipped — e.g. an asset-incomplete STAC item causes the SCL load to be dropped in _compute_scl_phase (see the No such band/alias handling in ingest/s2_roi.py). Later dates are written normally.
  2. The skipped date is never written, so it never enters get_existing_dates.
  3. On a later rerun (e.g. after the upstream item is reprocessed into a complete one), query_stac_items(existing_dates=...) filters out the already-written later dates but keeps the previously-skipped earlier date, which then gets appended at the end.

Result, e.g.:

2025-07-12, 2025-07-14, 2025-07-11   ← out of chronological order

This is not unique to the dropped-date case — any backfill of an older date into an existing store produces it.
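
The path above can be reproduced with a small stand-alone sketch. It does not touch the real store: plain lists of `datetime.date` stand in for the time axis, and `select_new_dates` and `append_dates` are hypothetical stand-ins for the `existing_dates` filter and the append-only write, not the project's actual functions.

```python
from datetime import date


def select_new_dates(candidates, existing_dates):
    """Keep only candidates not yet written (hypothetical stand-in for the existing_dates filter)."""
    return [d for d in candidates if d not in existing_dates]


def append_dates(time_axis, new_dates):
    """Append-only write: new dates go on the end, whatever their chronological position."""
    return time_axis + list(new_dates)


# First run: 2025-07-11 is skipped (incomplete item); the later dates are written.
time_axis = append_dates([], [date(2025, 7, 12), date(2025, 7, 14)])

# Rerun: all three dates are now available upstream.
available = [date(2025, 7, 11), date(2025, 7, 12), date(2025, 7, 14)]
to_write = select_new_dates(available, set(time_axis))
time_axis = append_dates(time_axis, to_write)

print([d.isoformat() for d in time_axis])
# ['2025-07-12', '2025-07-14', '2025-07-11']
```

The filter does its job (only the missing date is selected), yet the append step alone is enough to break the ordering.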

Why it's a problem

  • Consumers that assume a monotonic time axis break: label-based slicing (.sel(time=slice(...))) on a non-monotonic index can raise in xarray/pandas, positional slicing no longer corresponds to a date range, and resolve_region's contiguity check explicitly rejects an out-of-order axis (zarr_store.py:659-661).
  • Exact-label selection and get_existing_dates (set-valued) are unaffected, so the corruption is silent until something order-sensitive runs.
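
A minimal illustration of why the problem stays silent, again using plain `datetime.date` values rather than the real store. `is_strictly_increasing` is a hypothetical helper, not an existing function in the codebase: the set-valued membership check passes, while an order check on the same axis fails.

```python
from datetime import date


def is_strictly_increasing(times):
    """True when every element is later than the one before it."""
    return all(a < b for a, b in zip(times, times[1:]))


# Time axis after the backfill described above.
time_axis = [date(2025, 7, 12), date(2025, 7, 14), date(2025, 7, 11)]

# Set-valued view: the backfilled date looks perfectly fine.
backfilled_present = date(2025, 7, 11) in set(time_axis)

# Order-sensitive view: the invariant is broken.
ordered = is_strictly_increasing(time_axis)

print(backfilled_present, ordered)
# True False
```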

Proposed fix

Preferred: make the append path fail fast rather than silently append out of order — reject a write whose new time coordinate(s) are <= the store's current max time, with a clear error pointing at this issue. Appending strictly-increasing dates stays the fast path; backfills require an explicit ordered-insert/rewrite path (note resolve_region is overwrite-only and cannot grow the axis).
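
One possible shape for the fail-fast guard is sketched below, under stated assumptions: the names `NonMonotonicAppendError` and `check_append_order` are hypothetical, and the times are plain `datetime.date` values. A real implementation would compare against the store's time coordinate before the append call in write_dataset.

```python
from datetime import date


class NonMonotonicAppendError(ValueError):
    """Raised when an append would break the strictly increasing time axis."""


def check_append_order(existing_times, new_times):
    """Reject new_times unless they are strictly increasing and all later than the store's max."""
    new_times = list(new_times)
    if not new_times:
        return
    # The batch itself must be strictly increasing.
    if any(b <= a for a, b in zip(new_times, new_times[1:])):
        raise NonMonotonicAppendError("new time coordinates are not strictly increasing")
    # The earliest new time must be later than everything already stored.
    if existing_times:
        current_max = max(existing_times)
        if new_times[0] <= current_max:
            raise NonMonotonicAppendError(
                f"cannot append {new_times[0].isoformat()}: store already extends to "
                f"{current_max.isoformat()}; backfills need an ordered insert or rewrite"
            )


stored = [date(2025, 7, 12), date(2025, 7, 14)]
check_append_order(stored, [date(2025, 7, 15)])  # fast path: accepted

try:
    check_append_order(stored, [date(2025, 7, 11)])  # backfill: rejected
except NonMonotonicAppendError as exc:
    print(exc)
```

Using `<=` rather than `<` also rejects a duplicate of the latest date, matching the rule stated above.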

Alternative (weaker): sort-on-read in consumers — does not fix the on-disk invariant and is easy to forget.

Context

Found while debugging an S2 ROI ingest failure on conus_corn_sample_49 for 2025-07-10..2025-07-15, where earth-search served an asset-incomplete item (S2A_13UEQ_20250711_0_L2A, missing scl + reflectance bands) that crashed the whole run. The immediate crash is handled by dropping the date; this issue tracks the deeper append-ordering invariant the drop exposes.

Labels: bug