Skip to content

[CI][REFACTOR] Decouple data.py from Jenkins script and docker images#19445

Merged
tqchen merged 3 commits into
mainfrom
jenkins-s3-bundle-refactor
Apr 25, 2026
Merged

[CI][REFACTOR] Decouple data.py from Jenkins script and docker images#19445
tqchen merged 3 commits into
mainfrom
jenkins-s3-bundle-refactor

Conversation

@tqchen

@tqchen tqchen commented Apr 25, 2026

Copy link
Copy Markdown
Member

Summary

ci/jenkins/data.py was carrying two unrelated concerns: artifact bundle definitions for s3 staging, and the docker image tag registry. This PR decouples both from the Jenkinsfile-generated code, making each concern standalone and machine-readable without Python.

Changes

Part 1 — Bundle refactor (10950ac192)

  • ci/scripts/jenkins/s3.py gains --bundle <name> (repeatable) that resolves bundle names from ci/jenkins/data.py::files_to_stash at runtime; --items preserved for back-compat
  • ci/jenkins/templates/utils/macros.j2 upload_artifacts macro emits --bundle <name> flags instead of inlining the file list as --items
  • Template callsites (cpu, gpu, arm) updated to pass bundles=[...] names
  • Generated .groovy files regenerated — no build/ paths appear in artifact upload blocks; only bundle names

Part 2 — Docker image registry extraction (6df22c09b9)

  • New ci/docker-images.ini (one [ci_*] section per image) is the single source of truth for image tags
  • docker/dev_common.sh::lookup_image_spec reads the ini directly with awk — no Python invocation
  • ci/jenkins/data.py drops the inline docker_images dict; loads it via configparser from the ini, preserving the same nested-dict shape so Jinja templates and the current s3.py module import keep working unchanged
  • data.py __main__ is now the bundle-resolver CLI (python3 ci/jenkins/data.py <bundle> [...] → file paths, one per line) — no more dual-purpose image-name lookup branch
  • ci/scripts/jenkins/open_docker_update_pr.py updated to read/write ci/docker-images.ini instead of data.py

Acceptance

# No build/ artifact paths in generated Jenkinsfiles (bundle names only):
grep -rn "build/lib\|build/cpptest" ci/jenkins/generated/   # → nothing

# generate.py is byte-identical after the change:
python3 ci/jenkins/generate.py  # → "no changes made" for all 5 groovy files

# dev_common.sh resolves same tag as before:
bash -c 'source docker/dev_common.sh && lookup_image_spec ci_cpu'
# → tlcpack/ci-cpu:20251130-061900-c429a2b1

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request centralizes Docker image configurations into a new ci/docker-images.ini file and updates the CI pipeline to use this as the single source of truth. It also introduces a bundle-based artifact resolution system for S3 uploads, replacing hardcoded file lists in Jenkinsfiles with logical bundle names defined in ci/jenkins/data.py. Feedback focuses on improving the robustness of the INI file parsing in both the Python data loader and the shell-based image lookup script to handle potential formatting variations or missing keys.

Comment thread ci/jenkins/data.py Outdated
Comment thread docker/dev_common.sh Outdated
@tqchen tqchen force-pushed the jenkins-s3-bundle-refactor branch from 685ebef to 29ffe67 Compare April 25, 2026 16:53
@tqchen

tqchen commented Apr 25, 2026

Copy link
Copy Markdown
Member Author

cc @yongwww

…tags

`ci/jenkins/data.py` had grown into a grab-bag of two unrelated CI concerns
(s3 stash file lists + docker image tag registry) plus a custom CLI for
``docker/dev_common.sh``. This PR splits those apart and pushes each piece
to its rightful owner so the generated `.groovy` files stop carrying
build-path or tag literals that belong elsewhere.

## s3 stash → bundle resolution at runtime

* `ci/scripts/jenkins/s3.py` learns ``--bundle NAME`` (repeatable). It loads
  ``ci/jenkins/data.py::files_to_stash`` via ``importlib.util`` and resolves
  the named bundles to a flat file list at Jenkins runtime. ``--items`` is
  preserved for ad-hoc literal-path uploads.
* ``upload_artifacts`` (``ci/jenkins/templates/utils/macros.j2``) emits one
  ``--bundle NAME`` flag per entry instead of expanding the path list inline.
* ``data.py`` itself becomes the bundle-resolver CLI:
  ``python3 ci/jenkins/data.py tvm_lib cpptest`` → one path per line.
* ``tvm_allvisible`` is folded into ``tvm_lib`` — every consumer of the
  standard libs needs the all-visible variant for cpptest, so a separate
  bundle was just two-name ceremony.

After this change ``grep -rn 'build/lib\|build/cpptest' ci/jenkins/generated/``
returns nothing — paths live exclusively in ``data.py``.

## Docker image registry → live ini, no duplicate

The repo already has ``ci/jenkins/docker-images.ini`` (the file the nightly
docker-update workflow writes to; latest tag `20260301-134651-63f099ad`).
``data.py`` was carrying an inline ``docker_images`` dict that had drifted
out of sync (`20251130-`). Drop it:

* ``data.py`` no longer touches the ini at all. Replaced with a flat
  ``images = [{name, platform}, ...]`` list — what templates actually
  iterate over (tags themselves are resolved at Jenkins runtime by
  ``determine_docker_images.py``).
* ``determine_docker_images.py`` gains ``--lookup-only NAME`` (pure ini
  read, no Docker Hub query, no tlcpackstaging fallback).
* ``docker/dev_common.sh::lookup_image_spec`` shells out to
  ``determine_docker_images.py --lookup-only`` instead of duplicating ini
  parsing in awk. Single Python module owns the ini schema (configparser
  interpolation for ``%(ci_tag)s``); bash no longer needs to know about it.

## Verification

* ``python3 ci/jenkins/generate.py`` re-runs cleanly; the only diff in the
  generated ``.groovy`` files is ``--items <paths>`` → ``--bundle NAME`` and
  the merged ``tvm_allvisible`` bundle.
* ``python3 ci/jenkins/data.py tvm_lib cpptest`` resolves to the same file
  list ``files_to_stash["tvm_lib"] + files_to_stash["cpptest"]`` produced.
* ``bash -c 'source docker/dev_common.sh && lookup_image_spec ci_cpu'``
  returns the same live tag string as
  ``python3 ci/scripts/jenkins/determine_docker_images.py --lookup-only ci_cpu``.
* All 5 ``ci_*`` images resolve correctly through both paths.
@tqchen tqchen force-pushed the jenkins-s3-bundle-refactor branch from 29ffe67 to eebcf23 Compare April 25, 2026 17:13
tqchen added 2 commits April 25, 2026 18:54
…es.ini

The squashed CI refactor moved docker image tags from
``ci/jenkins/data.py`` to ``ci/jenkins/docker-images.ini`` (which already
existed and is the live source the nightly bot updates). The bot script
``open_docker_update_pr.py`` was still pointing at ``data.py`` and
regex-matching the ``"tag": "..."`` JSON-style format that no longer
exists there, so it became a silent no-op (zero replacements, no
"Tag names were the same" log emitted) and broke
``test_open_docker_update_pr``.

Rewrite the script to:

- Read ``ci/jenkins/docker-images.ini`` via configparser. The ini has
  one ``[jenkins]`` section with a shared ``ci_tag`` key plus per-image
  ``ci_<name>: tlcpack/ci-<name>:%(ci_tag)s`` entries.
- Iterate every key except ``ci_tag``, resolve each to its full spec
  (with interpolation applied), and check Docker Hub for newer tags via
  the existing ``latest_tlcpackstaging_image`` helper.
- When a newer tag is found, rewrite the corresponding ``ci_<name>:``
  line with the resolved tag (breaking the ``%(ci_tag)s`` interpolation
  for that single entry) so the update is unambiguous and doesn't
  disturb other images that share the old shared tag.

Also fix a long-standing typo in the ini: ``ci_cpu: tlcpack/ci_cpu``
(underscore) → ``tlcpack/ci-cpu`` (dash) — every other image uses the
dash form, and the test mocks already assume the dash convention
(``image.replace('_', '-')`` for the tlcpack repo half of the URL).

Update ``test_open_docker_update_pr[tlcpack_update]``'s
``expected_images`` from the old ``"tag": "tlcpack/ci-arm:234-234-abc",``
JSON-style snippet to the new ini-style line
``ci_arm: tlcpack/ci-arm:234-234-abc``.

Verified locally: ``--dry-run`` against same-tag mock data emits
"Tag names were the same, no update needed", and against new-tag mock
data emits "Found newer image, using: tlcpack/ci-X:234-234-abc" plus
the rewritten line in the printed "Updated to:" output for all 5 images.
Previous fix updated only the [tlcpack_update] expected_images snippet
to the new ini-line format. The [staging_update] parametrization had
the same JSON-style snippet (legacy data.py format) and still failed.

  '"tag": "tlcpack/ci-arm:456-456-abc"'
    →
  "ci_arm: tlcpack/ci-arm:456-456-abc"

Verified locally by simulating both [same_tags], [tlcpack_update], and
[staging_update] cases with the test's mock data: all three now pass
the expected-string assertions against the script's stdout.
@tqchen tqchen merged commit daefffc into main Apr 25, 2026
8 checks passed
@tqchen tqchen deleted the jenkins-s3-bundle-refactor branch April 26, 2026 19:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants