video: VP9 fallback + BYO H.264 + PyAV 15 bump by suiyoubi · Pull Request #1999 · NVIDIA-NeMo/Curator

suiyoubi · 2026-05-19T15:15:15Z

Summary

Unifies three video-encoder PRs that previously landed on r1.2.0 into a single change targeting main. Each PR is cherry-picked from its r1.2.0 squash commit; the branch carries the same three commits in order.

#	Original PR	Squash on `r1.2.0`	Scope
1	#1930 — Remove libopenh & libx264, add VP9	`79cacb469`	Drop libopenh264/libx264, add `libvpx-vp9` as the CPU fallback encoder alongside `h264_nvenc`; update `ClipTranscodingStage` validation, docs, and tutorials.
2	#1959 — FFmpeg + video processing support	`8055f35e2`	Bring-your-own H.264 install path (`install_h264_support.sh`), refactor `install_ffmpeg.sh`, bump PyAV `13.1.0 → 15.1.0`, expand `decoder_utils`, update CI/Dockerfile.
3	#1973 — Fix motion vector export	`f07fa0e19`	Add `_resolve_export_mvs_flag` to handle PyAV's `export_mvs` flag rename across versions; covers the API change introduced by the PyAV 15 bump in #1959.

Cherry-pick notes

Remove libopenh&libx264 and Add vp9 #1930 applied cleanly onto main.
Update FFmpeg and video processing support #1959 had two conflicts:
- docker/Dockerfile — kept main's --python /usr/bin/python3.13 venv pin and added the PR's PKG_CONFIG_PATH=/usr/local/lib/pkgconfig env so source-built PyAV links against the system FFmpeg.
- uv.lock — discarded the textual merge and regenerated via uv lock. The only net change vs. main's lock is av 13.1.0 → 15.1.0.
fix: update motion vector export flag handling in PyAV #1973 applied cleanly on top of the resolved Update FFmpeg and video processing support #1959.

* fix: remove libopenh264 and replace with libx264 Signed-off-by: Dong Hyuk Chang <9426164+thomasdhc@users.noreply.github.com> Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: install libx264 Signed-off-by: Dong Hyuk Chang <9426164+thomasdhc@users.noreply.github.com> Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: drop libx264 support Signed-off-by: Dong Hyuk Chang <9426164+thomasdhc@users.noreply.github.com> Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Enhance video transcoding support by adding `libvpx-vp9` as a CPU fallback encoder alongside `h264_nvenc`. Update installation instructions and verification steps to reflect the new encoder options. Modify `ClipTranscodingStage` to validate encoder selection and handle encoding options for both encoders. Update relevant documentation and examples to guide users on using the new encoder. Signed-off-by: Ao Tang <aot@nvidia.com> * Minor update Signed-off-by: Ao Tang <aot@nvidia.com> * refine docs Signed-off-by: Ao Tang <aot@nvidia.com> * refine docs Signed-off-by: Ao Tang <aot@nvidia.com> * add back openlibh264 and inform user to install themself if needed Signed-off-by: Ao Tang <aot@nvidia.com> * docs format Signed-off-by: Ao Tang <aot@nvidia.com> --------- Signed-off-by: Dong Hyuk Chang <9426164+thomasdhc@users.noreply.github.com> Signed-off-by: Ao Tang <aot@nvidia.com> Co-authored-by: Dong Hyuk Chang <9426164+thomasdhc@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Update FFmpeg and video processing support - Upgrade PyAV dependency from version 13.1.0 to 15.1.0 in `pyproject.toml`. - Add support for software H.264/HEVC/AV1 decoders in the Curator container by introducing an opt-in script (`install_h264_support.sh`) that recompiles FFmpeg with the necessary decoders. - Enhance error handling in `VideoReaderStage` to log warnings when software codecs are missing, improving user feedback. - Update documentation to reflect changes in codec support and installation instructions for users needing software decoders. - Modify Dockerfile to ensure FFmpeg is discoverable by source-built Python dependencies. This commit aims to improve video processing capabilities and user experience when handling H.264/HEVC/AV1 inputs. Signed-off-by: Ao Tang <aot@nvidia.com> * Enhance CI workflows with FFmpeg library installation - Added steps to install FFmpeg development libraries in both `cicd-main.yml` and `install-test.yml` workflows, ensuring necessary dependencies for source-built PyAV are available. - Improved error logging in `VideoReaderStage` by simplifying the warning message for missing software codecs. - Updated test cases in `test_decoder_utils.py` to handle codec-related error messages more accurately. These changes aim to streamline the video processing pipeline and improve error handling for codec issues. Signed-off-by: Ao Tang <aot@nvidia.com> * Refactor FFmpeg installation in CI workflows and Dockerfile - Removed FFmpeg development library installation steps from `cicd-main.yml` and `install-test.yml` workflows to streamline CI processes. - Updated `Dockerfile` to enforce source-building of PyAV with the `--no-binary-package av` option, ensuring compatibility with the FFmpeg version used in the Docker image. - Added explanatory comments in `pyproject.toml` regarding the handling of PyAV dependencies in Docker. These changes aim to improve the build process and maintain consistency in dependency management across environments. Signed-off-by: Ao Tang <aot@nvidia.com> * wording Signed-off-by: Ao Tang <aot@nvidia.com> --------- Signed-off-by: Ao Tang <aot@nvidia.com> Co-authored-by: Dong Hyuk Chang <9426164+thomasdhc@users.noreply.github.com>

* fix: update motion vector export flag handling in PyAV Signed-off-by: Ao Tang <aot@nvidia.com> * ruff check Signed-off-by: Ao Tang <aot@nvidia.com> * format Signed-off-by: Ao Tang <aot@nvidia.com> * feat: implement _resolve_export_mvs_flag function for PyAV compatibility Added a new function to handle the EXPORT_MVS bitflag for different PyAV versions, ensuring compatibility and preventing silent failures in motion vector retrieval. Updated the motion vector decoding logic to utilize this new function. Added unit tests to verify the correct behavior of the flag resolution. Signed-off-by: Ao Tang <aot@nvidia.com> --------- Signed-off-by: Ao Tang <aot@nvidia.com>

suiyoubi · 2026-05-19T15:17:55Z

/ok to test 328033c

greptile-apps · 2026-05-19T15:20:39Z

Greptile Summary

This PR cherry-picks three r1.2.0 video encoder changes onto main: it drops libx264, adds libvpx-vp9 as a CPU fallback encoder, introduces a BYO-H.264 install path (install_h264_support.sh), refactors install_ffmpeg.sh to build a minimal shared-library FFmpeg from source, bumps PyAV 13.1.0 → 15.1.0, and adds _resolve_export_mvs_flag to handle the EXPORT_MVS flag rename across PyAV versions.

VP9 fallback: ClipTranscodingStage now accepts libvpx-vp9 with setup-time validation, a perf-advisory log, and CRF/VBR command-building; libopenh264 remains supported but probes for availability at setup time instead of at encode time.
BYO H.264: New install_h264_support.sh recompiles FFmpeg in-container with software h264/hevc/av1 decoders and an optional libopenh264 encoder; decoder_utils raises SoftwareCodecMissingError (with a targeted hint) when ffprobe fails due to missing NVDEC.
PyAV 15 compat: _resolve_export_mvs_flag tries lowercase export_mvs first (PyAV ≥15), falls back to EXPORT_MVS (PyAV ≤13); unit tests pin both branches so a future rename would surface as a test failure.

Confidence Score: 4/5

The core transcoding, metadata, and motion-vector paths are all covered by new unit tests; the three cherry-picked changes are logically coherent and the Docker/PyAV wiring is sound. Two narrow edge cases in error handling could mislead users but won't cause data loss or silent incorrect results.

The libopenh264 availability probe uses check=False but never inspects the return code — a broken ffmpeg binary (missing .so, bad permissions) would produce empty stdout and raise a 'codec not found' error that points the user to reinstall the codec rather than fix the real problem. Similarly, _resolve_export_mvs_flag would surface a bare AttributeError with no PyAV context if a third flag name appears in a future release. Neither path affects the default VP9 or NVENC flows.

clip_extraction_stages.py (_verify_libopenh264_available error masking) and motion_vector_backend.py (_resolve_export_mvs_flag AttributeError fallback) are worth a second look before merge.

Important Files Changed

Filename	Overview
nemo_curator/stages/video/clipping/clip_extraction_stages.py	Adds libvpx-vp9 encoder path, validates hwaccel/encoder combo in setup(), probes for libopenh264 at setup time. Minor issue: check=False in the ffmpeg probe subprocess can mask unrelated ffmpeg failures with a misleading 'libopenh264 not found' error.
nemo_curator/stages/video/filtering/motion_vector_backend.py	Adds _resolve_export_mvs_flag() to handle PyAV 13→15 rename of EXPORT_MVS. Falls back to bare AttributeError if neither name exists, which would be cryptic at runtime.
nemo_curator/utils/decoder_utils.py	Adds SoftwareCodecMissingError with MP4 FOURCC header sniff and NVDEC failure signal detection; raises a targeted error pointing at install_h264_support.sh. Logic is sound.
nemo_curator/stages/video/io/video_reader.py	Adds a specific catch for SoftwareCodecMissingError logged at ERROR level (vs WARNING for generic exceptions). Clean and correctly placed.
docker/common/install_ffmpeg.sh	Refactored from tarball download to git clone with --disable-everything + explicit allowlist; switched to shared libs for PyAV. Cleanup at end of script is present.
docker/common/install_h264_support.sh	New opt-in script to recompile FFmpeg with software h264/hevc/av1 decoders and optional libopenh264 encoder. Mirrors install_ffmpeg.sh configure flags exactly, includes skip for existing nv-codec-headers, and cleans up.
docker/Dockerfile	Adds PKG_CONFIG_PATH for PyAV to discover system FFmpeg; forces --no-binary-package av so PyAV source-builds against the container's libav*.
pyproject.toml	Bumps PyAV from 13.1.0 to 15.1.0; adds comment explaining why --no-binary-package lives in the Dockerfile rather than here.

_{Reviews (1): Last reviewed commit: "fix: update motion vector export flag ha..." | Re-trigger Greptile}

greptile-apps · 2026-05-19T15:20:43Z

+        """Probe the local FFmpeg build for libopenh264 support."""
+        ffmpeg_bin = shutil.which("ffmpeg")
+        if ffmpeg_bin is None:
+            error_msg = (
+                "Could not find `ffmpeg` on PATH while verifying libopenh264 support. "
+                f"Install FFmpeg and ensure it is on PATH. See {_BYO_H264_DOCS_URL}"
+            )
+            raise RuntimeError(error_msg)
+        try:
+            result = subprocess.run(  # noqa: S603
+                [ffmpeg_bin, "-hide_banner", "-encoders"],
+                capture_output=True,
+                text=True,
+                check=False,
+                timeout=10,
+            )
+        except subprocess.TimeoutExpired as e:
+            error_msg = f"`ffmpeg -encoders` timed out while verifying libopenh264 support. See {_BYO_H264_DOCS_URL}"
+            raise RuntimeError(error_msg) from e
+        if "libopenh264" not in result.stdout:
+            error_msg = (
+                "encoder='libopenh264' was requested but the local FFmpeg build "
+                "does not include it. Curator does not ship libopenh264 due to "
+                "its patent-license redistribution model. To enable it, install "
+                f"a libopenh264-enabled FFmpeg yourself — see {_BYO_H264_DOCS_URL}"
+            )
+            raise RuntimeError(error_msg)


Broken ffmpeg masks "libopenh264 not found" for unrelated failures

subprocess.run(..., check=False) means any non-zero exit from ffmpeg — e.g., a missing shared library at container boot, a corrupted binary, or a permission error — leaves result.stdout empty or truncated. The subsequent if "libopenh264" not in result.stdout check then unconditionally raises a RuntimeError blaming a missing libopenh264 codec, when the real failure is something entirely different. A user following the linked docs to reinstall the codec would not fix the problem.

greptile-apps · 2026-05-19T15:20:44Z

 _MIN_SIDE_RESOLUTION = 256


+def _resolve_export_mvs_flag() -> int:
+    """Return the EXPORT_MVS bitflag, accepting either the PyAV >=15 lowercase
+    name (``export_mvs``) or the PyAV <=13 uppercase name (``EXPORT_MVS``).
+
+    The enum member was renamed between PyAV 13 and 15. Tests for both branches
+    pin this contract so a future PyAV bump that renames it again surfaces as
+    a failed unit test rather than silently zero motion vectors at runtime.
+    """
+    flags2 = av.codec.context.Flags2
+    flag = getattr(flags2, "export_mvs", None)


Bare AttributeError if neither flag name exists

If a future PyAV release renames the flag a third time, getattr(flags2, "export_mvs", None) returns None and flags2.EXPORT_MVS raises AttributeError: type object 'Flags2' has no attribute 'EXPORT_MVS' — no stack context, no mention of PyAV, no version hint. The docstring says "surfaces as a failed unit test", but a runtime hit (e.g., in a worker that isn't under test) would produce a completely opaque crash. A try/except AttributeError with a message pointing to the PyAV version would make the failure self-diagnosable.

jgerh

Completed tech pubs review and provided a few copyedits/suggested text revisions for clarity.

jgerh · 2026-05-19T18:15:28Z

 |----------|-------------|-----|
 | **Local Development** | Minimum specs listed above | Continue below |
 | **Production Clusters** | Detailed hardware, network, storage specs | [Deployment Requirements](deployment/requirements.md) |
 | **Multi-node Setup** | Advanced infrastructure planning | [Deployment Options](deployment/index.md) |


Deployment Options links to prerequisites, not multi-node setup or advanced infrastructure planning. However, both topics are documented — multi-node setup lives under docs/admin/deployment/slurm/, and advanced infrastructure planning lives under docs/reference/infrastructure/. Neither is surfaced from the main /admin/deployment landing page.

jgerh · 2026-05-19T19:07:38Z

 Choose one of the following installation methods based on your needs:

 :::{tip}
 **Docker is the recommended installation method** for video and audio workflows. The NeMo Curator container includes FFmpeg (with NVENC support) pre-configured, avoiding manual dependency setup. Refer to the [Container Installation](#container-installation) tab below.


Suggested change

**Docker is the recommended installation method** for video and audio workflows. The NeMo Curator container includes FFmpeg (with NVENC support) pre-configured, eliminating the need for manual dependency setup. Refer to the [Container Installation](#container-installation) tab below.

jgerh · 2026-05-19T19:09:24Z

+- **H.264 inputs in CPU-only pipeline stages.** `VideoReader` and `ClipWriter` invoke `ffprobe` from CPU-only Ray actors that can't see the GPU; they need a software `h264`/`hevc`/`av1` decoder to extract metadata. Without it you'll get a `SoftwareCodecMissingError` pointing back here.
+- **H.264 software encoding** (for example, on GPUs without an NVENC encoder block such as A100 or H100, when VP9 isn't acceptable).
+
+#### Option 1: Run the bundled installer inside the container (Recommended)


Suggested change

#### Option 1: Run the bundled installer inside the container (Recommended)

#### Option 1: Run the Bundled Installer Inside the Container (Recommended)

jgerh · 2026-05-19T19:09:47Z

+
+The build takes ~5–10 minutes, replaces `/usr/local/bin/{ffmpeg,ffprobe}` in place, and pins to the same FFmpeg tag as the image build. Script source: [docker/common/install_h264_support.sh](https://github.com/NVIDIA-NeMo/Curator/blob/main/docker/common/install_h264_support.sh).
+
+License notice: the default mode adds only FFmpeg-internal decoders (LGPL). With `--with-libopenh264` the binary additionally links Cisco's OpenH264 (BSD-2-Clause + Cisco-distributed binary license — see https://www.openh264.org/BINARY_LICENSE.txt). You are responsible for any license obligations the resulting binaries impose on your distribution.


Suggested change

License notice: the default mode adds only FFmpeg-internal decoders (LGPL). With `--with-libopenh264` the binary additionally links Cisco's OpenH264 (BSD-2-Clause + Cisco-distributed binary license — see https://www.openh264.org/BINARY_LICENSE.txt). You are responsible for any license obligations the resulting binaries impose on your distribution.

License notice: The default mode adds only FFmpeg-internal decoders (LGPL). With `--with-libopenh264` the binary additionally links Cisco's OpenH264 (BSD-2-Clause + Cisco-distributed binary license — see https://www.openh264.org/BINARY_LICENSE.txt). You are responsible for any license obligations the resulting binaries impose on your distribution.

jgerh · 2026-05-19T19:17:43Z


 ## Troubleshooting

 - "Encoder not found": Your `ffmpeg` build may lack the encoder; verify with `ffmpeg -encoders`.


Suggested change

- `Encoder not found`: Your `ffmpeg` build may lack the encoder; verify with `ffmpeg -encoders`.

jgerh · 2026-05-19T19:17:58Z

 ## Troubleshooting

 - "Encoder not found": Your `ffmpeg` build may lack the encoder; verify with `ffmpeg -encoders`.
 - "No NVENC capable devices found": Install NVIDIA drivers/CUDA and ensure the GPU is visible in `nvidia-smi`.


Suggested change

- `No NVENC capable devices found`: Install NVIDIA drivers/CUDA and ensure the GPU is visible in `nvidia-smi`.

jgerh · 2026-05-19T19:27:11Z

   You can reuse the same `<MODEL_DIR>` across runs.
   :::

 2. No additional setup is required. The model will be downloaded automatically when first used.


Suggested change

2. Verify that `MODEL_DIR` is writable. No additional setup is required.

jgerh · 2026-05-19T19:29:09Z

@@ -1 +1 @@
 # Getting Started with Video Curation


Suggested change

# Get Started with Video Curation

jgerh · 2026-05-19T19:30:46Z

  --fixed-stride-split-duration 10.0 \
  --embedding-algorithm cosmos-embed1-224p
 ```
 This example extends from the above example and adds an additional embedding stages using `cosmos-embed1-224p` model. Use `--model-dir "$MODEL_DIR"` if the model is predownloaded.


Suggested change

This example extends the example above and adds an embedding stage using the `cosmos-embed1-224p` model. Use `--model-dir "$MODEL_DIR"` if the model is already downloaded.

jgerh · 2026-05-19T19:31:31Z

  --verbose
 ```
-This example demonstrates a more advanced workflow than the minimal example by using scene-aware splitting with the TransNetV2 algorithm (which detects scene boundaries instead of fixed intervals), applies the Cosmos-Embed1 embedding model to each clip, transcodes the output using the `libopenh264` encoder, and enables verbose logging for more detailed output.
+This example demonstrates a more advanced workflow than the minimal example by using scene-aware splitting with the TransNetV2 algorithm (which detects scene boundaries instead of fixed intervals), applies the Cosmos-Embed1 embedding model to each clip, transcodes the output using the `h264_nvenc` encoder, and enables verbose logging for more detailed output. On GPUs without NVENC (such as A100/H100), pass `--transcode-encoder libvpx-vp9` instead — VP9 is a royalty-free CPU encoder that produces clips in the same `.mp4` container.


Suggested change

This example demonstrates a more advanced workflow than the minimal example by using scene-aware splitting with the TransNetV2 algorithm (which detects scene boundaries instead of fixed intervals), applies the Cosmos-Embed1 embedding model to each clip, transcodes the output using the `h264_nvenc` encoder, and enables verbose logging for more detailed output. On GPUs without NVENC (such as A100/H100), pass `--transcode-encoder libvpx-vp9` instead — VP9 is a royalty-free CPU encoder that produces clips in the same `.mp4` container.

This example demonstrates a more advanced workflow than the minimal example by using scene-aware splitting with the TransNetV2 algorithm (which detects scene boundaries instead of fixed intervals), applying the Cosmos-Embed1 embedding model to each clip, transcoding the output using the `h264_nvenc` encoder, and enabling verbose logging for more detailed output. On GPUs without NVENC (such as A100/H100), pass `--transcode-encoder libvpx-vp9` instead — VP9 is a royalty-free CPU encoder that produces clips in the same `.mp4` container.

suiyoubi and others added 3 commits May 19, 2026 08:05

suiyoubi requested review from a team and abhinavg4 as code owners May 19, 2026 15:15

copy-pr-bot Bot temporarily deployed to public May 19, 2026 15:15 Inactive

copy-pr-bot Bot temporarily deployed to test May 19, 2026 15:15 Inactive

copy-pr-bot Bot temporarily deployed to nemo-ci May 19, 2026 15:15 Inactive

copy-pr-bot Bot had a problem deploying to nemo-ci May 19, 2026 15:15 Failure

copy-pr-bot Bot temporarily deployed to nemo-ci May 19, 2026 15:15 Inactive

copy-pr-bot Bot temporarily deployed to public May 19, 2026 15:19 Inactive

copy-pr-bot Bot temporarily deployed to public May 19, 2026 15:20 Inactive

greptile-apps Bot reviewed May 19, 2026

View reviewed changes

copy-pr-bot Bot temporarily deployed to public May 19, 2026 15:26 Inactive

copy-pr-bot Bot temporarily deployed to nemo-ci May 19, 2026 15:27 Inactive

copy-pr-bot Bot had a problem deploying to nemo-ci May 19, 2026 15:27 Failure

copy-pr-bot Bot temporarily deployed to nemo-ci May 19, 2026 15:27 Inactive

copy-pr-bot Bot had a problem deploying to nemo-ci May 19, 2026 17:35 Failure

jgerh reviewed May 19, 2026

View reviewed changes

suiyoubi mentioned this pull request May 20, 2026

remove h264 and use VP9 fallback #2003

Open


	Docker is the recommended installation method for video and audio workflows. The NeMo Curator container includes FFmpeg (with NVENC support) pre-configured, eliminating the need for manual dependency setup. Refer to the [Container Installation](#container-installation) tab below.

	#### Option 1: Run the bundled installer inside the container (Recommended)
	#### Option 1: Run the Bundled Installer Inside the Container (Recommended)


		The build takes ~5–10 minutes, replaces `/usr/local/bin/{ffmpeg,ffprobe}` in place, and pins to the same FFmpeg tag as the image build. Script source: [docker/common/install_h264_support.sh](https://github.com/NVIDIA-NeMo/Curator/blob/main/docker/common/install_h264_support.sh).

		License notice: the default mode adds only FFmpeg-internal decoders (LGPL). With `--with-libopenh264` the binary additionally links Cisco's OpenH264 (BSD-2-Clause + Cisco-distributed binary license — see https://www.openh264.org/BINARY_LICENSE.txt). You are responsible for any license obligations the resulting binaries impose on your distribution.


		## Troubleshooting

		- "Encoder not found": Your `ffmpeg` build may lack the encoder; verify with `ffmpeg -encoders`.


	- `Encoder not found`: Your `ffmpeg` build may lack the encoder; verify with `ffmpeg -encoders`.


	- `No NVENC capable devices found`: Install NVIDIA drivers/CUDA and ensure the GPU is visible in `nvidia-smi`.


	2. Verify that `MODEL_DIR` is writable. No additional setup is required.


	This example extends the example above and adds an embedding stage using the `cosmos-embed1-224p` model. Use `--model-dir "$MODEL_DIR"` if the model is already downloaded.

Conversation

suiyoubi commented May 19, 2026

Summary

Cherry-pick notes

Uh oh!

suiyoubi commented May 19, 2026

Uh oh!

greptile-apps Bot commented May 19, 2026

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Uh oh!

greptile-apps Bot May 19, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot May 19, 2026

Choose a reason for hiding this comment

Uh oh!

jgerh left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants