Fix flaky DataflowRunnerTest by ignoring batch tests in streaming mode#37402
ATHARVA262005 wants to merge 7 commits into apache:master
Conversation
Summary of Changes (Gemini Code Assist)
This pull request addresses flakiness in the
Pull request overview
Updates DataflowRunnerTest to address flakiness by conditionally skipping “batch” GroupIntoBatches override tests when running in streaming mode.
Changes:
- Added `assumeFalse(isStreaming())` guards to two batch/sharded-key GroupIntoBatches override tests to skip them under streaming.
- Documented the intent via inline comments.
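To illustrate how such a guard behaves, here is a minimal, dependency-free sketch. It is not the code from this PR: the `assumeFalse` helper and `SkippedException` below are local stand-ins (assumptions) for JUnit 4's `org.junit.Assume.assumeFalse` and its assumption-violated exception, and `batchOnlyTest` is a hypothetical test body. The point is only that a violated assumption ends the test as skipped rather than failed.

```java
/** Dependency-free sketch of an assumption guard; not the actual DataflowRunnerTest code. */
public class AssumeGuardSketch {

  /** Local stand-in (assumption) for JUnit's assumption-violated exception. */
  static final class SkippedException extends RuntimeException {
    SkippedException(String message) {
      super(message);
    }
  }

  /** Local stand-in for org.junit.Assume.assumeFalse: aborts the test when the condition is true. */
  public static void assumeFalse(String message, boolean condition) {
    if (condition) {
      throw new SkippedException(message);
    }
  }

  /** Hypothetical batch-only test body, guarded so it does not run in streaming mode. */
  public static void batchOnlyTest(boolean streaming) {
    assumeFalse("batch override test does not apply in streaming mode", streaming);
    // Batch-specific assertions would go here; they are never reached when streaming is true.
  }

  /** Minimal runner: a violated assumption is reported as SKIPPED, not as a failure. */
  public static String run(boolean streaming) {
    try {
      batchOnlyTest(streaming);
      return "PASSED";
    } catch (SkippedException e) {
      return "SKIPPED";
    }
  }

  public static void main(String[] args) {
    System.out.println("batch mode: " + run(false));
    System.out.println("streaming mode: " + run(true));
  }
}
```

In a real JUnit 4 test the guard would be a single `assumeFalse(...)` call at the top of the test method, and the test runner would take the place of `run`.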
Assigning reviewers:
R: @shunping added as fallback since no labels match configuration
The PR bot will only process comments in the main thread (not review comments).
Thanks! To test this correctly, could you revert commit 672b888 (the change in #37363)? That will re-enable the tests and also trigger the post-commit workflow. BTW, your PR seems to include some unnecessary formatting changes, which are a bit distracting when reviewing the main change. Could you kindly revert those formatting changes? Thanks!
shunping left a comment:
Thanks for contributing! Please see my inline comments.
This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@beam.apache.org list. Thank you for your contributions.
This pull request has been closed due to lack of activity. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time.
Description
This PR provides the fix for the `GroupIntoBatches` failures identified in #37371 and #37363.
It fixes the `AssertionError` in the `GroupIntoBatches` tests identified in #37371. The root cause was that `DataflowRunner` was skipping batch-specific overrides when Runner V2 (`use_unified_worker`) was enabled, causing the runner to ignore user-defined batch limits.
Changes
- Apply the `GroupIntoBatches` and `ShardedKey` overrides even when Runner V2 is active in Batch mode.
- A test, `testBatchGroupIntoBatchesWithShardedKeyOverrideCountV2`, that specifically validates this behavior with the Runner V2 experiment flags enabled.
Validation
- `:runners:google-cloud-dataflow-java:spotlessApply` for formatting.
Fixes #37371
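The root cause described above can be pictured with a small sketch. This is not the `DataflowRunner` source: the class, method names, and override labels below are hypothetical, chosen only to contrast a condition that drops batch overrides under Runner V2 with one that keeps them whenever the pipeline is in batch mode.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of override selection; names are illustrative, not Beam APIs. */
public class OverrideSelectionSketch {

  /** Assumed behavior before the fix: batch overrides are skipped when Runner V2 is enabled. */
  public static List<String> overridesBefore(boolean streaming, boolean runnerV2) {
    List<String> overrides = new ArrayList<>();
    if (!streaming && !runnerV2) {
      overrides.add("BatchGroupIntoBatches");
      overrides.add("BatchGroupIntoBatchesWithShardedKey");
    }
    return overrides;
  }

  /** Assumed behavior after the fix: batch overrides depend only on batch mode. */
  public static List<String> overridesAfter(boolean streaming, boolean runnerV2) {
    List<String> overrides = new ArrayList<>();
    if (!streaming) {
      overrides.add("BatchGroupIntoBatches");
      overrides.add("BatchGroupIntoBatchesWithShardedKey");
    }
    return overrides;
  }

  public static void main(String[] args) {
    // Batch pipeline with Runner V2 enabled: the case the description calls out.
    System.out.println("before: " + overridesBefore(false, true));
    System.out.println("after:  " + overridesAfter(false, true));
  }
}
```

Under these assumptions, a batch pipeline with Runner V2 enabled gets no batch overrides from the first method and both overrides from the second, which matches the symptom of user-defined batch limits being ignored.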
Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
- addresses #123), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment `fixes #<ISSUE NUMBER>` instead.
- `CHANGES.md` with noteworthy changes.

See the Contributor Guide for more tips on how to make the review process smoother.
To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md
GitHub Actions Tests Status (on master branch)
See CI.md for more information about GitHub Actions CI or the workflows README to see a list of phrases to trigger workflows.