Reset DecodingPress state when exiting context manager by cluster2600 · Pull Request #192 · NVIDIA/kvpress

cluster2600 · 2026-03-10T19:49:25Z

Summary

DecodingPress.layer_step_counts and hidden_states_buffer were not being reset between successive uses of the context manager. When the evaluation framework runs DecodingPress across multiple questions (e.g. aime25 with 30 questions), stale state from a previous question's decoding carried over to the next, causing inconsistent compression behaviour depending on input order.
Overrides __call__ on DecodingPress to call self.reset() when the context manager exits. This mirrors the pattern already used in PrefillDecodingPress.__call__, which resets its inner decoding_press in its finally block.
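The reset-on-exit pattern described above can be sketched as follows. This is a minimal, self-contained illustration and not the kvpress implementation: ToyBasePress, ToyDecodingPress, the step method, and the dict-based model are hypothetical stand-ins, and only the attribute names layer_step_counts and hidden_states_buffer and the reset() call are taken from this PR.

```python
from collections import defaultdict
from contextlib import contextmanager


class ToyBasePress:
    """Hypothetical stand-in for a base press; not the kvpress class."""

    @contextmanager
    def __call__(self, model):
        # A real press would register forward hooks here. The toy only
        # marks the (dict-based, hypothetical) model as active.
        model["active"] = True
        try:
            yield
        finally:
            model["active"] = False


class ToyDecodingPress(ToyBasePress):
    """Hypothetical press that accumulates per-layer state while decoding."""

    def __init__(self):
        self.layer_step_counts = defaultdict(int)
        self.hidden_states_buffer = defaultdict(list)

    def reset(self):
        # Drop everything accumulated during the previous decoding run.
        self.layer_step_counts = defaultdict(int)
        self.hidden_states_buffer = defaultdict(list)

    def step(self, layer_idx, hidden_state):
        # Hypothetical per-token update, standing in for a forward hook.
        self.layer_step_counts[layer_idx] += 1
        self.hidden_states_buffer[layer_idx].append(hidden_state)

    @contextmanager
    def __call__(self, model):
        try:
            with super().__call__(model):
                yield
        finally:
            # Runs on normal exit and when the body raises.
            self.reset()


press = ToyDecodingPress()
model = {"active": False}
with press(model):
    press.step(0, "h0")
    press.step(0, "h1")
    steps_inside = press.layer_step_counts[0]
```

Placing reset() in a finally block means the state is also cleared when generation raises, so a failed question does not leak state into the next one.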

Test plan

Existing test_decoding_compression.py tests should continue to pass (they already reuse the same press object across calls)
Verify that test_decoding_press_equivalence still produces identical results for standalone vs combined press
Run an evaluation with DecodingPress on a multi-question benchmark (e.g. aime25) and confirm scores are now deterministic regardless of question order
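The order-independence check in the last item can be illustrated with a toy model of the failure mode. The sketch below is an assumption-laden illustration, not a test from the repository: ToyIntervalPress, its interval and reset_on_exit parameters, and the helper functions are hypothetical, and "compression" is reduced to a counter that fires every few decoding steps.

```python
from contextlib import contextmanager


class ToyIntervalPress:
    """Hypothetical press that 'compresses' every `interval` decoding steps."""

    def __init__(self, interval, reset_on_exit):
        self.interval = interval
        self.reset_on_exit = reset_on_exit
        self.step_count = 0

    def reset(self):
        self.step_count = 0

    def decode_step(self):
        """Advance one token; return True if compression fires on it."""
        self.step_count += 1
        return self.step_count % self.interval == 0

    @contextmanager
    def __call__(self):
        try:
            yield self
        finally:
            if self.reset_on_exit:
                self.reset()


def compression_steps(press, n_tokens):
    """Return the 1-based token positions at which compression fired."""
    with press() as active:
        return [t for t in range(1, n_tokens + 1) if active.decode_step()]


def run_questions(press, lengths):
    """Decode 'questions' of the given lengths in order with one press."""
    return {n: compression_steps(press, n) for n in lengths}


# With reset on exit, each question sees the same schedule in any order.
fixed_a = run_questions(ToyIntervalPress(4, reset_on_exit=True), [6, 9])
fixed_b = run_questions(ToyIntervalPress(4, reset_on_exit=True), [9, 6])

# Without reset, the leftover counter shifts the schedule of later questions.
stale_a = run_questions(ToyIntervalPress(4, reset_on_exit=False), [6, 9])
stale_b = run_questions(ToyIntervalPress(4, reset_on_exit=False), [9, 6])
```

In this toy, fixed_a and fixed_b are equal, while stale_a and stale_b differ because the second question starts from the first question's leftover step count.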

Fixes #191

DecodingPress.layer_step_counts and hidden_states_buffer were not being reset between uses when the press was used as a context manager. This caused stale state to carry over between different questions in the evaluation framework (e.g. aime25 with 30 questions), leading to inconsistent compression behaviour.

Override __call__ to call self.reset() on exit, mirroring the pattern already used by PrefillDecodingPress.

Fixes NVIDIA#191

Signed-off-by: Maxime Grenu <maxime.grenu@gmail.com>

copy-pr-bot · 2026-03-10T19:49:30Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

maxjeblick · 2026-03-11T09:31:28Z

Hi @cluster2600, thanks a lot for opening this PR!
We will look into the issue and decide whether to fix the bug at the press level (in which case this PR probably looks fine) or to choose another modification.

I think it would have been nice to sync with @Yeuvoir beforehand, as he opened the issue and also expressed interest in opening a PR.

cluster2600 · 2026-03-11T10:09:32Z

Hi @maxjeblick, thanks for the feedback. Apologies for not coordinating with @Yeuvoir beforehand; that was an oversight on my part. @Yeuvoir, happy to defer to you if you'd prefer to submit your own PR, or if you'd like to collaborate on this one I'm open to that as well. Let me know what works best for you both and I'll adapt accordingly.

maxjeblick

Thanks a lot, LGTM!

maxjeblick approved these changes Mar 12, 2026


maxjeblick merged commit 04abb96 into NVIDIA:main Mar 12, 2026
2 checks passed

cluster2600 deleted the fix/reset-decoding-press-state-between-questions branch March 12, 2026 15:46
