Improve niah test by alessiodevoto · Pull Request #133 · NVIDIA/kvpress

alessiodevoto · 2025-09-01T12:18:08Z

PR description

This PR contains minimal improvements over the recently merged NIAH test. It allows inserting the needle at different depths in a single run (to avoid loading the model into memory multiple times) and fixes a minor bug in the evaluation pipeline so that model = "auto" is accepted.
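
A minimal sketch of the multi-depth idea, not the code in this PR: the haystack is trimmed once and the needle is spliced in at each requested depth, so one loaded model can be evaluated on every variant. The function name `insert_needle_at_depths` and its parameters are hypothetical, and token IDs are plain integers here rather than the output of a real tokenizer.

```python
from typing import List, Sequence


def insert_needle_at_depths(
    haystack_ids: Sequence[int],
    needle_ids: Sequence[int],
    max_context_length: int,
    depths_percent: Sequence[float],
) -> List[List[int]]:
    """Return one token sequence per depth (hypothetical helper, not kvpress API).

    Each returned sequence contains the needle and is at most
    max_context_length tokens long.
    """
    budget = max_context_length - len(needle_ids)
    if budget < 0:
        raise ValueError("needle is longer than the context length")

    # Trim the haystack once so every depth shares the same filler tokens.
    trimmed = list(haystack_ids[:budget])

    contexts = []
    for depth in depths_percent:
        if not 0 <= depth <= 100:
            raise ValueError(f"depth must be in [0, 100], got {depth}")
        # Depth is measured in tokens: 0 = start of context, 100 = end.
        position = int(len(trimmed) * depth / 100)
        contexts.append(trimmed[:position] + list(needle_ids) + trimmed[position:])
    return contexts


haystack = list(range(10))  # stand-in for tokenized filler text
needle = [100, 101]         # stand-in for the tokenized needle
contexts = insert_needle_at_depths(haystack, needle, 10, [0, 50, 100])
for context in contexts:
    print(context)
```

With the model loaded once, the evaluation loop would iterate over `contexts` instead of rebuilding the dataset and reloading the model for each depth.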

Checklist

Tests are working (make test)
Code is formatted correctly (make style, on errors try fix with make format)
Copyright header is included
All commits are signed-off using git commit -s
(new press) mypress_press.py is in the presses directory
(new press) MyPress is in __init__.py
(new press) README.md is updated with a 1 liner about the new press in the Available presses section
(new press) New press is in the default_presses list in tests/default_presses.py
(new press) A docstring is provided that follows the same structure as the existing ones

Signed-off-by: alessiodevoto <devoto.alessio@gmail.com>

copy-pr-bot · 2025-09-01T12:18:12Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

alessiodevoto · 2025-09-01T12:18:57Z

/ok to test bf80954

JoelSeniorLiang · 2025-09-02T00:59:48Z

Another issue: fixing the dataset with Llama's tokenizer results in longer sequences in NIAH-like tasks for other models, such as the Qwen family, since their tokenizers pack a different number of digits into one token. That usually leads to lower performance for Qwen in the KVPress eval utilities.

alessiodevoto · 2025-09-02T06:55:12Z

Hi @JoelSeniorLiang, true! That only applies to RULER, though, where the lengths are pre-computed. Here I wanted to avoid this problem (as you said, it is important for NIAH), so we tokenize with the model's own tokenizer and measure both length and depth with it!
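
To illustrate the point about digit tokenization, here is a toy sketch that uses no real model tokenizer. The `tokenize` function and its `digits_per_token` parameter are hypothetical stand-ins for two tokenizers that group digits differently; the same passkey sentence yields different token counts, which is why a length pre-computed with one tokenizer does not transfer to another.

```python
import re
from typing import List


def tokenize(text: str, digits_per_token: int) -> List[str]:
    """Toy tokenizer (an assumption for illustration, not a real vocabulary).

    Words and punctuation marks are single tokens; runs of digits are
    split into chunks of at most digits_per_token characters.
    """
    tokens: List[str] = []
    for piece in re.findall(r"\d+|[A-Za-z]+|\S", text):
        if piece.isdigit():
            tokens.extend(
                piece[i:i + digits_per_token]
                for i in range(0, len(piece), digits_per_token)
            )
        else:
            tokens.append(piece)
    return tokens


text = "The passkey is 7294518306."
grouped = tokenize(text, digits_per_token=3)  # digits grouped three per token
single = tokenize(text, digits_per_token=1)   # one token per digit

print(len(grouped), grouped)
print(len(single), single)
```

In this toy setting the per-digit tokenizer produces 14 tokens against 8 for the grouped one, so a context sized with the grouped tokenizer would overshoot the intended length for the other model. Counting length and depth with the tokenizer of the model under test avoids that mismatch.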

alessiodevoto · 2025-09-02T07:07:45Z

/ok to test b51f0fd

alessiodevoto · 2025-09-04T13:03:48Z

/ok to test a99fcf2

alessiodevoto added 9 commits August 25, 2025 19:00

improve ea

8737b56

Signed-off-by: alessiodevoto <devoto.alessio@gmail.com>

update niah for multi depth

c41dfbb

Signed-off-by: alessiodevoto <devoto.alessio@gmail.com>

support Gemma3

1e207e5

Signed-off-by: alessiodevoto <devoto.alessio@gmail.com>

Merge branch 'aledev/improved_ea' into aledev/needle2

a26634c

custom niah

fc7ea75

Signed-off-by: alessiodevoto <devoto.alessio@gmail.com>

update passkey

ac6e141

Signed-off-by: alessiodevoto <devoto.alessio@gmail.com>

clean

e33a3a9

Signed-off-by: alessiodevoto <devoto.alessio@gmail.com>

polish

8e1b064

Signed-off-by: alessiodevoto <devoto.alessio@gmail.com>

Merge branch 'main' into aledev/passkey

bf80954

alessiodevoto added 2 commits September 2, 2025 07:01

more details in readme

e4b969b

Signed-off-by: alessiodevoto <devoto.alessio@gmail.com>

readme

b51f0fd

Signed-off-by: alessiodevoto <devoto.alessio@gmail.com>

alessiodevoto added 2 commits September 4, 2025 12:52

merge main into aledev/passkey

681d107

Signed-off-by: alessiodevoto <devoto.alessio@gmail.com>

Merge branch 'main' into aledev/passkey

a99fcf2

alessiodevoto requested a review from Jack-Yu-815 September 4, 2025 13:05

Jack-Yu-815 approved these changes Sep 4, 2025

alessiodevoto merged commit 70ccaf1 into main Sep 5, 2025
3 checks passed

alessiodevoto deleted the aledev/passkey branch September 5, 2025 16:05

alessiodevoto merged 13 commits into main from aledev/passkey
