Bug in Paligemma usage docs for v4.50.3 #37181

@EricCousineau-TRI

The following doc version has a minor bug in its usage:
https://github.com/huggingface/transformers/blob/v4.50.3/docs/source/en/model_doc/paligemma.md#single-image-inference
At the time of writing, this is the default version of the docs people come across via
https://huggingface.co/docs/transformers/en/model_doc/paligemma

This has a bug where it slices the decoded string by the tokenized input length:

print(processor.decode(output[0], skip_special_tokens=True)[inputs.input_ids.shape[1]: ])

but it should actually slice by the text input length (the prompt's character count):

print(processor.decode(output[0], skip_special_tokens=True)[len(prompt): ])

Found this out by cross-referencing the HF spaces example:
https://huggingface.co/spaces/big-vision/paligemma-hf/blob/d914d44/app.py#L38
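To make the mismatch concrete, here is a minimal toy sketch of why a token count is the wrong offset into a decoded string. It does not use the real PaliGemma processor: `encode`, `decode`, the `<image>` placeholder, and the image-token count of 16 are all hypothetical stand-ins, and it assumes placeholder tokens are dropped when decoding with `skip_special_tokens=True`. It also shows a third option, slicing the token sequence before decoding, which avoids mixing units altogether.

```python
# Toy stand-ins for a processor; NOT the transformers API.
IMAGE = "<image>"  # hypothetical image placeholder token


def encode(text, num_image_tokens=0):
    """Whitespace 'tokenizer' that prepends image placeholder tokens."""
    return [IMAGE] * num_image_tokens + text.split()


def decode(tokens, skip_special_tokens=True):
    """Join tokens back into text, optionally dropping placeholders."""
    kept = [t for t in tokens if not (skip_special_tokens and t == IMAGE)]
    return " ".join(kept)


prompt = "answer en What is on the flower?"
input_ids = encode(prompt, num_image_tokens=16)

# Assumption: generation returns the input tokens followed by the new tokens.
output = input_ids + encode("a bee")
decoded = decode(output)

# Buggy pattern: a token count used as a character offset.
wrong = decoded[len(input_ids):]

# Pattern from the issue: slice the decoded text by the prompt's length.
by_chars = decoded[len(prompt):].strip()

# Alternative: drop the input tokens first, then decode only the new ones.
by_tokens = decode(output[len(input_ids):])

print(repr(wrong))
print(repr(by_chars))   # 'a bee'
print(repr(by_tokens))  # 'a bee'
```

In this toy the token count (23) and the prompt's character count (32) differ, so the first slice lands in the middle of the prompt text. With a real processor the two numbers can differ much more, since image tokens count toward `input_ids` but not toward the decoded text.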

Not sure if it's worth patching the existing docs, cutting a new minor release, or just closing this out as a note for others.
