TensorLens reformulates entire transformer models as single, input-dependent linear operators expressed through high-order attention tensors. Unlike prior methods that analyze individual attention heads or layers, our tensor jointly encodes attention, FFNs, activations, normalizations, and residual connections—providing a theoretically grounded and complete linear representation of the model's computation.
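As a toy illustration of the "input-dependent linear operator" idea (this is not TensorLens's actual construction, which handles attention, normalization, and residual streams), consider a two-layer ReLU MLP: once an input is fixed, the ReLU gates are constants, so the whole network collapses into a single matrix that exactly reproduces its output.

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((5, 4))   # first layer weights
W2 = rng.standard_normal((3, 5))   # second layer weights
x = rng.standard_normal(4)         # a fixed input

# Forward pass: y = W2 @ relu(W1 @ x)
h = W1 @ x
gates = (h > 0).astype(float)      # input-dependent ReLU on/off pattern
y = W2 @ (gates * h)

# For this particular x, the network IS the linear map M(x):
M = W2 @ (np.diag(gates) @ W1)     # a single matrix that depends on x
assert np.allclose(M @ x, y)

# Input-to-output relevance: contribution of input j to output i,
# with rows summing exactly to the output values
relevance = M * x                  # shape (3, 4)
```

Because the decomposition is exact, `relevance.sum(axis=1)` recovers `y` with no approximation error, which is the kind of completeness guarantee the tensor formulation aims for.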
This repo contains tools to extract these tensors and compute input-to-output relevance scores for both language and vision transformers.
📄 Paper: arXiv
Requires Python 3.8+ and uv. Install uv:

```shell
curl -LsSf https://astral.sh/uv/install.sh | sh
```

Then clone the repo and set up the environment:

```shell
git clone https://github.com/idoatad/TensorLens.git && cd TensorLens
uv venv && source .venv/bin/activate
uv sync
```
`demo_notebook.ipynb` contains:
- Instructions for extracting the model tensors given a model and an input.
- A demonstration of computing the relevance of the input to the output using the tensors, alongside the baseline methods, for both text and image models.
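For context on the baseline methods, one of the most common attribution baselines is gradient × input (whether the notebook uses exactly this baseline is an assumption; see the notebook for the actual set). A minimal self-contained sketch on a toy tanh network:

```python
import numpy as np

rng = np.random.default_rng(1)
W1 = rng.standard_normal((6, 4))
W2 = rng.standard_normal((2, 6))
x = rng.standard_normal(4)

h = np.tanh(W1 @ x)
y = W2 @ h                               # model output, shape (2,)

# Jacobian dy/dx, written analytically for this toy model
J = W2 @ (np.diag(1.0 - h**2) @ W1)      # shape (2, 4)

# Gradient x input: relevance[i, j] estimates x_j's contribution to y_i
relevance = J * x

# Sanity check: J[0, 0] agrees with a finite-difference derivative
eps = 1e-6
xp = x.copy(); xp[0] += eps
y_p = W2 @ np.tanh(W1 @ xp)
assert abs((y_p[0] - y[0]) / eps - J[0, 0]) < 1e-4
```

Unlike the exact linear decomposition described above, gradient × input is a first-order approximation, which is why such baselines are compared against the tensor-based scores.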

Language decoder-only models:
- facebook/opt
- EleutherAI/pythia
- microsoft/Phi-1.5
- meta-llama/Llama

Language encoder models:
- BERT
- RoBERTa

Image models:
- ViT

To add support for additional models, follow the instructions in `docs/ADDING_MODELS.md`.
