Official codebase for the position paper "Uncertainty Quantification in LLM Agents: Foundations, Emerging Challenges, and Opportunities" (ACL 2026), on uncertainty quantification (UQ) for LLM agents.
By Changdae Oh<sup>1</sup>, Seongheon Park<sup>1</sup>, To Eun Kim<sup>2</sup>, Jiatong Li<sup>1</sup>, Wendi Li<sup>1</sup>, Samuel Yeh<sup>1</sup>,
Sean Du<sup>3</sup>, Hamed Hassani<sup>4</sup>, Paul Bogdan<sup>5</sup>, Dawn Song<sup>6</sup>, and Sharon Li<sup>1</sup>.
<sup>1</sup>University of Wisconsin–Madison, <sup>2</sup>Carnegie Mellon University, <sup>3</sup>Nanyang Technological University,
<sup>4</sup>University of Pennsylvania, <sup>5</sup>University of Southern California, <sup>6</sup>University of California, Berkeley
- [Apr 10, 2026] The $\tau^2$-bench UQ artifacts (the actual trajectories and uncertainty measurements) used in our paper are now available on Hugging Face Datasets 🤗
- [Apr 5, 2026] The AgentUQ position paper was accepted to ACL 2026 (main conference) 🎉
- [Feb 26, 2026] The AgentUQ position paper was accepted to the ICLR 2026 Workshop "Agentic AI in the Wild: From Hallucinations to Reliable Autonomy" 🎉
```bibtex
@article{oh2026uncertainty,
  title={Uncertainty Quantification in LLM Agents: Foundations, Emerging Challenges, and Opportunities},
  author={Oh, Changdae and Park, Seongheon and Kim, To Eun and Li, Jiatong and Li, Wendi and Yeh, Samuel and Du, Xuefeng and Hassani, Hamed and Bogdan, Paul and Song, Dawn and Li, Sharon},
  journal={arXiv preprint arXiv:2602.05073},
  year={2026}
}
```
This work is released under the MIT License.
This project builds on $\tau^2$-bench by Sierra Research. The original benchmark framework, domains, evaluation system, and task definitions are their work.