deeplearning-wisc/agentuq

Agent UQ on $\tau^{2}$-Bench Harness

Official codebase for the paper "Uncertainty Quantification in LLM Agents: Foundations, Emerging Challenges, and Opportunities" (ACL 2026), a position paper on agent uncertainty quantification (UQ).

By Changdae Oh¹, Seongheon Park¹, To Eun Kim², Jiatong Li¹, Wendi Li¹, Samuel Yeh¹,
Sean Du³, Hamed Hassani⁴, Paul Bogdan⁵, Dawn Song⁶, and Sharon Li¹.

¹University of Wisconsin–Madison, ²Carnegie Mellon University, ³Nanyang Technological University,
⁴University of Pennsylvania, ⁵University of Southern California, ⁶University of California, Berkeley

Paper | Website | Dataset

News

The code is being cleaned up and will be released soon.

Citation

@article{oh2026uncertainty,
    title={Uncertainty Quantification in LLM Agents: Foundations, Emerging Challenges, and Opportunities},
    author={Oh, Changdae and Park, Seongheon and Kim, To Eun and Li, Jiatong and Li, Wendi and Yeh, Samuel and Du, Xuefeng and Hassani, Hamed and Bogdan, Paul and Song, Dawn and Li, Sharon},
    journal={arXiv preprint arXiv:2602.05073},
    year={2026}
}

License

This work is released under the MIT License.

Acknowledgement

This project builds on $\tau^2$-bench by Sierra Research. The original benchmark framework, domains, evaluation system, and task definitions are their work.
