Fix/clean error handling for runners #259

Draft
JohnPraveenYL wants to merge 5 commits into develop from fix/clean-error-handling-for-runners

JohnPraveenYL (Contributor) commented Apr 8, 2026

Description

This PR fixes inconsistent and noisy runtime error handling across Agent Kernel runners, with a focus on issue #247.
It standardizes user-facing error messages, prevents traceback-heavy CLI output in normal usage, and stabilizes example environments (especially LangGraph) so failures are reproducible and readable.

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Refactoring (no functional changes)
  • Performance improvement
  • Test update
  • CI/CD update
  • Other (please describe):

Related Issues

Fixes #247
Relates to #247

Changes Made

  • Added centralized user-facing error normalization for provider/runtime failures (including 503/429/high-demand style failures).
  • Updated runner error handling consistency across ADK, CrewAI, LangGraph, and OpenAI to return clean AgentReplyText errors instead of bubbling raw framework exceptions.
  • Improved ADK runner behavior to avoid noisy thread-level exception leakage and avoid None-text reply issues.
  • Improved LangGraph content extraction and exception handling paths for robust empty/non-text responses.
  • Updated CLI logging behavior to keep default output user-friendly (single-line errors) and keep full traces at debug level.
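
The centralized normalization described above could look roughly like the following. This is a minimal sketch, not the code in this PR: the function name `normalize_error`, the pattern table, and the message wording are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical pattern table: raw provider error text -> short user-facing message.
# The patterns and wording are assumptions, not taken from error_util.py.
_PATTERNS = [
    (re.compile(r"\b429\b|rate limit|quota", re.IGNORECASE),
     "The model provider is rate limiting requests. Please retry shortly."),
    (re.compile(r"\b503\b|unavailable|overloaded|high demand", re.IGNORECASE),
     "The model provider is temporarily unavailable. Please retry shortly."),
]


def normalize_error(exc: Exception) -> str:
    """Return a single-line, user-facing message for a raw exception."""
    raw = str(exc)
    for pattern, message in _PATTERNS:
        if pattern.search(raw):
            return message
    # Fall back to the first line of the raw message, or the exception type
    # name when the message is empty, so the output stays on one line.
    stripped = raw.strip()
    first_line = stripped.splitlines()[0] if stripped else type(exc).__name__
    return f"Request failed: {first_line}"
```

Each runner would then wrap its framework call in a try/except and return the normalized string as the reply text instead of letting the raw exception propagate.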

Testing

  • Unit tests pass locally
  • Integration tests pass locally
  • Manual testing completed
  • New tests added for changes

Manual checks performed:

  • Built package locally (build.sh).
  • Ran demo flows for ADK, OpenAI, CrewAI, and LangGraph.
  • Verified simulated failure paths produce clean user-visible errors and CLI remains interactive.

Checklist

  • My code follows the project's style guidelines
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published

Screenshots (if applicable)

N/A

Additional Notes

JohnPraveenYL requested a review from amithad April 8, 2026 15:07
Comment thread ak-py/src/agentkernel/cli/cli.py Outdated
Comment thread ak-py/src/agentkernel/cli/cli.py Outdated
@@ -101,7 +117,8 @@ async def run(self):
        raise
    except Exception as e:
        self._print(f"Error: {e}")
Member

We are printing the log here anyway, right?

Contributor Author

Yes, but the user sees only the short error message, while the debug log keeps the full traceback for troubleshooting.
So the log line is useful, but not required for normal output.
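
The split described in this reply can be sketched as follows. This is an illustrative sketch under stated assumptions: the helper `report_error` and the logger name `cli_sketch` are hypothetical, and the actual code in cli.py may be structured differently.

```python
import logging

# Hypothetical logger name; the real CLI module may use a different one.
logger = logging.getLogger("cli_sketch")


def report_error(exc: Exception, print_fn=print) -> None:
    """Show a short message to the user and keep the traceback for debugging."""
    # User-facing output: the short message only.
    print_fn(f"Error: {exc}")
    # The traceback is attached to a DEBUG record, so it is emitted only
    # when the logger is configured at debug level.
    logger.debug("Unhandled error in CLI loop", exc_info=exc)
```

With the default logging level (WARNING), only the `Error: ...` line is shown. Setting the logger to DEBUG surfaces the traceback as well.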

Comment thread ak-py/src/agentkernel/core/util/error_util.py Outdated
Comment thread ak-py/src/agentkernel/core/util/error_util.py
break
response_text = ""

if hasattr(runner, "run_async"):
Member

Why check "run_async"?

Contributor Author

I kept the run_async check because the code broke in some setups without it. Some runner versions support async and some do not, so this check lets the code work safely in both cases.
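
A capability check of this kind might be sketched as below. The classes `AsyncRunner` and `SyncRunner`, the method `run`, and the helper `call_runner` are hypothetical stand-ins, not the real framework runner API; the sketch only illustrates the `hasattr` fallback pattern.

```python
import asyncio


class AsyncRunner:
    """Hypothetical runner that exposes an async entry point."""

    async def run_async(self, prompt: str) -> str:
        return f"async:{prompt}"


class SyncRunner:
    """Hypothetical runner that only has a blocking entry point."""

    def run(self, prompt: str) -> str:
        return f"sync:{prompt}"


async def call_runner(runner, prompt: str) -> str:
    # Prefer the async entry point when the runner provides one.
    if hasattr(runner, "run_async"):
        return await runner.run_async(prompt)
    # Otherwise run the blocking call in a worker thread so the event loop
    # stays responsive.
    return await asyncio.to_thread(runner.run, prompt)
```

An alternative design is to pin the minimum runner version and call `run_async` unconditionally, which removes the branch at the cost of dropping support for older runners.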

Comment thread ak-py/src/agentkernel/framework/langgraph/langgraph.py Outdated
JohnPraveenYL requested a review from amithad April 14, 2026 19:54
Development

Successfully merging this pull request may close these issues.

[BUG] LLM errors produce garbled CLI output — missing/inconsistent error handling across runners

2 participants