
GSoC Behavioral evals, Quality, and the OSS Community #23331

@gundermanc

Description


This item tracks a Google Summer of Code project in which a participant will work closely with the Agent Intelligence team at Google on Gemini CLI to improve the OSS community's ability to contribute to quality-impacting areas of the Gemini CLI, including the prompt, tools, behavioral evals, and various pre-release guardrails.

The participant will do some or all of the following, depending on time:

  • Onboard to the quality area and learn how to iterate and validate changes.
  • Write and stabilize behavioral eval tests, with corresponding prompt and tool changes, to refine the product's prompt-driven features.
  • Write skills or subagents that help first- and third-party contributors work more effectively with evals, prompt changes, subagents, and skills.
  • Identify and fix process, documentation, logging, and tooling gaps that impede one's ability to diagnose and fix bugs in the agent, subagents, skills, prompts, and tools.
  • Implement tooling for external contributors to inventory, assess gaps in, generate, validate, and improve behavioral eval tests.
  • Implement tooling for generating evals or unit tests from chat logs.

Finally, and most importantly: dogfood and improve the community contribution scenario for the agent intelligence area, and help us build a more vibrant quality ecosystem with the community.

Metadata


Assignees

No one assigned

    Labels

    • area/agent: Issues related to Core Agent, Tools, Memory, Sub-Agents, Hooks, Agent Quality
    • kind/enhancement
    • priority/p2: Important but can be addressed in a future release.
    • status/bot-triaged
    • 🔒 maintainer only: ⛔ Do not contribute. Internal roadmap item.

    Type

    No fields configured for Epic.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
