AiDave71/kilocode

 
 


Kilo Code — Multi-Provider Speech Synthesis

This fork adds a full Speech tab to Kilo Code — hear AI responses spoken aloud with 6 text-to-speech providers, all with free tiers. Works out of the box with zero setup using the built-in Browser provider.

PR #8839 Feature Branch


Speech Feature Highlights

Works immediately — no API keys, no accounts, no cost. The Browser provider uses your system's built-in speech engine. Want premium neural voices? Add an API key for any of the 5 cloud providers — all have generous free tiers.

Provider Selector

Choose from 6 providers, organized by setup requirement:

Speech provider dropdown showing 6 providers in two groups: No Setup Required and Free Tier Available

| Provider | Free Tier | Voices | Setup |
| --- | --- | --- | --- |
| Browser (Web Speech API) | Unlimited, offline | System voices | None |
| Azure Cognitive Services | 500K chars/month | 125+ neural voices | API key |
| Google Cloud TTS | 4M chars/month | 21 Neural2 + Studio | API key |
| OpenAI TTS | $5 free credit | 10 voices | API key |
| ElevenLabs | 10K chars/month | 10 expressive voices | API key |
| Amazon Polly | 5M chars/month (12 mo) | 20 voices, 7 locales | Access key |

Full Settings View — Azure Configured

The complete Speech tab with Azure connected, auto-speak enabled, and voice browser showing 99 neural voices:

Full Speech settings panel with Azure configured, showing API key, region, toggles, and voice browser

Browser Provider — Free & Offline

Zero-setup speech using your operating system's built-in voices. Works offline, no account needed:

Browser provider selected showing free offline speech with system voices and interaction mode dropdown

Voice Browser & Fine-Tuning

Browse voices by locale, preview them, set favorites, and fine-tune pitch, rate, emphasis, and custom pronunciations:

Voice browser with 99 voices listed and fine-tuning controls for pitch, rate, emphasis, and pronunciations
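
As a rough illustration of how fine-tuning settings like these could be turned into markup for an SSML-capable provider, here is a minimal sketch. The `FineTuning` shape, the `buildSsml` helper, and the voice name are hypothetical and not taken from this fork's source. The elements used (`speak`, `voice`, `prosody`, `break`, `sub`) come from the W3C SSML specification, but which of them a given provider accepts varies.

```typescript
// Hypothetical settings shape; field names are assumptions for illustration.
interface FineTuning {
  pitchPercent: number;                   // relative change, e.g. 5 means +5%
  ratePercent: number;                    // 100 means normal speed
  sentencePauseMs: number;                // pause inserted between sentences
  pronunciations: Record<string, string>; // written form -> spoken alias
}

// Escape characters that are special in XML text and attribute values.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Wrap each known word in <sub alias="..."> in a single pass, so markup
// inserted for one word is never rescanned for another.
function applyPronunciations(escaped: string, map: Record<string, string>): string {
  const words = Object.keys(map);
  if (words.length === 0) return escaped;
  const pattern = new RegExp(
    words.map((w) => escapeXml(w).replace(/[.*+?^${}()|[\]\\]/g, "\\$&")).join("|"),
    "g",
  );
  return escaped.replace(pattern, (match) => {
    const word = words.find((w) => escapeXml(w) === match) as string;
    return `<sub alias="${escapeXml(map[word])}">${match}</sub>`;
  });
}

function buildSsml(text: string, voiceName: string, locale: string, tuning: FineTuning): string {
  const pitch = `${tuning.pitchPercent >= 0 ? "+" : ""}${tuning.pitchPercent}%`;
  const body = text
    .split(/(?<=[.!?])\s+/) // naive sentence split on end punctuation
    .map((sentence) => applyPronunciations(escapeXml(sentence), tuning.pronunciations))
    .join(`<break time="${tuning.sentencePauseMs}ms"/>`);
  return (
    `<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="${escapeXml(locale)}">` +
    `<voice name="${escapeXml(voiceName)}">` +
    `<prosody pitch="${pitch}" rate="${tuning.ratePercent}%">${body}</prosody>` +
    `</voice></speak>`
  );
}

// "en-GB-Example" is a placeholder, not a real voice identifier.
const ssml = buildSsml("Use SQL. It's fast!", "en-GB-Example", "en-GB", {
  pitchPercent: 5,
  ratePercent: 90,
  sentencePauseMs: 300,
  pronunciations: { SQL: "sequel" },
});
```

Escaping the text before inserting markup keeps user content from breaking the XML, which matters because AI responses often contain `<`, `&`, and quotes.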

More Screenshots

Azure Voice List with Favorites

Azure voice list with Maisie favorited, showing voice descriptions and play buttons

Voice Fine-Tuning Detail

Detailed voice fine-tuning showing pitch, rate, sentence pause, paragraph break, emphasis, pronunciations, and save as preset


Key Capabilities

  • Auto-Speak — AI responses are spoken automatically when they finish
  • Stop on Typing — Speech interrupts instantly when you start typing
  • Interaction Modes — Assist (important responses only), Conversation (all replies), or Minimal (on request)
  • Sentiment Detection — Voice pitch and rate shift to match emotional tone
  • Multi-Voice Dialogue — Each AI agent speaks in a distinct voice
  • Voice Favorites & Presets — Star voices you like, save full configurations as presets
  • 25-Rule Text Filter — Strips markdown, code blocks, URLs, and formatting before speaking
  • LRU Synthesis Cache — Repeated phrases play instantly from a 32-entry cache
  • SSML Support — Full SSML, styles, emphasis, and pronunciation controls (provider-dependent)
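
The text filter and synthesis cache above could be sketched roughly as follows. This is not the fork's implementation: `toSpeakableText`, `LruCache`, and `synthesizeCached` are hypothetical names, the filter covers only a handful of rules rather than all 25, and a real cache key would presumably also include the provider and voice.

```typescript
// Hypothetical sketch; none of these names come from the fork's source.

// Strip common markdown so the synthesizer only receives speakable text.
function toSpeakableText(markdown: string): string {
  return markdown
    .replace(/```[\s\S]*?```/g, " ")          // fenced code blocks
    .replace(/`[^`]*`/g, " ")                 // inline code
    .replace(/!\[[^\]]*\]\([^)]*\)/g, " ")    // images
    .replace(/\[([^\]]*)\]\([^)]*\)/g, "$1")  // links: keep only the label
    .replace(/https?:\/\/\S+/g, " ")          // bare URLs
    .replace(/^\s{0,3}#{1,6}\s+/gm, "")       // heading markers
    .replace(/^\s*[-*+]\s+/gm, "")            // bullet markers
    .replace(/[*_~]+/g, "")                   // emphasis markers
    .replace(/\s+/g, " ")                     // collapse whitespace
    .trim();
}

// Least-recently-used cache built on Map, which iterates in insertion order.
class LruCache<V> {
  private readonly entries = new Map<string, V>();
  private readonly capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  get(key: string): V | undefined {
    const value = this.entries.get(key);
    if (value === undefined) return undefined;
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.capacity) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value as string;
      this.entries.delete(oldest);
    }
  }

  get size(): number {
    return this.entries.size;
  }
}

// 32 entries, matching the cache size stated above.
const audioCache = new LruCache<Uint8Array>(32);

// Filter first, then look up by the filtered text; synthesize only on a miss.
async function synthesizeCached(
  text: string,
  synthesize: (speakable: string) => Promise<Uint8Array>,
): Promise<Uint8Array> {
  const speakable = toSpeakableText(text);
  const hit = audioCache.get(speakable);
  if (hit) return hit;
  const audio = await synthesize(speakable);
  audioCache.set(speakable, audio);
  return audio;
}
```

Keying the cache on the filtered text rather than the raw response means two replies that differ only in formatting can share one cached clip.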

Architecture

SpeechProvider (interface)
    |
    +-- BrowserProvider      (Web Speech API, offline)
    +-- AzureProvider        (wraps existing tts-azure.ts)
    +-- GoogleProvider       (REST, base64 audio response)
    +-- OpenAIProvider       (REST, Bearer auth)
    +-- ElevenLabsProvider   (REST, xi-api-key header)
    +-- PollyProvider        (REST, AWS auth)
    |
SpeechProviderRegistry      (register / get / list / listByTier)
    |
speech-playback.ts          (provider-agnostic play/stop/cache)
    |
SpeechTab.tsx               (Settings UI, Solid.js)

  • Provider Interface: getVoices(), synthesize(), stop(), testConnection()
  • Capability Gating — UI controls appear/hide based on provider.capabilities
  • CSP Whitelisted — All provider endpoints added to webview connect-src
  • 95 Unit Tests — Registry, browser provider, azure provider, text filter + sentiment
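
To make the layering above more concrete, here is a minimal sketch of a provider interface, a registry, and capability gating. The method and registry operation names follow the list above, but every signature, the `Tier` and `Capabilities` types, and the stub providers are assumptions made for illustration and may differ from the actual code in this fork.

```typescript
// Hypothetical shapes: only the method and operation names come from the README.
type Tier = "no-setup" | "free-tier";

interface Voice {
  id: string;
  name: string;
  locale: string;
}

interface Capabilities {
  ssml: boolean;
  pitch: boolean;
  emphasis: boolean;
}

interface SpeechProvider {
  readonly id: string;
  readonly tier: Tier;
  readonly capabilities: Capabilities;
  getVoices(): Promise<Voice[]>;
  synthesize(text: string, voiceId: string): Promise<Uint8Array>;
  stop(): void;
  testConnection(): Promise<boolean>;
}

class SpeechProviderRegistry {
  private readonly providers = new Map<string, SpeechProvider>();

  register(provider: SpeechProvider): void {
    this.providers.set(provider.id, provider);
  }

  get(id: string): SpeechProvider | undefined {
    return this.providers.get(id);
  }

  list(): SpeechProvider[] {
    return Array.from(this.providers.values());
  }

  listByTier(tier: Tier): SpeechProvider[] {
    return this.list().filter((p) => p.tier === tier);
  }
}

// Capability gating: the UI would render only the controls returned here.
function visibleControls(provider: SpeechProvider): string[] {
  return (Object.keys(provider.capabilities) as (keyof Capabilities)[]).filter(
    (name) => provider.capabilities[name],
  );
}

// Stub provider for illustration; a real provider would call its API.
function makeStubProvider(id: string, tier: Tier, capabilities: Capabilities): SpeechProvider {
  return {
    id,
    tier,
    capabilities,
    getVoices: async () => [{ id: `${id}-default`, name: "Default", locale: "en-US" }],
    synthesize: async (text) => new TextEncoder().encode(text), // placeholder bytes, not audio
    stop: () => {},
    testConnection: async () => true,
  };
}

const registry = new SpeechProviderRegistry();
registry.register(makeStubProvider("browser", "no-setup", { ssml: false, pitch: true, emphasis: false }));
registry.register(makeStubProvider("azure", "free-tier", { ssml: true, pitch: true, emphasis: true }));
```

With this shape, the settings UI can group the dropdown by calling `listByTier` once per tier, and playback code only ever talks to the `SpeechProvider` interface.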

Try It

Quick Install (VSIX)

# Download the VSIX from this fork's releases, then:
code --install-extension kilo-code-7.2.5.vsix

Build From Source

git clone https://github.com/AiDave71/kilocode.git
cd kilocode
bun install
cd packages/kilo-vscode
node esbuild.js        # builds 5 bundles
npx @vscode/vsce package --no-dependencies
code --install-extension kilo-code-7.2.5.vsix

First Run

  1. Open Kilo Code Settings
  2. Click the Speech tab in the sidebar
  3. It defaults to Browser (Web Speech API) — works immediately
  4. Enable Speech, click a play button next to any voice to preview
  5. Optional: select a cloud provider and enter an API key for neural voices

Contributing

This feature is submitted as PR #8839 to the upstream Kilo Code repository. Feedback, testing, and reviews are welcome!

  • Branch: feat/azure-voice-studio
  • Tests: bun test in packages/kilo-vscode — 95 tests across 4 files
  • Lint: 0 errors across 14 speech-related files
  • Build: 5 esbuild bundles, 0 errors

Original Kilo Code README

About Kilo Code

Kilo is the all-in-one agentic engineering platform. Build, ship, and iterate faster with the most popular open source coding agent.

  • Generate code from natural language
  • Check its own work
  • Run terminal commands
  • Automate the browser
  • Inline autocomplete suggestions
  • Latest AI models
  • API keys optional

License

This project is licensed under the MIT License. See License.

Where did Kilo CLI come from?

Kilo CLI is a fork of OpenCode, enhanced to work within the Kilo agentic engineering platform.
