feat(voice): add native macOS `say` TTS engine option by camerondgray · Pull Request #1315 · danielmiessler/Personal_AI_Infrastructure

camerondgray · 2026-05-29T15:42:28Z

🗣️ feat(voice): add native macOS `say` TTS engine option

Summary

Adds an opt-in voice_engine setting to the Pulse voice module so PAI can speak through the native macOS say binary instead of the ElevenLabs cloud API. Setting voice_engine = "say" makes all voice notifications free, fully offline, and key-free. Default behavior is unchanged.

🎯 Motivation and Context

Problem:

ElevenLabs TTS is metered — every notification costs credits, and hitting the quota cap silently breaks voice (a 401 with no fallback).
Privacy: every notification's text is POSTed to api.elevenlabs.io. For a personal infrastructure tool, sending all assistant speech to a third party is undesirable for many users.
It's currently the only engine — there is no offline/local option.

Solution:
Introduce a voice_engine setting ("elevenlabs" | "say"). The "say" engine routes TTS through /usr/bin/say — no API key, no network, nothing leaves the machine. ElevenLabs stays the default, so existing installs are unaffected.

📋 Changes

Single file: Releases/v5.0.0/.claude/PAI/PULSE/VoiceServer/voice.ts

VoiceConfig gains voice_engine?: "elevenlabs" | "say" and default_say_voice?.
VoiceEntry gains sayVoice? so each daidentity.voices.<name> can map to a macOS voice.
Adds speakWithSay() and sayRateFromSpeed(); playAudio() is refactored to share a playAudioFile() helper, so the say path keeps the existing per-voice volume control.
sendNotification() branches on the engine. The ElevenLabs path and the /notify HTTP contract are unchanged, so existing callers need no edits.
startVoice() resolves the engine (config → PAI_VOICE_ENGINE env → default elevenlabs), warns on an unrecognized value, and only requires an API key for the ElevenLabs engine.
voiceHealth() reports the active engine.
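
The engine resolution described above (config → PAI_VOICE_ENGINE env → default elevenlabs, with a warning on an unrecognized value) could look roughly like the sketch below. This is not the code from voice.ts: `resolveVoiceEngine`, `requiresApiKey`, and the injectable `warn` parameter are hypothetical names chosen for illustration, and treating an empty string as unset is an assumption.

```typescript
type VoiceEngine = "elevenlabs" | "say";

const KNOWN_ENGINES: readonly string[] = ["elevenlabs", "say"];

// Hypothetical helper (not from the PR): picks the engine from the config
// value first, then the environment value, then the "elevenlabs" default.
function resolveVoiceEngine(
  configValue: string | undefined,
  envValue: string | undefined,
  warn: (msg: string) => void = console.warn,
): VoiceEngine {
  // Assumption: an empty string counts as "not set".
  const raw = configValue || envValue || "elevenlabs";
  if (KNOWN_ENGINES.includes(raw)) {
    return raw as VoiceEngine;
  }
  // Unrecognized value: warn and fall back to the default engine.
  warn(`Unrecognized voice engine "${raw}"; falling back to "elevenlabs"`);
  return "elevenlabs";
}

// Only the ElevenLabs engine needs an API key.
function requiresApiKey(engine: VoiceEngine): boolean {
  return engine === "elevenlabs";
}
```

A caller would pass the parsed PULSE.toml value and the PAI_VOICE_ENGINE environment value, then skip the API-key check whenever `requiresApiKey()` returns false.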

⚙️ Usage

# PULSE.toml
[voice]
enabled = true
voice_engine = "say"            # "elevenlabs" (default) | "say"
default_say_voice = "Samantha"  # optional; omit to use the system voice

Or via environment: PAI_VOICE_ENGINE=say (and optionally PAI_SAY_VOICE=Samantha).
Optional per-voice mapping in settings.json:

"daidentity": { "voices": { "main": { "voiceId": "…", "sayVoice": "Samantha" } } }

Run say -v '?' to list installed voices; higher-quality voices can be added in System Settings → Accessibility → Spoken Content.
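
As a rough illustration of how the voice settings above might combine, the sketch below picks a macOS voice for one notification. The helper name `resolveSayVoice` is hypothetical, and the precedence between default_say_voice and PAI_SAY_VOICE is an assumption; the PR text only says that both act as fallbacks behind the per-voice sayVoice.

```typescript
// Shape of one daidentity.voices.<name> entry, reduced to the fields used here.
interface VoiceEntry {
  voiceId?: string;
  sayVoice?: string;
}

// Hypothetical helper (not from the PR). Returns undefined when no voice is
// configured, meaning "omit -v and let say use the system voice".
function resolveSayVoice(
  entry: VoiceEntry | undefined,
  defaultSayVoice: string | undefined, // [voice] default_say_voice
  envSayVoice: string | undefined,     // PAI_SAY_VOICE
): string | undefined {
  // Assumed order: per-voice mapping, then config default, then environment.
  return entry?.sayVoice || defaultSayVoice || envSayVoice || undefined;
}
```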

✅ Benefits

Free — no ElevenLabs usage/credits.
Private — notification text never leaves the machine.
Resilient — no API key or quota dependency, and no network round-trip (lower latency).
Non-breaking — default stays ElevenLabs; existing configs and all /notify callers are unaffected.

🧪 How Has This Been Tested?

voice_engine="say" speaks via /notify → handleVoiceRequest with no API key set (system voice and a named "Samantha" voice both verified audibly)
Default (unset) still resolves to elevenlabs — confirmed via voiceHealth()
PAI_VOICE_ENGINE=say environment selection works
Unrecognized engine value logs a warning and falls back to elevenlabs
Per-voice volume preserved (say → AIFF → afplay -v); pronunciation preprocessing applied on both engines

📊 Types of Changes

New feature (non-breaking change which adds functionality)
Bug fix (non-breaking change which fixes an issue)
Breaking change (fix or feature that would cause existing functionality to change)
Documentation update

✅ Checklist

My code follows the PAI code style
I have tested this change thoroughly
This change is backward compatible
No shell interpolation — say is invoked via spawn() with an argument array, and -- terminates option parsing so message text can never be treated as flags (consistent with the execSync→execFileSync hardening in security: replace execSync with execFileSync in tab-setter.ts #1046)
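
To make the last checklist item concrete, here is a minimal sketch of building the argument array for say. `buildSayArgs` and `SayOptions` are hypothetical names, not taken from voice.ts; the flags used (-v for voice, -r for rate, -o for output file) are standard say options.

```typescript
interface SayOptions {
  voice?: string;      // macOS voice name, e.g. "Samantha"
  rateWpm?: number;    // speech rate in words per minute
  outputFile?: string; // write audio to this file instead of speaking aloud
}

// Hypothetical helper (not from the PR): returns the argv for /usr/bin/say.
// The message is always the final element, placed after "--", so text that
// starts with "-" is treated as speech rather than parsed as a flag.
function buildSayArgs(message: string, opts: SayOptions = {}): string[] {
  const args: string[] = [];
  if (opts.voice) args.push("-v", opts.voice);
  if (opts.rateWpm !== undefined) args.push("-r", String(opts.rateWpm));
  if (opts.outputFile) args.push("-o", opts.outputFile);
  args.push("--", message);
  return args;
}
```

The array would then be handed to `spawn("/usr/bin/say", args)` from node:child_process, which runs the binary without a shell, so quotes, `$(...)`, and similar characters in the message are never interpreted. When an output file is given, the resulting audio can be played with afplay -v to apply the per-voice volume.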

🖥️ Platform

macOS only (say is a macOS binary). On non-macOS hosts the engine should be left at the elevenlabs default; selecting say there fails gracefully through the existing voice-error path.

… no API key)

Adds an opt-in `voice_engine` config ("elevenlabs" | "say") to the Pulse voice module. Setting it to "say" routes all TTS through the native macOS `say` binary instead of the ElevenLabs cloud API.

Why:
- Cost: ElevenLabs usage is metered; `say` is free.
- Privacy: notification text currently POSTs to api.elevenlabs.io. With the `say` engine nothing leaves the machine; it is fully offline.
- Resilience: no API key or quota dependency.

Behavior:
- Default is unchanged ("elevenlabs"), so the change is fully non-breaking.
- Selectable via PULSE.toml [voice] voice_engine, or the PAI_VOICE_ENGINE env var.
- Per-voice macOS voice via daidentity.voices.<name>.sayVoice, or a default_say_voice / PAI_SAY_VOICE fallback; omit for the system voice.
- ElevenLabs `speed` maps to a `say` words-per-minute rate (clamped 100-320).
- Pronunciation preprocessing and per-voice volume both still apply.
- The /notify HTTP contract is unchanged, so existing callers need no edits.

Security: `say` is invoked via spawn() with an argument array (no shell), and `--` terminates option parsing so message text can never be treated as flags.

Testing: drove /notify through handleVoiceRequest with voice_engine="say" and confirmed audio plays with no API key; verified default stays "elevenlabs" and an unrecognized engine value warns and falls back.

macOS only.
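
The speed-to-rate mapping mentioned in the commit message can be sketched as below. Only the 100-320 clamp comes from the source; the 175 words-per-minute baseline, the linear scaling, and the name `sketchSayRate` are assumptions made for illustration, and the actual sayRateFromSpeed() in voice.ts may differ.

```typescript
const MIN_WPM = 100;  // lower clamp stated in the commit message
const MAX_WPM = 320;  // upper clamp stated in the commit message
const BASE_WPM = 175; // ASSUMED rate for speed = 1.0; not stated in the PR

// Hypothetical stand-in for sayRateFromSpeed(): scales the baseline by the
// ElevenLabs-style speed multiplier, then clamps to the allowed range.
function sketchSayRate(speed: number | undefined): number {
  // Missing or non-positive speeds fall back to the baseline rate.
  if (speed === undefined || !Number.isFinite(speed) || speed <= 0) {
    return BASE_WPM;
  }
  const wpm = Math.round(BASE_WPM * speed);
  return Math.min(MAX_WPM, Math.max(MIN_WPM, wpm));
}
```

The result would be passed to say as the value of its -r flag.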

camerondgray wants to merge 1 commit into danielmiessler:main from camerondgray:feat/native-macos-say-tts-engine