feat: Add a voice input mode #418

@ercbot

The speed at which the AI can work means that human typing speed can become the bottleneck when conveying prompts. People already use third-party transcription apps like SuperWhisper to dictate instructions quickly (see the original vibe coding tweet).

But I think this would be a great feature for codex, especially since there is now first-party support for realtime transcription via the OpenAI Realtime API.

https://platform.openai.com/docs/guides/realtime-transcription#handling-transcriptions

It would work something like:
codex --voice-mode

(colloquially --vibe-mode)
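
As a rough illustration of the proposed flag and its alias, here is a minimal Python sketch using `argparse`. It is not codex's actual argument parsing; the `--push-to-talk-key` option and its `space` default are hypothetical names standing in for the configurable key mentioned in this proposal.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a toy parser for the proposed voice-mode flags (illustrative only)."""
    parser = argparse.ArgumentParser(prog="codex")
    # --vibe-mode is registered as an alias of --voice-mode; both set voice_mode.
    parser.add_argument(
        "--voice-mode",
        "--vibe-mode",
        dest="voice_mode",
        action="store_true",
        help="start an interactive session with push-to-talk voice input",
    )
    # Hypothetical option: the issue only says the key should be configurable.
    parser.add_argument(
        "--push-to-talk-key",
        default="space",
        help="key held down while speaking (assumed default: space)",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.voice_mode, args.push_to_talk_key)
```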

It would start an interactive session the same as it does now, but with a persistent connection to the Realtime API, and you could use a push-to-talk key (Space by default? but configurable) to provide input or interrupt the AI.
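
To make the push-to-talk behaviour concrete, here is a minimal Python sketch of the session logic as a small state machine. Everything in it is an assumption for illustration: the class and method names are hypothetical, and `transcribe` is an injected stand-in for whatever the persistent Realtime API connection would return, so no real API calls are made.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Callable, List, Optional


class State(Enum):
    IDLE = auto()        # waiting for the user
    RECORDING = auto()   # push-to-talk key is held down
    RESPONDING = auto()  # the AI is working on the last prompt


@dataclass
class PushToTalkSession:
    # Hypothetical stand-in for the realtime transcription connection.
    transcribe: Callable[[bytes], str]
    state: State = State.IDLE
    interruptions: int = 0
    chunks: List[bytes] = field(default_factory=list, init=False)

    def key_down(self) -> None:
        """Start recording; pressing the key mid-response counts as an interrupt."""
        if self.state is State.RESPONDING:
            self.interruptions += 1
        self.chunks.clear()
        self.state = State.RECORDING

    def audio_chunk(self, chunk: bytes) -> None:
        """Buffer microphone audio only while the key is held."""
        if self.state is State.RECORDING:
            self.chunks.append(chunk)

    def key_up(self) -> Optional[str]:
        """Stop recording and return the transcript, or None if nothing was said."""
        if self.state is not State.RECORDING:
            return None
        audio = b"".join(self.chunks)
        self.chunks.clear()
        if not audio:
            self.state = State.IDLE
            return None
        self.state = State.RESPONDING
        return self.transcribe(audio)

    def response_done(self) -> None:
        """Return to idle once the AI has finished responding."""
        if self.state is State.RESPONDING:
            self.state = State.IDLE
```

In this sketch, a front end would call `key_down()` and `key_up()` from its keyboard handler, feed microphone data through `audio_chunk()`, and pass the returned transcript to the existing prompt pipeline. Keeping the transcriber injectable means the key handling can be exercised without a network connection.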

I'm willing to work on a PR for this.

Labels: enhancement (New feature or request)