Speech & Transcription

21 skills in this category

assemblyai-transcribe

Safe

Transcribe audio/video with AssemblyAI (local upload.

@tristanmanchester

audio-gen

Safe

Generate audiobooks, podcasts, or educational audio content on demand.

@udiedrichsen

audio-reply

Caution

Generate audio replies using TTS. Trigger with "read it to me [URL]" to fetch.

@matrixy

edge-tts

Safe

@i3130002

gettr-transcribe-summarize

Safe

Download audio from a GETTR post (via HTML og:video), transcribe it locally.

@kevin37li

llmwhisperer

Safe

Extract text and layout from images and PDFs using the LLMWhisperer API.

@gumadeiras

local-whisper

Safe

Local speech-to-text using OpenAI Whisper. Runs fully offline after model download.

@araa47

mlx-whisper

Safe

Local speech-to-text with MLX Whisper (Apple Silicon optimized, no API key).

@kevin37li

openai-whisper

Safe

Local speech-to-text with the Whisper CLI (no API key).

@steipete

openai-whisper-api

Safe

Transcribe audio via the OpenAI Audio Transcriptions API (Whisper).

@steipete

parakeet-mlx

Safe

Local speech-to-text with Parakeet MLX (ASR) for Apple Silicon (no API key).

@kylehowells

parakeet-stt

Safe

@carlulsoe

pocket-transcripts

Safe

Read transcripts and summaries from Pocket AI (heypocket.com) recording devices.

@tmustier

pocket-tts

Safe

@sherajdev

tts-whatsapp

Safe

Send high-quality text-to-speech voice messages on WhatsApp in 40+ languages with automatic delivery.

@Community

video-subtitles

Safe

Generate SRT subtitles from video/audio with translation support.

@ngutman

voice-transcribe

Safe

Transcribe audio files using OpenAI's gpt-4o-mini-transcribe model with vocabulary hints.

@darinkishore

elevenlabs-voices

Safe

ElevenLabs voice synthesis: 18 personas, 32 languages, sound effects.

@robbyczgw-cla

elevenlabs-media

Safe

ElevenLabs music generation and speech-to-text (Scribe v2).

@unknown

elevenlabs-agents

Safe

Create and manage ElevenLabs conversational AI agents.

@pennyroyaltea

tts

Safe

Text-to-speech using the Hume AI or OpenAI API.

@amstko