{"id":"d7698433-6408-48df-8232-e5258d36e797","shortId":"eJDrVp","kind":"skill","title":"speak-tts","tagline":"Give your agent the ability to speak to you real-time. Talk to your Claude! Local TTS, text-to-speech, voice synthesis, audio generation with voice cloning on Apple Silicon. Use for reading articles aloud, audiobook narration, or voice responses. Runs entirely on-device via MLX -","description":"# speak - Talk to your Claude!\n\nGive your agent the ability to speak to you in real time. Local text-to-speech, voice cloning, and audio generation on Apple Silicon.\n\n## Prerequisites\n\n| Requirement | Check | Install |\n|-------------|-------|---------|\n| Apple Silicon Mac | `uname -m` → arm64 | Intel not supported |\n| macOS 12.0+ | `sw_vers` | - |\n| sox | `which sox` | `brew install sox` |\n| ffmpeg | `which ffmpeg` | `brew install ffmpeg` |\n| poppler (PDF) | `which pdftotext` | `brew install poppler` |\n\n## Input Sources\n\n| Source | Example |\n|--------|---------|\n| Text file | `speak article.txt` |\n| Markdown | `speak doc.md` |\n| Direct string | `speak \"Hello\"` |\n| Clipboard | `pbpaste \\| speak` |\n| Stdin | `cat file.txt \\| speak` |\n\n### Web Articles\n```bash\nlynx -dump -nolist \"https://example.com/article\" | speak --output article.wav\n```\n\n### Converting Formats\n\n| Format | Convert Command |\n|--------|-----------------|\n| PDF | `pdftotext doc.pdf doc.txt` |\n| DOCX | `textutil -convert txt doc.docx` |\n| HTML | `pandoc -f html -t plain doc.html > doc.txt` |\n\n## Output Modes\n\n| Goal | Command |\n|------|---------|\n| Save for later | `speak text.txt --output file.wav` |\n| Listen now (streaming) | `speak text.txt --stream` |\n| Listen now (complete) | `speak text.txt --play` |\n| Both | `speak text.txt --stream --output file.wav` |\n\n### Default Behavior\n```bash\nspeak article.txt          # → ~/Audio/speak/article.wav (no 
playback)\nspeak \"Hello\"              # → ~/Audio/speak/speak_<timestamp>.wav\n```\n\n## Directory Auto-Creation\n\n| Directory | Auto-Created? |\n|-----------|---------------|\n| `~/Audio/speak/` | ✓ Yes |\n| `~/.chatter/voices/` | ✗ No |\n| Custom directories | ✗ No |\n\n**Always create custom directories first:**\n```bash\nmkdir -p ~/.chatter/voices/\nmkdir -p ~/Audio/custom/\n```\n\n## Voice Cloning\n\nVoice cloning generates speech that matches your vocal characteristics (pitch, tone, cadence) from a short recording.\n\n### Quality Expectations\n- Output captures general voice characteristics but is **not a perfect replica**\n- Quality depends heavily on sample quality\n- 15-25 seconds is optimal (10s minimum, 30s maximum)\n\n### Recording Your Voice\n\n**Using QuickTime:**\n1. Open QuickTime Player → File → New Audio Recording\n2. Record 20 seconds of clear speech\n3. File → Export As → Audio Only (.m4a)\n4. Convert to WAV (see below)\n\n**Using sox (command line):**\n```bash\n# -d = use default microphone\n# Recording starts immediately and stops after 25 seconds\nsox -d -r 24000 -c 1 ~/.chatter/voices/my_voice.wav trim 0 25\n```\n\n### Converting to Required Format\n\nVoice samples **MUST** be: WAV, 24000 Hz, mono, 10-30 seconds.\n\n```bash\n# From MP3\nffmpeg -i voice.mp3 -ar 24000 -ac 1 voice.wav\n\n# From M4A (QuickTime)\nffmpeg -i voice.m4a -ar 24000 -ac 1 voice.wav\n\n# Trim to 25 seconds\nffmpeg -i long.wav -t 25 -ar 24000 -ac 1 trimmed.wav\n\n# Check sample properties\nffprobe -i voice.wav 2>&1 | grep -E \"Duration|Stream\"\n# Should show: Duration ~15-25s, 24000 Hz, mono\n```\n\n### Using Your Voice\n\n```bash\n# Create directory\nmkdir -p ~/.chatter/voices/\n\n# Move sample\nmv voice.wav ~/.chatter/voices/my_voice.wav\n\n# Test\nspeak \"Testing my voice\" --voice ~/.chatter/voices/my_voice.wav --stream\n\n# Use for content\nspeak notes.txt --voice ~/.chatter/voices/my_voice.wav --output presentation.wav\n```\n\n**Path requirements:**\n- ✓ Works: 
`~/.chatter/voices/my_voice.wav` (tilde expanded by shell)\n- ✓ Works: `/Users/name/.chatter/voices/my_voice.wav`\n- ✗ Fails: `my_voice.wav` (relative path)\n- ✗ Fails: `./voices/my_voice.wav` (relative path)\n\n### Voice Sample Tips\n\n| Good Sample | Bad Sample |\n|-------------|------------|\n| Quiet room | Background noise |\n| Natural pace | Rushed or monotone |\n| Clear diction | Mumbling |\n| Varied content | Repetitive phrases |\n\n## Default Voice\n\nWhen `--voice` is omitted, a built-in default voice is used:\n```bash\nspeak \"Hello world\" --stream  # Uses default voice\n```\n\n## Emotion Tags\n\nTags produce **audible effects** (actual sounds), not spoken words:\n\n```bash\nspeak \"[sigh] Monday again.\" --stream\n# Output: (sigh sound) \"Monday again.\"\n```\n\n| Tag | Effect |\n|-----|--------|\n| `[laugh]` | Laughter |\n| `[chuckle]` | Light chuckle |\n| `[sigh]` | Sighing |\n| `[gasp]` | Gasping |\n| `[groan]` | Groaning |\n| `[clear throat]` | Throat clearing |\n| `[cough]` | Coughing |\n| `[crying]` | Crying |\n| `[singing]` | Sung speech |\n\n**NOT supported:** `[pause]`, `[whisper]` (ignored)\n\n**For pauses:** Use punctuation: `\"Wait... let me think.\"`\n\n## Batch Processing\n\n```bash\nmkdir -p ~/Audio/book/\nspeak ch01.txt ch02.txt ch03.txt --output-dir ~/Audio/book/\n# Creates: ch01.wav, ch02.wav, ch03.wav\n\n# With auto-chunking (for long files)\nspeak chapters/*.txt --output-dir ~/Audio/book/ --auto-chunk\n\n# Skip completed files\nspeak chapters/*.txt --output-dir ~/Audio/book/ --skip-existing\n```\n\n### Auto-Chunk Behavior\n\nWhen using `--auto-chunk` with batch processing:\n1. Each input file is chunked **independently**\n2. Chunks are generated and **automatically concatenated** per file\n3. Final output: one `.wav` per input file (e.g., `ch01.wav`)\n4. 
Intermediate chunks deleted (unless `--keep-chunks`)\n\n**You don't need to manually concatenate chunks** — only concatenate final chapter files.\n\n## Concatenating Audio\n\n```bash\n# Explicit order (recommended)\nspeak concat ch01.wav ch02.wav ch03.wav --output book.wav\n\n# Glob pattern (REQUIRES zero-padded filenames)\nspeak concat audiobook/*.wav --output book.wav\n```\n\n### Zero-Padding Rules\n\n**Critical for correct concatenation order:**\n\n| Files | Correct | Wrong |\n|-------|---------|-------|\n| 1-9 | `01`, `02`, ..., `09` | `1`, `2`, ..., `9` |\n| 10-99 | `01`, `02`, ..., `99` | `1`, `10`, `2`, ... |\n| 100+ | `001`, `002`, ..., `999` | `1`, `100`, `2`, ... |\n\n**Why:** Shell glob expansion sorts alphabetically. `1, 10, 2` vs `01, 02, 10`.\n\n## PDF to Audiobook (Complete Workflow)\n\n### Step 1: Find Chapter Boundaries\n```bash\n# Preview table of contents\npdftotext -f 1 -l 5 textbook.pdf toc.txt\ncat toc.txt  # Note chapter page numbers\n\n# Or search for \"Chapter\" markers\npdftotext textbook.pdf - | grep -n \"Chapter\"\n```\n\n### Step 2: Extract Chapters (Zero-Padded!)\n```bash\n# For 100-page book with ~10 chapters\npdftotext -f 1 -l 12 -layout textbook.pdf ch01.txt\npdftotext -f 13 -l 25 -layout textbook.pdf ch02.txt\npdftotext -f 26 -l 38 -layout textbook.pdf ch03.txt\n# ... 
continue for all chapters\n```\n\n### Step 3: Estimate Time\n```bash\nspeak --estimate ch*.txt\n# Shows: total audio duration, generation time, storage needed\n\n# Quick estimates:\n# 1 page ≈ 2 min audio ≈ 1 min generation\n# 100 pages ≈ 200 min audio ≈ 100 min generation ≈ 500 MB\n```\n\n### Step 4: Generate Audio\n```bash\nmkdir -p audiobook/\nspeak ch01.txt ch02.txt ch03.txt --output-dir audiobook/ --auto-chunk\n# Creates: audiobook/ch01.wav, audiobook/ch02.wav, audiobook/ch03.wav\n```\n\n### Step 5: Concatenate\n```bash\nspeak concat audiobook/ch01.wav audiobook/ch02.wav audiobook/ch03.wav --output complete_audiobook.wav\n# Or with glob (only if zero-padded):\nspeak concat audiobook/ch*.wav --output complete_audiobook.wav\n```\n\n### PDF Troubleshooting\n\n| Issue | Solution |\n|-------|----------|\n| Empty/garbled text | Scanned PDF — use OCR: `brew install tesseract` |\n| Wrong encoding | Try: `pdftotext -enc UTF-8 doc.pdf` |\n| Check word count | `pdftotext doc.pdf - \\| wc -w` (should be >100) |\n\n## Multi-Voice Content\n\n```bash\nmkdir -p podcast/scripts podcast/wav\n\necho \"Welcome to the show.\" > podcast/scripts/01_host.txt\necho \"Thanks for having me.\" > podcast/scripts/02_guest.txt\n\nspeak podcast/scripts/01_host.txt --voice ~/.chatter/voices/host.wav --output podcast/wav/01.wav\nspeak podcast/scripts/02_guest.txt --voice ~/.chatter/voices/guest.wav --output podcast/wav/02.wav\n\nspeak concat podcast/wav/01.wav podcast/wav/02.wav --output podcast.wav\n```\n\n## Options Reference\n\n| Option | Description | Default |\n|--------|-------------|---------|\n| `--stream` | Stream as it generates | false |\n| `--play` | Play after complete | false |\n| `--output <path>` | Output file | ~/Audio/speak/ |\n| `--output-dir <dir>` | Batch output directory | - |\n| `--voice <path>` | Voice sample (full path) | default |\n| `--timeout <sec>` | Timeout per file | 300 |\n| `--auto-chunk` | Split long documents | false |\n| `--chunk-size <n>` | Chars per chunk 
| 6000 |\n| `--resume <file>` | Resume from manifest | - |\n| `--keep-chunks` | Keep intermediate files | false |\n| `--skip-existing` | Skip if output exists | false |\n| `--estimate` | Show duration estimate | false |\n| `--dry-run` | Preview only | false |\n| `--quiet` | Suppress output | false |\n\n## Commands\n\n| Command | Description |\n|---------|-------------|\n| `speak setup` | Set up environment |\n| `speak health` | Check system status |\n| `speak models` | List TTS models |\n| `speak concat` | Concatenate audio |\n| `speak daemon kill` | Stop TTS server |\n| `speak config` | Show configuration |\n\n## Performance\n\n| Metric | Value |\n|--------|-------|\n| Cold start | ~4-8s |\n| Warm start | ~3-8s |\n| Speed | 0.3-0.5x RTF (faster than real-time) |\n| Storage | ~2.5 MB/min, ~150 MB/hour |\n\n## Resume Capability\n\nFor interrupted long generations:\n\n```bash\n# Single file with auto-chunk — use --resume\nspeak long.txt --auto-chunk --output book.wav\n# If interrupted, manifest saved at ~/Audio/speak/manifest.json\nspeak --resume ~/Audio/speak/manifest.json\n\n# Batch processing — use --skip-existing\nspeak ch*.txt --output-dir audiobook/ --auto-chunk\n# If interrupted, re-run same command:\nspeak ch*.txt --output-dir audiobook/ --auto-chunk --skip-existing\n```\n\n## Common Errors\n\n| Error | Cause | Solution |\n|-------|-------|----------|\n| \"Voice file not found\" | Relative path | Use full path: `~/.chatter/voices/x.wav` |\n| \"Invalid WAV format\" | Wrong specs | Convert: `ffmpeg -i in.wav -ar 24000 -ac 1 out.wav` |\n| \"Voice sample too short\" | <10 seconds | Record 15-25 seconds |\n| \"Output directory doesn't exist\" | Not created | `mkdir -p dirname/` |\n| \"sox not found\" | Not installed | `brew install sox` |\n| Scrambled concat order | Non-zero-padded | Use `01`, `02`, not `1`, `2` |\n| Timeout | >5 min generation | Use `--auto-chunk` or `--timeout 600` |\n| \"Server not running\" | Stale daemon | `speak daemon kill && speak health` 
|\n\n## Setup\n\n```bash\nspeak \"test\"     # Auto-setup on first run (downloads model ~500MB)\nspeak setup      # Or manual setup\nspeak health     # Verify everything works\n```\n\n## Server Management\n\nServer auto-starts and shuts down after 1 hour idle.\n\n```bash\nspeak health        # Check status\nspeak daemon kill   # Stop manually\n```","tags":["speak","emzod","agent-skills","apple-silicon","chatterbox","cli","skills","text-to-speech","tts","voice-cloning"],"capabilities":["skill","source-emzod","skill-speak","topic-agent-skills","topic-apple-silicon","topic-chatterbox","topic-cli","topic-skills","topic-text-to-speech","topic-tts","topic-voice-cloning"],"categories":["speak"],"synonyms":[],"warnings":[],"endpointUrl":"https://skills.sh/EmZod/speak","protocol":"skill","transport":"skills-sh","auth":{"type":"none","details":{"cli":"npx skills add EmZod/speak","source_repo":"https://github.com/EmZod/speak","install_from":"skills.sh"}},"qualityScore":"0.453","qualityRationale":"deterministic score 0.45 from registry signals: · indexed on github topic:agent-skills · 6 github stars · SKILL.md body (10,664 chars)","verified":false,"liveness":"unknown","lastLivenessCheck":null,"agentReviews":{"count":0,"score_avg":null,"cost_usd_avg":null,"success_rate":null,"latency_p50_ms":null,"narrative_summary":null,"summary_updated_at":null},"enrichmentModel":"deterministic:skill-github:v1","enrichmentVersion":1,"enrichedAt":"2026-05-18T13:22:58.794Z","embedding":null,"createdAt":"2026-05-18T13:22:58.794Z","updatedAt":"2026-05-18T13:22:58.794Z","lastSeenAt":"2026-05-18T13:22:58.794Z","tsv":"'-0.5':1143 '-25':301,436,1260 '-30':382 '-8':960,1134,1139 '-9':740 '-99':748 '/.chatter/voices':246,259,449 '/.chatter/voices/guest.wav':1002 '/.chatter/voices/host.wav':996 '/.chatter/voices/my_voice.wav':365,454,461,469,475 '/.chatter/voices/x.wav':1237 '/article':169 '/audio/book':599,607,625,638 '/audio/custom':262 '/audio/speak':244,1030 '/audio/speak/article.wav':229 
'/audio/speak/manifest.json':1183,1186 '/audio/speak/speak_':234 '/users/name/.chatter/voices/my_voice.wav':481 '/voices/my_voice.wav':487 '0':367 '0.3':1142 '001':756 '002':757 '01':741,749,772,1288 '02':742,750,773,1289 '09':743 '1':314,364,393,404,418,427,654,739,744,752,759,768,781,792,830,875,880,1250,1291,1347 '10':381,747,753,769,774,826,1256 '100':755,760,822,883,888,971 '10s':305 '12':832 '12.0':117 '13':838 '15':300,435,1259 '150':1154 '2':322,426,661,745,754,761,770,814,877,1292 '2.5':1152 '20':324 '200':885 '24000':362,378,391,402,416,438,1248 '25':357,368,408,414,840 '26':846 '3':329,670,857,1138 '300':1047 '30s':307 '38':848 '4':336,680,894,1133 '5':794,917,1294 '500':891 '500mb':1326 '600':1303 '6000':1061 '9':746 '99':751 '999':758 'abil':8,62,87 'ac':392,403,417,1249 'actual':541 'agent':6,60,85 'aloud':40 'alphabet':767 'alway':251 'appl':34,81,101,107 'ar':390,401,415,1247 'arm64':112 'articl':39,162 'article.txt':146,228 'article.wav':172 'audibl':539 'audio':28,78,320,333,702,867,879,887,896,1117 'audiobook':41,723,777,900,908,1199,1216 'audiobook/ch':937 'audiobook/ch01.wav':913,922 'audiobook/ch02.wav':914,923 'audiobook/ch03.wav':915,924 'auto':238,242,614,627,643,649,910,1049,1167,1174,1201,1218,1299,1319,1341 'auto-chunk':613,626,642,648,909,1048,1166,1173,1200,1217,1298 'auto-cr':237,241 'auto-setup':1318 'auto-start':1340 'automat':666 'background':499 'bad':495 'bash':163,226,256,346,384,444,527,546,596,703,785,820,860,897,919,976,1162,1315,1350 'batch':594,652,1034,1187 'behavior':225,645 'book':824 'book.wav':713,726,1177 'boundari':784 'brew':123,129,136,951,1277 'built':521 'built-in':520 'c':363 'cadenc':276 'capabl':1157 'captur':284 'cat':158,797 'caus':1226 'ch':863,1194,1211 'ch01.txt':601,835,902 'ch01.wav':609,679,709 'ch02.txt':602,843,903 'ch02.wav':610,710 'ch03.txt':603,851,904 'ch03.wav':611,711 'chapter':620,633,699,783,800,806,812,816,827,855 'char':1058 'characterist':273,287 'check':105,420,962,1106,1353 
'chuckl':561,563 'chunk':615,628,644,650,659,662,682,687,695,911,1050,1056,1060,1068,1168,1175,1202,1219,1300 'chunk-siz':1055 'claud':19,57 'clear':327,506,570,573 'clipboard':154 'clone':32,76,99,264,266 'cold':1131 'command':177,198,344,1096,1097,1209 'common':1223 'complet':214,630,778,1025 'complete_audiobook.wav':926,940 'concat':708,722,921,936,1006,1115,1281 'concaten':667,694,697,701,734,918,1116 'config':1125 'configur':1127 'content':465,510,789,975 'continu':852 'convert':173,176,184,337,369,1243 'correct':733,737 'cough':574,575 'count':964 'creat':243,252,445,608,912,1268 'creation':239 'cri':576,577 'critic':731 'custom':248,253 'd':347,360 'daemon':1119,1308,1310,1356 'default':224,349,513,523,533,1015,1042 'delet':683 'depend':295 'descript':1014,1098 'devic':50 'diction':507 'dir':606,624,637,907,1033,1198,1215 'direct':150 'directori':236,240,249,254,446,1036,1263 'dirnam':1271 'doc.docx':186 'doc.html':193 'doc.md':149 'doc.pdf':180,961,966 'doc.txt':181,194 'document':1053 'docx':182 'doesn':1264 'download':1324 'dri':1087 'dry-run':1086 'dump':165 'durat':430,434,868,1083 'e':429 'e.g':678 'echo':981,987 'effect':540,558 'emot':535 'empty/garbled':945 'enc':958 'encod':955 'entir':47 'environ':1103 'error':1224,1225 'estim':858,862,874,1081,1084 'everyth':1335 'exampl':142 'example.com':168 'example.com/article':167 'exist':641,1075,1079,1192,1222,1266 'expand':477 'expans':765 'expect':282 'explicit':704 'export':331 'extract':815 'f':189,791,829,837,845 'fail':482,486 'fals':1021,1026,1054,1072,1080,1085,1091,1095 'faster':1146 'ffmpeg':126,128,131,387,398,410,1244 'ffprobe':423 'file':144,318,330,618,631,657,669,677,700,736,1029,1046,1071,1164,1229 'file.txt':159 'file.wav':205,223 'filenam':720 'final':671,698 'find':782 'first':255,1322 'format':174,175,372,1240 'found':1231,1274 'full':1040,1235 'gasp':566,567 'general':285 'generat':29,79,267,664,869,882,890,895,1020,1161,1296 'give':4,58,83 'glob':714,764,929 'goal':197 'good':493 
'grep':428,810 'groan':568,569 'health':1105,1313,1333,1352 'heavili':296 'hello':153,233,529 'hour':1348 'html':187,190 'hz':379,439 'idl':1349 'ignor':585 'immedi':353 'in.wav':1246 'independ':660 'input':139,656,676 'instal':106,124,130,137,952,1276,1278 'intel':113 'intermedi':681,1070 'interrupt':1159,1179,1204 'invalid':1238 'issu':943 'keep':686,1067,1069 'keep-chunk':685,1066 'kill':1120,1311,1357 'l':793,831,839,847 'later':201 'laugh':559 'laughter':560 'layout':833,841,849 'let':591 'light':562 'line':345 'list':1111 'listen':206,212 'local':20,70,95 'long':617,1052,1160 'long.txt':1172 'long.wav':412 'lynx':164 'm':111 'm4a':335,396 'mac':109 'maco':116 'manag':1338 'manifest':1065,1180 'manual':693,1330,1359 'markdown':147 'marker':807 'match':270 'maximum':308 'mb':892 'mb/hour':1155 'mb/min':1153 'metric':1129 'microphon':350 'min':878,881,886,889,1295 'minimum':306 'mkdir':257,260,447,597,898,977,1269 'mlx':52 'mode':196 'model':1110,1113,1325 'monday':549,555 'mono':380,440 'monoton':505 'move':450 'mp3':386 'multi':973 'multi-voic':972 'mumbl':508 'must':375 'mv':452 'my_voice.wav':483 'n':811 'narrat':42 'natur':501 'need':691,872 'new':319 'nois':500 'nolist':166 'non':1284 'non-zero-pad':1283 'note':799 'notes.txt':467 'number':802 'ocr':950 'omit':518 'on-devic':48 'one':673 'open':315 'optim':304 'option':1011,1013 'order':705,735,1282 'out.wav':1251 'output':171,195,204,222,283,470,552,605,623,636,672,712,725,906,925,939,997,1003,1009,1027,1028,1032,1035,1078,1094,1176,1197,1214,1262 'output-dir':604,622,635,905,1031,1196,1213 'p':258,261,448,598,899,978,1270 'pace':502 'pad':719,729,819,934,1286 'page':801,823,876,884 'pandoc':188 'path':472,485,489,1041,1233,1236 'pattern':715 'paus':583,587 'pbpast':155 'pdf':133,178,775,941,948 'pdftotext':135,179,790,808,828,836,844,957,965 'per':668,675,1045,1059 'perfect':292 'perform':1128 'phrase':512 'pitch':274 'plain':192 'play':217,1022,1023 'playback':231 'player':317 'podcast.wav':1010 
'podcast/scripts':979 'podcast/scripts/01_host.txt':986,994 'podcast/scripts/02_guest.txt':992,1000 'podcast/wav':980 'podcast/wav/01.wav':998,1007 'podcast/wav/02.wav':1004,1008 'poppler':132,138 'prerequisit':103 'presentation.wav':471 'preview':786,1089 'process':595,653,1188 'produc':538 'properti':422 'punctuat':589 'qualiti':281,294,299 'quick':873 'quicktim':313,316,397 'quiet':497,1092 'r':361 're':1206 're-run':1205 'read':38 'real':14,68,93,1149 'real-tim':13,67,92,1148 'recommend':706 'record':280,309,321,323,351,1258 'refer':1012 'relat':484,488,1232 'repetit':511 'replica':293 'requir':104,371,473,716 'respons':45 'resum':1062,1063,1156,1170,1185 'room':498 'rtf':1145 'rule':730 'run':46,1088,1207,1306,1323 'rush':503 'sampl':298,374,421,451,491,494,496,1039,1253 'save':199,1181 'scan':947 'scrambl':1280 'search':804 'second':302,325,358,383,409,1257,1261 'see':340 'server':1123,1304,1337,1339 'set':1101 'setup':1100,1314,1320,1328,1331 'shell':479,763 'short':279,1255 'show':433,865,985,1082,1126 'shut':1344 'sigh':548,553,564,565 'silicon':35,82,102,108 'sing':578 'singl':1163 'size':1057 'skill' 'skill-speak' 'skip':629,640,1074,1076,1191,1221 'skip-exist':639,1073,1190,1220 'solut':944,1227 'sort':766 'sound':542,554 'sourc':140,141 'source-emzod' 'sox':120,122,125,343,359,1272,1279 'speak':2,10,53,64,89,145,148,152,156,160,170,202,209,215,219,227,232,456,466,528,547,600,619,632,707,721,861,901,920,935,993,999,1005,1099,1104,1109,1114,1118,1124,1171,1184,1193,1210,1309,1312,1316,1327,1332,1351,1355 'speak-tt':1 'spec':1242 'speech':25,74,268,328,580 'speed':1141 'split':1051 'spoken':544 'stale':1307 'start':352,1132,1137,1342 'status':1108,1354 'stdin':157 'step':780,813,856,893,916 'stop':355,1121,1358 'storag':871,1151 'stream':208,211,221,431,462,531,551,1016,1017 'string':151 'sung':579 'support':115,582 'suppress':1093 'sw':118 'synthesi':27 'system':1107 'tabl':787 'tag':536,537,557 'talk':16,54 'tesseract':953 'test':455,457,1317 
'text':23,72,143,946 'text-to-speech':22,71 'text.txt':203,210,216,220 'textbook.pdf':795,809,834,842,850 'textutil':183 'thank':988 'think':593 'throat':571,572 'tild':476 'time':15,69,94,859,870,1150 'timeout':1043,1044,1293,1302 'tip':492 'toc.txt':796,798 'tone':275 'topic-agent-skills' 'topic-apple-silicon' 'topic-chatterbox' 'topic-cli' 'topic-skills' 'topic-text-to-speech' 'topic-tts' 'topic-voice-cloning' 'total':866 'tri':956 'trim':366,406 'trimmed.wav':419 'troubleshoot':942 'tts':3,21,96,1112,1122 'txt':185,621,634,864,1195,1212 'unam':110 'unless':684 'use':36,312,342,348,441,463,526,532,588,647,949,1169,1189,1234,1287,1297 'utf':959 'valu':1130 'vari':509 'ver':119 'verifi':1334 'via':51 'vocal':272 'voic':26,31,44,75,98,263,265,286,311,373,443,459,460,468,490,514,516,524,534,974,995,1001,1037,1038,1228,1252 'voice.m4a':400 'voice.mp3':389 'voice.wav':394,405,425,453 'vs':771 'w':968 'wait':590 'warm':1136 'wav':235,339,377,674,724,938,1239 'wc':967 'web':161 'welcom':982 'whisper':584 'word':545,963 'work':474,480,1336 'workflow':779 'world':530 'wrong':738,954,1241 'x':1144 'yes':245 'zero':718,728,818,933,1285 
'zero-pad':717,727,817,932","prices":[{"id":"2b71e916-59a1-4b16-84f4-46bf3aa5c7d3","listingId":"d7698433-6408-48df-8232-e5258d36e797","amountUsd":"0","unit":"free","nativeCurrency":null,"nativeAmount":null,"chain":null,"payTo":null,"paymentMethod":"skill-free","isPrimary":true,"details":{"org":"EmZod","category":"speak","install_from":"skills.sh"},"createdAt":"2026-05-18T13:22:58.794Z"}],"sources":[{"listingId":"d7698433-6408-48df-8232-e5258d36e797","source":"github","sourceId":"EmZod/speak","sourceUrl":"https://github.com/EmZod/speak","isPrimary":false,"firstSeenAt":"2026-05-18T13:22:58.794Z","lastSeenAt":"2026-05-18T13:22:58.794Z"}],"details":{"listingId":"d7698433-6408-48df-8232-e5258d36e797","quickStartSnippet":null,"exampleRequest":null,"exampleResponse":null,"schema":null,"openapiUrl":null,"agentsTxtUrl":null,"citations":[],"useCases":[],"bestFor":[],"notFor":[],"kindDetails":{"org":"EmZod","slug":"speak","github":{"repo":"EmZod/speak","stars":6,"topics":["agent-skills","ai","apple-silicon","chatterbox","cli","skills","text-to-speech","tts","voice-cloning"],"license":null,"html_url":"https://github.com/EmZod/speak","pushed_at":"2026-01-28T09:11:59Z","description":"A fast CLI tool for Agents to convert their text output to speech using Chatterbox TTS on Apple Silicon. Agent SKILL files included.","skill_md_sha":"50bdfbbbd906a6c38abca088c1b1d3d90d5037fc","skill_md_path":"SKILL.md","default_branch":"main","skill_tree_url":"https://github.com/EmZod/speak"},"layout":"root","source":"github","category":"speak","frontmatter":{"name":"speak-tts","description":"Give your agent the ability to speak to you real-time. Talk to your Claude! Local TTS, text-to-speech, voice synthesis, audio generation with voice cloning on Apple Silicon. Use for reading articles aloud, audiobook narration, or voice responses. Runs entirely on-device via MLX - private, no API keys."},"skills_sh_url":"https://skills.sh/EmZod/speak"},"updatedAt":"2026-05-18T13:22:58.794Z"}}