25-voice-clone-podcast

Audio AI for personal brands: voice cloning (ElevenLabs/HeyGen Voice/Vbee), podcast workflow, audiobooks, and video voiceover. Three use cases: short TikTok/Reels voiceover (energetic), 30-60 minute podcast (conversational), audiobook (mid-tempo). 1:10 repurposing (1 podcast → 10 short clips). Tri

Price: free
Protocol: skill
Verified: no

What it does

Voice Clone & Podcast: Audio AI for Personal Brands

This skill focuses on audio AI: voice cloning, podcasts, audiobooks, and voiceover. It complements 24-ai-avatar-production (video); combine the two to cover the whole content stack.


1. For beginners (Newbie Guide)

What is audio AI, and how does it differ from video AI?

Audio AI is technology that generates synthetic speech close to a real human voice: from a sample of your voice, the AI learns and produces a cloned voice (voice clone). You write the text and the AI reads it for you (text-to-speech).

How it differs from video AI:

  • Video AI (skill 24): generates video with picture and voice, for talking heads and social video
  • Audio AI (this skill): generates voice only, for podcasts, audiobooks, voiceover, and narration

When should you use audio AI instead of shooting video?

| Situation | Choose audio AI | Choose video AI |
|---|---|---|
| Long content (>10 minutes) | YES: podcast format | NO: too long for video |
| You do not want to appear on camera | YES | NO |
| Need to produce content volume fast | YES: 1 podcast = 10 shorts | YES, but more expensive |
| Audience listens while driving or at the gym | YES | NO |
| Need visuals for a demo | NO | YES |
| Personal-brand thought leader | YES: podcast = authority | YES, if you already have a face brand |

Main tools

  • ElevenLabs: best in class for voice cloning; good Vietnamese voice, excellent English voice
  • Vbee: best for Vietnamese; natural intonation, multiple regional accents
  • HeyGen Voice: pairs with HeyGen avatars for a continuous voice + video workflow
  • Descript: AI editing; cut audio by editing text, plus voice cloning (Overdub)
  • Riverside: podcast recording; studio quality, AI Magic Clips for repurposing

How long does it take, and what does it cost?

| Task | Time | Cost (USD/month) |
|---|---|---|
| Voice clone setup | 30-60 minutes | $5-22 (ElevenLabs Starter/Pro) |
| 60s voiceover (TikTok) | 5-10 minutes | $5-22 |
| 30-minute podcast (solo) | 1-2 hours | $22-99 (ElevenLabs + Riverside) |
| Audiobook, 1 chapter (15 minutes) | 30-45 minutes | $22-99 |
| Repurpose 1 podcast → 10 clips | 1-2 hours | $0-30 (Descript/Opus) |

5 common mistakes

  1. The AI voice sounds robotic: the sample is too short or too monotone. Fix: re-record 3-5 minutes, reading with varied emotions (happy, sad, serious).
  2. Vietnamese words are mispronounced: ElevenLabs is still weak on some Sino-Vietnamese words. Fix: use Vbee for Vietnamese content, or respell the word phonetically.
  3. Audio clipping (distortion): levels are too high. Fix: target a -3 dB peak and -16 LUFS loudness.
  4. Noise/echo: the room is not acoustically treated. Fix: record in a small room with curtains and hung blankets, or use NVIDIA Broadcast / Krisp for noise removal.
  5. The podcast sounds boring: no editing, too many "um, uh" fillers. Fix: let Descript auto-remove filler words and add light background music (-20 dB).
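
The clipping fix in item 3 can be checked numerically. The sketch below measures peak level in dBFS, assuming the audio has already been decoded into float samples in the range [-1.0, 1.0]; the function names are illustrative. It does not measure LUFS, which needs a proper loudness meter.

```python
import math

def peak_dbfs(samples):
    """Return the peak level in dBFS for float samples in [-1.0, 1.0]."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return float("-inf")  # digital silence has no finite level
    return 20.0 * math.log10(peak)

def has_headroom(samples, ceiling_db=-3.0):
    """True when the peak stays at or below the target ceiling."""
    return peak_dbfs(samples) <= ceiling_db

quiet = [0.0, 0.5, -0.7, 0.3]  # peak 0.7 is roughly -3.1 dBFS
hot = [0.0, 0.99, -1.0, 0.4]   # peak 1.0 is 0 dBFS, the clipping point
print(has_headroom(quiet), has_headroom(hot))
```

Pass `ceiling_db=-1.0` to check against the -1 dB peak used in the use-case specs.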

2. Gather information

Ask at most 4 questions before starting:

  1. Main use case? Short voiceover (TikTok/Reels) / 30-60 minute podcast / Audiobook?
  2. Language? Vietnamese / English / Bilingual (VN-EN)?
  3. Total duration? <60s / 5-30 minutes / 30-60 minutes / >60 minutes (audiobook)?
  4. Budget tier? Free ($0) / Starter ($5-22) / Pro ($22-99) / Business ($99+)?

Based on the 4 answers, choose the matching use case and tool stack.


3. Voice clone setup

Sample requirements

| Criterion | Minimum requirement | Optimal |
|---|---|---|
| Duration | 1 minute (Free tier) | 3-5 minutes (Pro tier) |
| Room | Quiet, no echo | Hung blankets, curtains, books to absorb sound |
| Mic | iPhone + earphones with mic | Condenser mic (AT2020, $80-100) |
| Distance | 20-30 cm | 15-20 cm with pop filter |
| Format | MP3 128 kbps | WAV 44.1 kHz |
| Content | 1 prepared passage | 3 passages: business / casual / emotional |

Full reference: references/voice-clone-prompts-vn.md, with 3 sample scripts by regional accent (Northern/Central/Southern) and 3 topics (business/lifestyle/educational).

Tool comparison

| Tool | VN voice clone | Price/month | Setup time | Best for |
|---|---|---|---|---|
| ElevenLabs Pro | Good (8/10) | $22 | 30 minutes | Multi-language, content creators |
| HeyGen Voice | Average (6/10) | Bundled with avatar | 15 minutes | Combo with video AI |
| Vbee Pro | Excellent (9.5/10) | 199K-499K VND | 45 minutes | VN-only, broadcast TTS |
| Descript Overdub | Average (6/10) | $24 (Hobbyist) | 30 minutes | Podcast editing |
| Resemble.ai | Average (7/10) | $30 | 1 hour | API integration, custom |

Recommendations:

  • VN-only content: Vbee Pro (best Vietnamese pronunciation)
  • Multi-language (VN + EN): ElevenLabs Pro
  • Combo with video: HeyGen (one platform for voice + avatar)
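
The three recommendations above can be written as a small lookup. This is a minimal sketch: the function name, the language codes, and the precedence (video need first, then language) are assumptions, and the English-only fallback to ElevenLabs Pro is inferred from the ratings rather than stated in the recommendations.

```python
def recommend_voice_tool(languages, needs_video=False):
    """Pick a voice tool from the three recommendations.

    languages: iterable of codes such as "VN" or "EN" (hypothetical convention).
    needs_video: True when voice must pair with an avatar video.
    """
    langs = {lang.strip().upper() for lang in languages}
    if needs_video:
        return "HeyGen Voice"   # one platform for voice + avatar
    if langs == {"VN"}:
        return "Vbee Pro"       # best Vietnamese pronunciation
    return "ElevenLabs Pro"     # multi-language; assumed fallback for EN-only

print(recommend_voice_tool(["VN", "EN"]))
```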

Consent form template

VOICE CLONE USAGE AGREEMENT

I, [Full name], ID card (CMND/CCCD) no.: [number], consent to [Brand/Company]:
1. Using samples of my voice to create an AI voice clone
2. Using the voice clone within [scope: internal / advertising / podcast / etc.]
3. Term: [from DD/MM/YYYY to DD/MM/YYYY]
4. Right of withdrawal: I may request deletion of the voice clone at any time
   in writing; the brand has 7 days to delete it completely.
5. Disclosure: The brand commits to disclosing "AI voice" in line with Vietnamese rules.

Signature: ____________  Date: ____________

4. Three use cases

Use case A: Short TikTok/Reels voiceover (Energetic)

Spec:

  • Duration: 15-60s
  • Pace: Fast (180-220 words/minute), suited to young Vietnamese audiences
  • Tone: Energetic, high-pitch, exciting
  • Audio levels: -14 LUFS (TikTok loudness), peak -1 dB
  • CTA: Clear, in the final 5s

Script template (30s):

[HOOK 0-3s] "Did you know [shocking stat]?"
[PROBLEM 3-10s] "Most people are still [stuck in the wrong loop]"
[SOLUTION 10-22s] "I tried [method], and here are 3 things..."
[PAYOFF 22-27s] "The result: [specific number]"
[CTA 27-30s] "Comment 'YES' and I'll send you the details"
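
To keep a draft inside the 30s template, it helps to turn each time slot into a word budget using words = seconds × wpm / 60 at the 180-220 words/minute pace from the spec. The sketch below does that arithmetic; the helper names are illustrative.

```python
# (segment name, start second, end second) from the 30s script template
SEGMENTS = [
    ("HOOK", 0, 3),
    ("PROBLEM", 3, 10),
    ("SOLUTION", 10, 22),
    ("PAYOFF", 22, 27),
    ("CTA", 27, 30),
]

def word_budget(start_s, end_s, wpm):
    """Words that fit in a slot at a given speaking pace."""
    return round((end_s - start_s) * wpm / 60)

def budgets(segments, wpm_low=180, wpm_high=220):
    """Map each segment to its (min, max) word count."""
    return {
        name: (word_budget(s, e, wpm_low), word_budget(s, e, wpm_high))
        for name, s, e in segments
    }

for name, (low, high) in budgets(SEGMENTS).items():
    print(f"{name}: {low}-{high} words")
```

Across the whole 30 seconds this comes to about 90-110 words, so a hook of 9-11 words is the practical ceiling.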

Voice settings (ElevenLabs):

  • Stability: 35-45 (low, to allow variation)
  • Similarity: 75-85
  • Style: 50-65 (boosts expressiveness)
  • Speaker Boost: ON

Use case B: 30-60 minute podcast (Conversational)

Structure:

  • Intro (1-2 minutes): Hook + introduce the topic + welcome listeners
  • Body (25-50 minutes): 3-5 main segments, 5-10 minutes each
  • Ad slot (optional): 3-5 minutes after the intro, or mid-body
  • Outro (1-2 minutes): Recap key points + CTA + thanks

Pacing:

  • Conversational pace: 140-160 words/minute
  • Pause 1-2s after important sentences (gives the audience time to digest)
  • Segment transitions: 2-3s pause + audio sting

Sound design:

  • Background music: -25 to -30 dB (very soft, never masking the voice)
  • Stings/transitions: -15 dB, 1-2s long
  • Voice levels: -16 LUFS (podcast standard), peak -1 dB
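
The music and sting levels above are decibel values, while most audio code works with linear gain. A minimal sketch of the conversion, gain = 10^(dB/20), and a naive sample-wise mix follows. It assumes the dB figures are attenuation applied to a full-scale music track and that both tracks are float sample lists at the same sample rate; the function names are illustrative.

```python
def db_to_gain(db):
    """Convert a level change in dB to a linear amplitude multiplier."""
    return 10 ** (db / 20)

def mix(voice, music, music_db=-25.0):
    """Add attenuated music under the voice, padding the shorter track with silence."""
    gain = db_to_gain(music_db)
    length = max(len(voice), len(music))
    out = []
    for i in range(length):
        v = voice[i] if i < len(voice) else 0.0
        m = music[i] if i < len(music) else 0.0
        out.append(v + m * gain)
    return out

for level in (-15, -25, -30):
    print(level, "dB ->", round(db_to_gain(level), 4))
```

At -25 dB the music is scaled to roughly 5.6% of its original amplitude, which is why it sits well under the voice.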

Voice settings (ElevenLabs):

  • Stability: 60-75 (high, for consistency across 30+ minutes)
  • Similarity: 85-95
  • Style: 30-40 (natural, not overly expressive)
  • Speaker Boost: ON
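
The slider ranges for use cases A and B can be kept as presets and converted to fractional values when scripting. The sketch below takes the midpoint of each range and scales it to 0-1. The output keys (`stability`, `similarity_boost`, `style`, `use_speaker_boost`) follow the `voice_settings` object of the ElevenLabs text-to-speech API as I understand it; treat them as an assumption and check the current API reference before sending a request. No network call is made here.

```python
# Slider ranges (percent) copied from use case A (voiceover) and B (podcast)
PRESETS = {
    "voiceover": {"stability": (35, 45), "similarity": (75, 85), "style": (50, 65)},
    "podcast": {"stability": (60, 75), "similarity": (85, 95), "style": (30, 40)},
}

def voice_settings(use_case):
    """Build a settings dict using the midpoint of each range, scaled to 0-1."""
    preset = PRESETS[use_case]
    mid = {key: (low + high) / 2 / 100 for key, (low, high) in preset.items()}
    return {
        "stability": mid["stability"],
        "similarity_boost": mid["similarity"],  # assumed API name for Similarity
        "style": mid["style"],
        "use_speaker_boost": True,              # Speaker Boost: ON in both presets
    }

print(voice_settings("voiceover"))
print(voice_settings("podcast"))
```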

Use case C: Audiobook (Mid-tempo)

Structure:

  • Chapter intro: "Chapter [X]: [Title]", followed by a 2s pause
  • Chapter body: 10-20 minutes per chapter, with a 1s pause between paragraphs
  • Chapter end: 3s pause before moving to the next chapter

Pacing:

  • Mid-tempo: 150-170 words/minute
  • Breath control: a natural pause every 2-3 sentences
  • Dialogue passages: vary the voice tone per character (for fiction)

Consistency check (most important):

  • Test-render chapter 1 and chapter 5, then compare the voices: they should match 95%+
  • If they differ: re-clone with a longer sample (5+ minutes)
  • Pronunciation guide: build a database of proper names and special pronunciations
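
One lightweight way to apply a pronunciation guide is to respell terms in the text before it goes to the TTS tool. The sketch below is a plain dictionary substitution; the two entries are hypothetical examples, and the matching is exact and case-sensitive, so a real guide would need entries for every spelling variant.

```python
import re

# Hypothetical entries: written form -> phonetic respelling for the TTS engine
PRONUNCIATIONS = {
    "ElevenLabs": "Eleven Labs",
    "AT2020": "A T twenty twenty",
}

def apply_pronunciations(text, guide):
    """Replace each guide key found in text with its respelling."""
    if not guide:
        return text
    # Longest keys first so a short key never pre-empts a longer one
    keys = sorted(guide, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: guide[m.group(0)], text)

print(apply_pronunciations("Recorded on an AT2020 for ElevenLabs.", PRONUNCIATIONS))
```

Running every chapter through the same guide keeps names consistent from chapter 1 to chapter 5.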

Voice settings (Vbee, for Vietnamese):

  • Voice: "Nu mien Bac chuyen nghiep" (professional Northern female) or "Nam mien Bac am tinh" (a Northern male voice)
  • Speed: 0.95x (slightly slower)
  • Pitch: Default
  • Pause length: 1.2x (slightly longer)

5. Tool comparison for Vietnam

| Tool | Price/month | VN voice (native) | EN voice | Setup | Pros | Cons | Best for |
|---|---|---|---|---|---|---|---|
| ElevenLabs | $5-99 | 8/10 | 10/10 | 30 minutes | Multi-language, good voice clone | Mispronounces some difficult VN words | Multi-language creators |
| Vbee | 199K-1.5M VND | 9.5/10 | 6/10 | 45 minutes | Best Vietnamese, multiple regional accents | Weak in English | VN-only audio |
| HeyGen Voice | Bundled with avatar | 6/10 | 8/10 | 15 minutes | Combo with avatar | Monotone voice clone | Combo with video |
| Descript | $24-30 | 6/10 | 9/10 | 30 minutes | Strong audio editing | Weak VN voice | Podcast editing |
| Riverside | $19-29 | n/a (recording) | n/a | 5 minutes | Studio-quality recording | Not a TTS tool | Live podcast |
| Murf | $29-79 | 7/10 | 9/10 | 30 minutes | 120+ voice library | Limited voice cloning | Corporate voiceover |
| PlayHT | $39-99 | 7/10 | 9.5/10 | 30 minutes | Good API, instant clone | Hard-to-use UI | Developer/API |
| Resemble.ai | $30-99 | 7/10 | 9/10 | 1 hour | Custom emotion control | Steep learning curve | Brand custom voice |

Recommended combos for 2025-2026:

  • VN solo creator: Vbee Pro (199K) + Riverside Free + Descript Hobbyist ($24)
  • Multi-language creator: ElevenLabs Pro ($22) + Riverside Standard ($19) + Descript Pro ($30)
  • Brand/Agency: ElevenLabs Creator ($99) + Vbee Business + Riverside Pro ($29)

6. One-on-one podcast workflow with an AI co-host

Use case: a solo podcaster wants a conversation but cannot find a real co-host. An AI co-host is a second AI voice that plays the co-host role: it asks the questions and you answer.

Prompt engineering setup for the AI personality

Step 1: Define the AI co-host's personality

Name: [AI co-host name]
Personality: Curious, asks deep questions, occasionally lightly humorous
Role: Asks the host questions, does not talk too much itself
Speaking style: Casual, natural, uses the friendly pronouns "minh/ban" (not "toi/anh")
Knowledge level: Intermediate; asks questions the way a listener would
Common interjections: "Wow, that's great!", "So what does that mean?", "Can you be more specific?"

Step 2: Create a separate voice for the AI co-host

  • Use a voice different from the host's (e.g. female vs male, or a different regional accent)
  • Clone the voice of a consenting relative or friend, or use a stock AI voice available in ElevenLabs

Step 3: Tool stack

  • ElevenLabs: generate the AI co-host's voice (voice clone)
  • Riverside: record the host speaking (live)
  • Descript: edit and splice in the AI co-host (text-to-audio)

Script template (Q&A format)

[INTRO]
Host: Hello everyone, today [AI co-host] and I will talk about...
AI co-host: Hi everyone, I'm [name]. Today I want to dig deep into [topic]
            from [host]'s perspective. Let's get started!

[BODY: 5-7 Q&A pairs]
AI co-host: [Question 1: broad question]
Host: [2-3 minute answer]
AI co-host: [Deeper follow-up question]
Host: [Answer with a concrete example]
... repeat 5-7 times ...

[OUTRO]
AI co-host: Thank you, [host], for sharing. The biggest thing I learned is...
Host: Thank you, [AI co-host]. If you have any questions, comment below...

Tip: Write the AI co-host's 7-10 questions in a document first, and have the host answer them in turn. Then generate the AI co-host's audio with ElevenLabs and splice it in with Descript.
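
Before generating audio, the script has to be split so that only the AI co-host's lines go to the TTS tool. The sketch below parses a script in the template's format, assuming the speaker labels are exactly "Host" and "AI co-host" and that section markers are lines wrapped in square brackets; the sample text is made up.

```python
SPEAKERS = ("Host", "AI co-host")

def parse_script(script):
    """Split a Q&A script into (speaker, text) turns."""
    turns = []
    for raw in script.splitlines():
        line = raw.strip()
        if not line or (line.startswith("[") and line.endswith("]")):
            continue  # skip blank lines and section markers such as [INTRO]
        speaker, sep, text = line.partition(":")
        if sep and speaker.strip() in SPEAKERS:
            turns.append([speaker.strip(), text.strip()])
        elif turns:
            turns[-1][1] += " " + line  # wrapped line continues the previous turn
    return [(speaker, text) for speaker, text in turns]

def cohost_lines(turns):
    """Texts that need to be synthesized with the co-host voice."""
    return [text for speaker, text in turns if speaker == "AI co-host"]

SAMPLE = """[INTRO]
Host: Hello everyone, today we talk about pricing.
AI co-host: Hi, I want to understand pricing
            from the host's point of view.

[BODY]
AI co-host: Why raise prices?
Host: Because value matters.
"""

for text in cohost_lines(parse_script(SAMPLE)):
    print(text)
```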


7. 1:10 repurposing pipeline (1 podcast → 10 short clips)

Workflow overview

[1] Record a 60-minute podcast (Riverside)
        ↓
[2] Automatic transcript (Descript / Riverside)
        ↓
[3] Identify hooks (10-15 strong lines)
        ↓
[4] Cut a 30-60s clip for each line (Opus Clip / Descript)
        ↓
[5] Add captions (auto-caption)
        ↓
[6] Distribute to 4 platforms

How to identify hooks (strong lines)

Look in the transcript for lines with these traits:

  • Bold statement: "I think 90% of Vietnamese people are getting this wrong"
  • Counter-intuitive: "Raising the product price actually increased revenue"
  • Specific number: "I went from 0 to 1 billion VND in 6 months"
  • Personal story: "My first startup cost me 200 million VND"
  • Actionable tip: "3 concrete steps to start today"

Target: 10-15 hooks per 60-minute podcast. Filter down to the 10 best clips.
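
A first pass over the transcript can be automated with simple pattern matching before a human picks the final 10. The sketch below scores each sentence by how many of the five traits it shows. The keyword lists are hypothetical English stand-ins, not a tested classifier; a Vietnamese transcript would need its own patterns.

```python
import re

# One rough pattern per trait; all keyword choices are illustrative assumptions
SIGNALS = {
    "specific_number": re.compile(r"\d"),
    "bold_statement": re.compile(r"\b(most|everyone|nobody|always|never)\b|\d+%", re.I),
    "counter_intuitive": re.compile(r"\b(actually|but|yet|instead)\b", re.I),
    "personal_story": re.compile(r"\b(I|my|me)\b", re.I),
    "actionable_tip": re.compile(r"\b(steps?|how to|start|tips?)\b", re.I),
}

def hook_score(sentence):
    """Count how many hook traits a sentence shows."""
    return sum(1 for pattern in SIGNALS.values() if pattern.search(sentence))

def top_hooks(sentences, limit=10):
    """Return the highest-scoring sentences, dropping those with no signal."""
    ranked = sorted(sentences, key=hook_score, reverse=True)
    return [s for s in ranked if hook_score(s) > 0][:limit]

transcript = [
    "Thanks for listening.",
    "I think 90% of people are doing this wrong",
    "3 concrete steps to start today",
]
for hook in top_hooks(transcript):
    print(hook_score(hook), hook)
```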

Tool stack

  • Descript: auto-clip; pick a line and cut it into a clip (free tier: 1 hour/month)
  • Opus Clip: AI finds viral moments automatically and auto-formats vertical/horizontal ($19-99)
  • Riverside Magic Clips: built-in feature of Riverside Pro ($29)
  • CapCut + ChatGPT: manual but free; paste the transcript and let ChatGPT analyze the hooks

Distribution across 4 platforms

| Platform | Format | Duration | Caption | Bonus |
|---|---|---|---|---|
| TikTok | 9:16 (1080×1920) | 30-60s | Bold caption on top | Trending audio overlay (low volume) |
| Instagram Reels | 9:16 | 15-90s | Clean subtitles, sans-serif font | Attractive cover image |
| YouTube Shorts | 9:16 | <60s | YouTube auto-caption | Title contains the keyword |
| LinkedIn audio | 1:1 (square video with audio) | 60-120s | Caption below | Narrate a long-form post thread (carousel) |

Pro tip: Tailor each clip to its platform with a different caption and cover image. This increases reach.
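
The duration limits in the table can be checked per clip before export. The sketch below encodes the four rows as data and lists which platforms a clip length fits. The dictionary layout and function names are illustrative; the YouTube Shorts limit is treated as strictly under 60s, as written in the table.

```python
# Duration windows in seconds, copied from the distribution table
PLATFORMS = {
    "TikTok": {"aspect": "9:16", "min_s": 30, "max_s": 60, "max_inclusive": True},
    "Instagram Reels": {"aspect": "9:16", "min_s": 15, "max_s": 90, "max_inclusive": True},
    "YouTube Shorts": {"aspect": "9:16", "min_s": 0, "max_s": 60, "max_inclusive": False},
    "LinkedIn": {"aspect": "1:1", "min_s": 60, "max_s": 120, "max_inclusive": True},
}

def fits(spec, duration_s):
    """True when a clip length falls inside the platform's window."""
    if duration_s < spec["min_s"]:
        return False
    if spec["max_inclusive"]:
        return duration_s <= spec["max_s"]
    return duration_s < spec["max_s"]

def eligible_platforms(duration_s):
    """Platforms whose duration window accepts this clip."""
    return [name for name, spec in PLATFORMS.items() if fits(spec, duration_s)]

for seconds in (45, 60, 100):
    print(seconds, eligible_platforms(seconds))
```

A 45s clip fits the three vertical platforms, which is why the 30-60s cut length in the workflow is a convenient default.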


8. Audio QA + Disclosure

5 QA criteria

  1. Clarity (10 points): the voice is clear, with no crackle and no stutter. Test: play it on a phone speaker and confirm it is still intelligible
  2. No clipping (10 points): peak does not exceed -1 dB. Check with: Audacity, Adobe Audition, Reaper
  3. No background noise (10 points): no fan, traffic, or neighbor noise. Tools: Krisp, NVIDIA Broadcast, Adobe Enhance Speech
  4. Consistent volume (10 points): loudness steady at -16 LUFS (podcast) or -14 LUFS (TikTok). Tool: the loudness meter in your DAW
  5. Natural pauses (10 points): pauses are sensible, not robotic. Manual review: listen through 3 times

Pass: 40+/50 points. Below 40: re-render or re-record.
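
The scoring rule is simple enough to script so that every episode is graded the same way. The sketch below sums the five 0-10 scores and applies the 40-point pass mark; the criterion keys and the "pass"/"redo" labels are illustrative names.

```python
CRITERIA = (
    "clarity",
    "no_clipping",
    "no_background_noise",
    "consistent_volume",
    "natural_pauses",
)

def qa_verdict(scores, pass_mark=40):
    """Return (total, verdict) for a dict of 0-10 scores, one per criterion."""
    missing = [c for c in CRITERIA if c not in scores]
    if missing:
        raise ValueError(f"missing scores: {missing}")
    for c in CRITERIA:
        if not 0 <= scores[c] <= 10:
            raise ValueError(f"{c} must be between 0 and 10")
    total = sum(scores[c] for c in CRITERIA)
    # Below the pass mark means re-render or re-record
    return total, ("pass" if total >= pass_mark else "redo")

episode = {
    "clarity": 9,
    "no_clipping": 8,
    "no_background_noise": 7,
    "consistent_volume": 8,
    "natural_pauses": 8,
}
print(qa_verdict(episode))
```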

Vietnam disclosure: when it is mandatory

| Situation | Disclosure | Placement |
|---|---|---|
| Commercial advertising | MANDATORY | Caption + end of audio ("This audio uses an AI voice clone") |
| Personal-brand podcast | RECOMMENDED, for transparency | Episode description |
| Fiction audiobook | NOT mandatory | Optional, in the end credits |
| News/education | MANDATORY | Start of audio + caption |
| Internal company content | NOT mandatory | n/a |

Disclosure caption template:

This audio uses AI voice clone technology
(ElevenLabs / Vbee / [tool name]). Content written and approved by [Your name].
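
The disclosure table and caption template can be combined into a small helper so the right caption is attached at publish time. In the sketch below the situation keys are hypothetical identifiers for the five table rows; it is a convenience for applying the table, not legal guidance.

```python
# situation key -> (disclosure level, placement), following the disclosure table
DISCLOSURE_RULES = {
    "commercial_ad": ("required", "caption + end of audio"),
    "personal_podcast": ("recommended", "episode description"),
    "fiction_audiobook": ("not required", "optional end credits"),
    "news_education": ("required", "start of audio + caption"),
    "internal": ("not required", None),
}

def needs_disclosure(situation):
    """True when the table marks disclosure as mandatory."""
    level, _placement = DISCLOSURE_RULES[situation]
    return level == "required"

def disclosure_caption(tool, author):
    """Fill in the caption template with the tool and author names."""
    return (
        f"This audio uses AI voice clone technology ({tool}). "
        f"Content written and approved by {author}."
    )

if needs_disclosure("commercial_ad"):
    print(disclosure_caption("Vbee", "[Your name]"))
```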

Full reference: references/ai-video-disclosure-vn.md, covering Decree 147/2024 (Nghi dinh 147/2024), the 3 disclosure tiers, and templates for each situation (also applicable to audio).


9. Quality checklist

Before publishing audio:

  • Voice clone sample is 3-5 minutes, recorded in a quiet room
  • Consent form signed (if cloning someone else's voice)
  • Correct use case: voiceover (energetic) / podcast (conversational) / audiobook (mid-tempo)
  • Voice settings match the use case (Stability/Similarity/Style)
  • Standard loudness: -14 LUFS (TikTok) / -16 LUFS (podcast/audiobook)
  • Peak does not exceed -1 dB (no clipping)
  • No background noise (passes Krisp/NVIDIA Broadcast)
  • Correct pacing: 180-220 wpm (TikTok) / 140-160 wpm (podcast) / 150-170 wpm (audiobook)
  • QA score 40+/50
  • Disclosure caption (for commercial use)
  • Repurpose plan: 1 podcast → 10 clips distributed across 4 platforms

Capabilities

skill, source-minhnv0807, skill-25-voice-clone-podcast, topic-agent-skills, topic-ai-agents, topic-ai-avatar, topic-anthropic, topic-chatgpt, topic-claude-code, topic-claude-plugin, topic-content-marketing, topic-copilot, topic-cursor, topic-dropshipping, topic-gemini

Quality

0.62 / 1.00

Deterministic score 0.62 from registry signals: indexed on github topic:agent-skills · 341 github stars · SKILL.md body (14,596 chars)

Provenance

Indexed from: github
Enriched: 2026-05-11 06:54:38Z · deterministic:skill-github:v1 · v1
First seen: 2026-05-11
Last seen: 2026-05-11
