AI Coding: Voice Cloning with OmniVoice

Posted 2026-04-08 Updated 2026-04-08

By boommanpro

4~5 min read

Overview

Official site: https://github.com/k2-fsa/OmniVoice

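OmniVoice is a zero-shot text-to-speech (TTS) model that supports 600+ languages. Its main features:

- 🌍 600+ languages: the broadest language coverage

- 🎭 Voice cloning: clone a voice from a short audio sample

- 🎨 Voice design: control the voice through attributes (gender, age, pitch, accent, etc.)

- ⚡ Fast inference: RTF (real-time factor) as low as 0.025, i.e. 40x faster than real time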
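To make the RTF figure concrete: the real-time factor is the synthesis time divided by the duration of the audio produced, so the speed relative to real time is its reciprocal. The snippet below is only a sketch of that arithmetic; the function names `rtf` and `speedup` are made up for illustration and are not part of OmniVoice.

```shell
#!/bin/sh
# Real-time factor: synthesis time divided by the duration of the audio produced.
# Usage: rtf <synthesis_seconds> <audio_seconds>
rtf() {
    awk -v t="$1" -v d="$2" 'BEGIN { printf "%.3f\n", t / d }'
}

# Speed relative to real time is the reciprocal of the RTF.
# Usage: speedup <rtf>
speedup() {
    awk -v r="$1" 'BEGIN { printf "%.0f\n", 1 / r }'
}

speedup 0.025   # prints 40
rtf 0.25 10     # prints 0.025
```

In other words, at an RTF of 0.025, ten seconds of speech would take about a quarter of a second to synthesize.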

Getting started

git clone https://github.com/k2-fsa/OmniVoice.git

Prompt: What is this project, and how do I start it locally, run it, and complete the demo?


export HF_ENDPOINT="https://hf-mirror.com" && omnivoice-demo --ip 0.0.0.0 --port 8001

Demo page

Demo URL: http://localhost:8001/

Command-line usage

# Voice cloning (the sample text means "Hello, this is a test.")
omnivoice-infer --model k2-fsa/OmniVoice \
    --text "你好，这是一个测试。" \
    --ref_audio ref.wav \
    --output output.wav

# Voice design (四川话 = Sichuanese dialect)
omnivoice-infer --model k2-fsa/OmniVoice \
    --text "你好，这是一个测试。" \
    --instruct "female, 四川话" \
    --output output.wav
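
For more than one sentence, the voice-cloning command above can be wrapped in a loop. The following is a minimal sketch, not something shipped with OmniVoice: the function name `clone_lines`, the `INFER_CMD` variable, and the `line_N.wav` naming scheme are all hypothetical, and only the flags already shown above (`--model`, `--text`, `--ref_audio`, `--output`) are used.

```shell
#!/bin/sh
# Hypothetical helper: synthesize one WAV file per non-empty line of a text
# file, cloning the voice from a single reference clip.
# INFER_CMD defaults to the omnivoice-infer command used above; set
# INFER_CMD=echo to print the commands instead of running them (dry run).
# Usage: clone_lines <text_file> <ref_audio> <out_dir>
clone_lines() {
    text_file=$1
    ref_audio=$2
    out_dir=$3
    n=0
    mkdir -p "$out_dir"
    # "|| [ -n "$line" ]" also handles a last line without a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        # Skip empty lines.
        [ -z "$line" ] && continue
        n=$((n + 1))
        # stdin is redirected so the command cannot swallow the remaining lines.
        "${INFER_CMD:-omnivoice-infer}" --model k2-fsa/OmniVoice \
            --text "$line" \
            --ref_audio "$ref_audio" \
            --output "$out_dir/line_$n.wav" < /dev/null
    done < "$text_file"
}
```

A dry run such as `INFER_CMD=echo; clone_lines lines.txt ref.wav out` prints each command first, which is a cheap way to check the arguments before any audio is generated.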
