A Roundup of Recent TTS Models
AI魔法学院
2024-04-25

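From XTTS to Pheme and from OpenVoice to VITS: for each model, this roundup gives the source repository, model weights, license, fine-tuning support, supported languages, paper, and online demo.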

 

 

| Name | Repo | Weights | License | Fine-tuning | Languages | Paper | Demo |
|---|---|---|---|---|---|---|---|
| XTTS | [Repo](https://github.com/coqui-ai/TTS) | [HF Hub](https://huggingface.co/coqui/XTTS-v2) | [CPML](https://coqui.ai/cpml) | [Yes](https://huggingface.slack.com/archives/C05QZTQJUDD/p1705418518292139) | Multilingual | [Technical notes](https://erogol.substack.com/p/xttsv2-notes) | [HF Space](https://huggingface.co/spaces/coqui/xtts) |
| TorToiSe-TTS | [Repo](https://github.com/neonbjb/tortoise-tts) | [HF Hub](https://huggingface.co/jbetker/tortoise-tts-v2) | [Apache 2.0](https://github.com/neonbjb/tortoise-tts/blob/main/LICENSE) | [Yes](https://git.ecker.tech/mrq/tortoise-tts) | English | [Technical report](https://arxiv.org/abs/2305.07243) | [HF Space](https://huggingface.co/spaces/Manmay/tortoise-tts) |
| VITS / MMS-TTS | [Repo](https://github.com/huggingface/transformers/tree/7142bdfa90a3526cfbed7483ede3afbef7b63939/src/transformers/models/vits) | [HF Hub](https://huggingface.co/kakao-enterprise) / [MMS](https://huggingface.co/models?search=mms-tts) | [Apache 2.0](https://github.com/huggingface/transformers/blob/main/LICENSE) | [Yes](https://github.com/ylacombe/finetune-hf-vits) | English | [Paper](https://arxiv.org/abs/2106.06103) | [HF Space](https://huggingface.co/spaces/kakao-enterprise/vits) |
| Pheme | [Repo](https://github.com/PolyAI-LDN/pheme) | [HF Hub](https://huggingface.co/PolyAI/pheme) | [CC-BY](https://github.com/PolyAI-LDN/pheme/blob/main/LICENSE) | [Yes](https://github.com/PolyAI-LDN/pheme#training) | English | [Paper](https://arxiv.org/abs/2401.02839) | [HF Space](https://huggingface.co/spaces/PolyAI/pheme) |
| OpenVoice | [Repo](https://github.com/myshell-ai/OpenVoice) | [HF Hub](https://huggingface.co/myshell-ai/OpenVoice) | [CC-BY-NC 4.0](https://github.com/myshell-ai/OpenVoice/blob/main/LICENSE) | No | ZH + EN | [Paper](https://arxiv.org/abs/2312.01479) | [HF Space](https://huggingface.co/spaces/myshell-ai/OpenVoice) |
| IMS-Toucan | [Repo](https://github.com/DigitalPhonetics/IMS-Toucan) | [GH release](https://github.com/DigitalPhonetics/IMS-Toucan/tags) | [Apache 2.0](https://github.com/DigitalPhonetics/IMS-Toucan/blob/ToucanTTS/LICENSE) | [Yes](https://github.com/DigitalPhonetics/IMS-Toucan#build-a-toucantts-pipeline) | Multilingual | [Paper](https://arxiv.org/abs/2206.12229) | [HF Space](https://huggingface.co/spaces/Flux9665/IMS-Toucan) |
| Matcha-TTS | [Repo](https://github.com/shivammehta25/Matcha-TTS) | [GDrive](https://drive.google.com/drive/folders/17C_gYgEHOxI5ZypcfE_k1piKCtyR0isJ) | [MIT](https://github.com/shivammehta25/Matcha-TTS/blob/main/LICENSE) | [Yes](https://github.com/shivammehta25/Matcha-TTS/tree/main#train-with-your-own-dataset) | English | [Paper](https://arxiv.org/abs/2309.03199) | [HF Space](https://huggingface.co/spaces/shivammehta25/Matcha-TTS) |
| pflowTTS | [Unofficial Repo](https://github.com/p0p4k/pflowtts_pytorch) | [GDrive](https://drive.google.com/drive/folders/1x-A2Ezmmiz01YqittO_GLYhngJXazaF0) | [MIT](https://github.com/p0p4k/pflowtts_pytorch/blob/master/LICENSE) | [Yes](https://github.com/p0p4k/pflowtts_pytorch#instructions-to-run) | English | [Paper](https://openreview.net/pdf?id=zNA7u7wtIN) | Not Available |
| StyleTTS 2 | [Repo](https://github.com/yl4579/StyleTTS2) | [HF Hub](https://huggingface.co/yl4579/StyleTTS2-LibriTTS/tree/main) | [MIT](https://github.com/yl4579/StyleTTS2/blob/main/LICENSE) | [Yes](https://github.com/yl4579/StyleTTS2#finetuning) | English | [Paper](https://arxiv.org/abs/2306.07691) | [HF Space](https://huggingface.co/spaces/styletts2/styletts2) |
| VALL-E | [Unofficial Repo](https://github.com/enhuiz/vall-e) | Not Available | [MIT](https://github.com/enhuiz/vall-e/blob/main/LICENSE) | [Yes](https://github.com/enhuiz/vall-e#get-started) | NA | [Paper](https://arxiv.org/abs/2301.02111) | Not Available |
| HierSpeech++ | [Repo](https://github.com/sh-lee-prml/HierSpeechpp) | [GDrive](https://drive.google.com/drive/folders/1-L_90BlCkbPyKWWHTUjt5Fsu3kz0du0w) | [CC-BY-NC-SA 4.0](https://github.com/sh-lee-prml/HierSpeechpp/blob/main/LICENSE) | No | KR + EN | [Paper](https://arxiv.org/abs/2311.12454) | [HF Space](https://huggingface.co/spaces/LeeSangHoon/HierSpeech_TTS) |
| Bark | [Repo](https://github.com/huggingface/transformers/tree/main/src/transformers/models/bark) | [HF Hub](https://huggingface.co/suno/bark) | [MIT](https://github.com/suno-ai/bark/blob/main/LICENSE) | No | Multilingual | [Paper](https://arxiv.org/abs/2209.03143) | [HF Space](https://huggingface.co/spaces/suno/bark) |
| EmotiVoice | [Repo](https://github.com/netease-youdao/EmotiVoice) | [GDrive](https://drive.google.com/drive/folders/1y6Xwj_GG9ulsAonca_unSGbJ4lxbNymM) | [Apache 2.0](https://github.com/netease-youdao/EmotiVoice/blob/main/LICENSE) | [Yes](https://github.com/netease-youdao/EmotiVoice/wiki/Voice-Cloning-with-your-personal-data) | ZH + EN | Not Available | Not Available |
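The listing above can also be read as a small dataset. As a rough sketch of how one might shortlist candidates from it, the Python snippet below transcribes the name, license, fine-tuning, and language fields for each model and filters on them. The `TTSModel` class, the `shortlist` helper, and the `PERMISSIVE` set are hypothetical names introduced here for illustration. Treating only Apache 2.0 and MIT as permissive is an assumption of this sketch, not a statement from the original listing, so check each linked license before relying on it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TTSModel:
    name: str
    license: str
    fine_tunable: bool
    languages: str


# Rows transcribed from the listing above (name, license, fine-tuning, languages).
MODELS = [
    TTSModel("XTTS", "CPML", True, "Multilingual"),
    TTSModel("TorToiSe-TTS", "Apache 2.0", True, "English"),
    TTSModel("VITS/MMS-TTS", "Apache 2.0", True, "English"),
    TTSModel("Pheme", "CC-BY", True, "English"),
    TTSModel("OpenVoice", "CC-BY-NC 4.0", False, "ZH + EN"),
    TTSModel("IMS-Toucan", "Apache 2.0", True, "Multilingual"),
    TTSModel("Matcha-TTS", "MIT", True, "English"),
    TTSModel("pflowTTS", "MIT", True, "English"),
    TTSModel("StyleTTS 2", "MIT", True, "English"),
    TTSModel("VALL-E", "MIT", True, "NA"),
    TTSModel("HierSpeech++", "CC-BY-NC-SA 4.0", False, "KR + EN"),
    TTSModel("Bark", "MIT", False, "Multilingual"),
    TTSModel("EmotiVoice", "Apache 2.0", True, "ZH + EN"),
]

# Assumption of this sketch: only these license strings count as permissive.
PERMISSIVE = {"Apache 2.0", "MIT"}


def shortlist(models, language=None):
    """Return names of fine-tunable models under a license in PERMISSIVE.

    If `language` is given (e.g. "ZH"), keep only rows whose language field
    contains that tag or is listed as "Multilingual".
    """
    picked = []
    for m in models:
        if m.license not in PERMISSIVE or not m.fine_tunable:
            continue
        if (
            language is not None
            and language not in m.languages
            and m.languages != "Multilingual"
        ):
            continue
        picked.append(m.name)
    return picked


if __name__ == "__main__":
    print(shortlist(MODELS))
    print(shortlist(MODELS, language="ZH"))
```

The language match is a plain substring check against the tags as written in the listing ("ZH + EN", "KR + EN", "English"), so it only works with those exact tags.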

 

Reference:

https://github.com/Vaibhavs10/open-tts-tracker/tree/main

 

 

 

 

Source: https://mp.weixin.qq.com/s/c2sICIdX3lcFBgpZS4uEzg
