voicechat2 - Open source voice chat infra that rivals GPT-4o

gravedigger (70) in #steemhunt • 3 months ago

Hunter's comment

AI voice chat infrastructure that uses WebSockets. It can achieve voice-to-voice latency as low as 300ms (comparable to GPT-4o) without a unified voice codec. Everything runs on a single high-end consumer GPU.
On a 7900-class AMD RDNA3 card, voice-to-voice latency is in the 1-second range with the following models:

Whisper large-v2 (Q5)
Llama 3 8B (Q4_K_M)
tts_models/en/vctk/vits (Coqui TTS default VITS models)
On a 4090, using Faster Whisper with faster-distil-whisper-large-v2, the latency can be cut down to as low as 300ms.
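
The figures above are end-to-end numbers. If the stages run strictly one after another, voice-to-voice latency is roughly the sum of the speech-to-text, LLM, and text-to-speech times. Below is a minimal Python sketch of how one might time each stage. It is not code from voicechat2: run_pipeline, voice_to_voice_latency, and the fake_* stage functions are hypothetical stand-ins, and a real setup would call Whisper, Llama 3, and a TTS model in their place.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass
class StageTiming:
    name: str
    seconds: float


def run_pipeline(
    audio: bytes, stages: List[Tuple[str, Callable[[Any], Any]]]
) -> Tuple[Any, List[StageTiming]]:
    """Feed the input through each stage in order and record wall-clock time per stage."""
    timings: List[StageTiming] = []
    data: Any = audio
    for name, fn in stages:
        start = time.perf_counter()
        data = fn(data)
        timings.append(StageTiming(name, time.perf_counter() - start))
    return data, timings


def voice_to_voice_latency(timings: List[StageTiming]) -> float:
    """Total latency for a strictly sequential pipeline: the sum of all stage times."""
    return sum(t.seconds for t in timings)


# Hypothetical stand-ins for the real models (assumptions, for illustration only).
def fake_stt(audio: bytes) -> str:
    return "hello"  # a real stage would transcribe the audio


def fake_llm(text: str) -> str:
    return f"You said: {text}"  # a real stage would generate a reply


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")  # a real stage would synthesize audio


STAGES = [("stt", fake_stt), ("llm", fake_llm), ("tts", fake_tts)]

if __name__ == "__main__":
    reply_audio, timings = run_pipeline(b"\x00\x01", STAGES)
    for t in timings:
        print(f"{t.name}: {t.seconds * 1000:.3f} ms")
    print(f"total: {voice_to_voice_latency(timings) * 1000:.3f} ms")
```

Timing each stage separately shows where a hardware or model swap (for example, moving to Faster Whisper) actually saves time.
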
The installation instructions (see the link below) are for Ubuntu LTS and assume you've already set up ROCm or CUDA.

I recommend you use conda or (my preference) mamba for environment management. It will make your life easier.

Link

https://github.com/lhl/voicechat2?ref=producthunt

Steemhunt.com

This is posted on Steemhunt - A place where you can dig products and earn STEEM.

crypto-heart (61) 3 months ago

Nice Open source voice chat infra that rivals GPT-4o.

solute (62) 3 months ago

Very cool Open source voice chat infra that rivals GPT-4o.

jswit (68) 3 months ago

Upvoted! Thank you for supporting witness @jswit.

steemhunt (76) 3 months ago

Congratulations!

We have upvoted your post for your contribution within our community.
Thanks again and look forward to seeing your next hunt!

Want to chat? Join us on:

Discord: https://discord.gg/mWXpgks
Telegram: https://t.me/joinchat/AzcqGxCV1FZ8lJHVgHOgGQ
