Integrate real-time text-to-speech with Sonic-3.5, Cartesia
Cartesia comes up in two recent community posts. One developer lists it, alongside ElevenLabs, as a voice provider for Primeta.ai, a third-party tool that attaches customizable 3D avatars with personalities and voices to Claude Code sessions; the avatars and the unauthorized channel belong to that tool, not to Cartesia. Another developer building a Somali voice agent reports that neither ElevenLabs nor Cartesia offers production-ready support for Somali, a low-resource language. Neither post discusses Cartesia's pricing, and with no reviews recorded, overall sentiment is unformed.
Mentions (30d)
0
Reviews
0
Platforms
2
Sentiment
0%
0 positive
Industry
information technology & services
Employees
120
Funding Stage
Venture (Round not Specified)
Total Funding
$191.0M
Pricing found: $0 / month, $1, $4 / month, $5, $39 / month
Anyone working on TTS/ASR for low-resource African or Cushitic languages?
Been building a Somali voice agent. Somali has ~25M speakers but as far as I know there's no production-ready model support anywhere: not ElevenLabs, not Cartesia, nothing.

What I tried:
- MMS-TTS (facebook/mms-tts-som): workable baseline but not production quality
- Fish Speech V1.5 LoRA: promising but pronunciation wasn't clean enough
- XTTS V4: best results so far, trained on ~300 hours of Somali speech data to 235K steps. Main gotcha: no [so] token in the tokenizer since Somali uses Latin script, had to proxy with [en]

TTS pronunciation is getting there. The harder problem is the LLM layer: most models have seen very little Somali text so comprehension and natural response generation is weak. Whisper also struggles with Somali transcription accuracy.

Curious if anyone else is working on Somali, Amharic, Tigrinya or similar Cushitic languages. What's actually working?

submitted by /u/Expensive-Aerie-2479
I built a Claude Code Channel (unauthorized) that allows you to access multiple sessions via web through customizable 3D avatars with personalities and voices.
It's in beta, free. It's not an authorized channel so there are some warnings you'd have to accept. It's been a fun build. You can have multiple Claude Code sessions running in various projects on your computer and Primeta.ai will connect to them all via MCP and can communicate with the sessions. You can choose which persona you want to inject into the session and change them at will. There are 3 default personas and you can create new ones with 3D models and voices (ElevenLabs or Cartesia) and personality prompts. I created a YouTube video where I created a sweet grandma assistant and a mean sassy robot assistant.

submitted by /u/Beautiful_Reveal_859
Yes, Cartesia offers a free tier.
Key features include: Realistic voice modulation, Emotion-infused speech synthesis, Acronym and initialism recognition, Streaming capabilities for real-time interaction, Multi-language support, Custom voice creation options, User-friendly API for developers, Adaptive learning for personalized responses.
Cartesia is commonly used for: Customer support voice agents, Interactive voice response systems, Educational tools for language learning, Audiobook narration, Virtual assistants for smart devices, Entertainment applications in gaming.
Cartesia integrates with: Slack, Microsoft Teams, Zoom, Salesforce, Shopify, WordPress, Google Cloud, Amazon Web Services, Twilio, Discord.
Ben Firshman
CEO at Replicate
1 mention