Voxtral can adapt to a custom voice from a sample shorter than five seconds, picking up even accent and intonation.
It smoothly switches between languages without changing how the voice sounds, making it great for dubbing or real-time translation.
It is fast too, with a time-to-first-audio (TTFA) of 90 ms for a 10-second sample of 500 characters, and it runs on everything from smartwatches to laptops.
According to Mistral vice president Pierre Stock, the company built a small speech model that fits on a smartwatch, a smartphone, a laptop, or other edge devices and costs a fraction of anything else on the market while offering state-of-the-art performance.