Microsoft’s VALL-E 2: First Time Human Parity in Zero-Shot Text-to-Speech Achieved | Synced
Source: Synced | AI Technology & Industry Review
In a new paper, VALL-E 2: Neural Codec Language Models are Human Parity Zero-Shot Text to Speech Synthesizers, a Microsoft research team presents VALL-E 2, the latest advancement in neural codec language models. According to the team, the model marks a milestone in zero-shot text-to-speech (TTS) synthesis by achieving human parity for the first time.