https://ai.googleblog.com/2021/08/soundstream-end-to-end-neural-audio.html
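- The first neural network codec that works on both speech and music
ㅤ→ Runs in real time on a smartphone CPU
ㅤ→ Encodes a wide range of sound types, including high-quality audio, clean speech, noisy and reverberant speech, music, and environmental sounds
- The encoder and decoder are neural networks trained end-to-end, so the codec performs compression and enhancement jointly and delivers high-quality audio
ㅤ→ SoundStream at 3 kbps outperforms Opus at 12 kbps and approaches the quality of EVS at 9.6 kbps
ㅤ→ It uses 3.2x-4x fewer bits, which substantially reduces the amount of data transmitted
ㅤ→ Capable of high-quality noise removal
- To be integrated into Lyra, the low-bitrate speech codec released earlier this year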
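
The "3.2x-4x fewer bits" figure appears to follow directly from the bitrates quoted above (3 kbps against Opus at 12 kbps and EVS at 9.6 kbps). A minimal sketch of that arithmetic, assuming this reading is correct; the function names and the per-minute payload calculation are illustrative and not taken from the blog post:

```python
# Bitrates in kbps, taken from the comparison in the summary above.
SOUNDSTREAM_KBPS = 3.0
BASELINES_KBPS = {"Opus": 12.0, "EVS": 9.6}


def bitrate_ratio(baseline_kbps: float, codec_kbps: float) -> float:
    """How many times more bits per second the baseline codec spends."""
    return baseline_kbps / codec_kbps


def kilobytes_per_minute(kbps: float) -> float:
    """Payload size of one minute of audio (hypothetical helper).

    Assumes 1 kbps = 1000 bits/s and ignores packet and container overhead.
    """
    return kbps * 60 / 8


# Ratio of each baseline bitrate to the SoundStream bitrate.
ratios = {
    name: bitrate_ratio(kbps, SOUNDSTREAM_KBPS)
    for name, kbps in BASELINES_KBPS.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x the bits of SoundStream")
# Opus: 4.0x the bits of SoundStream
# EVS: 3.2x the bits of SoundStream

print(f"SoundStream: {kilobytes_per_minute(SOUNDSTREAM_KBPS):.1f} kB per minute")
```

Under these assumptions, one minute of audio at 3 kbps is about 22.5 kB of payload, against 90 kB at 12 kbps.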