Spotify will translate podcasts into other languages using AI

By Ty Roush

Spotify will start using artificial intelligence to translate podcasts into other languages, the company announced Monday as part of a partnership with OpenAI, making it the latest company to build generative AI into its products.

Key Takeaways

Spotify announced Monday that it is releasing a pilot for its “Voice Translation” feature, which will translate podcasts into other languages while matching the original speaker’s voice and style.
The feature was created in partnership with OpenAI, which announced Monday that it was releasing new “voice and image capabilities” for ChatGPT, allowing users to speak with the AI chatbot, which responds with voices from a model capable of generating “human-like audio from just text and a few seconds of sample speech.”
Spotify’s new feature is based on OpenAI’s voice transcription tool Whisper, which transcribes English speech and translates other languages into English.
The pilot includes three podcast episodes, drawn from shows including the Lex Fridman Podcast, Armchair Expert and The Diary of a CEO with Steven Bartlett, available in Spanish to both subscribed and unsubscribed users, with episodes available in French and German “in the coming days and weeks.”
All translated podcasts will be available in Spotify’s “Voice Translations Hub,” which will be updated with additional episodes and podcasts “over the coming weeks and months.”

Big Number

100 million. That’s how many users “regularly” listen to podcasts on Spotify, according to the company.

Key Background

Other companies have started to use generative AI for their products in recent months. Meta announced earlier this year that it would release AudioCraft, a tool that allows users to create AI-generated music and sounds. The Financial Times reported in August that Google and Universal Music Group were in discussions over whether to license artists’ melodies and vocals for AI-generated music.

Google also announced earlier this month that it had integrated its AI chatbot Bard into its other applications, including YouTube, Gmail and Drive. Amid concerns over privacy and safety, OpenAI said it was gradually releasing its image and voice capabilities for ChatGPT, after the company warned the new features could present new risks, including “the potential for malicious actors to impersonate public figures or commit fraud.”