AI Audio/Visual AI Marketing

Are AI voiceovers the future?

PinPoint Media


Open Mic article

This content is produced by a publishing partner of Open Mic.


October 5, 2021 | 6 min read

Chances are you are familiar with AI voices.

But for many years, from Siri to Alexa, they have not sounded all that human. That is now changing. With advances in technology, AI voices are becoming more flexible and realistic, and are therefore being used more than ever.

Are AI voiceovers the future? And what impact will this have on the media industry?

AI voice improvements

The latest breakthroughs in the quality of AI voices are down to deep learning, according to the MIT Technology Review. Deep learning is a subfield of machine learning concerned with algorithms called artificial neural networks, which are inspired by the structure and function of the brain. Synthetic voices such as Siri and Alexa have traditionally used individually recorded words, which are combined to create sentences. The end result can often sound strange, as the voices lack cadence, pauses and breathing. Adding these elements can be a highly manual process.

The deep learning approach takes a few hours of recorded audio and uses an algorithm to learn and replicate the individual patterns of a voice. This means the generated audio can include words that were not present in the initial recordings, making these new AI voices much more flexible. This presents marketers with more opportunities.

But if we all switched to AI voices, would voiceover artists become obsolete?

The impact on voiceover artists

A key step when creating synthetic voices via deep learning is having original audio to use. As such, a lot of businesses offering AI voices are working in collaboration with voiceover artists to create AI versions of their voices. This approach could lead to voiceover artists being paid for the time spent creating initial recordings, as well as potentially receiving royalties each time their AI voice is used.

However, there is already evidence of voiceover artists being taken advantage of when it comes to synthetic voices. The original voice of Siri, Susan Bennett, was contracted by ScanSoft and didn’t know the recordings were for Apple. Although she gave ScanSoft permission to use her recordings anywhere, she was paid for the original recordings and not for their continued use.

Bev Standing, another voiceover artist, is currently suing TikTok after alleging the company used her voice for its text-to-speech feature without compensation or consent. In the industry, the worry is that contracts and fees have not yet caught up with the technology available, so voiceover artists can unknowingly sign away use of their voice forever for a minimal fee.

There is also the added concern that, with unclear consent and contracting, voices could be used in a negative way that could bring voiceover artists into disrepute, whether that means being used in a sexualised video game, in a context with heavy swearing and bad language, or simply by a company that they might not choose to work with.

Should we be using AI voices?

Just because AI voices are available, should we be using them? More importantly, are they the best option? With synthetic voices there are a few things to think about:

What do you need a voiceover for? Is it a supplementary or core aspect of your content?

For example, at PinPoint Media we choose to use synthetic voices for the audio versions of our blogs. We do this to provide an alternative to reading. We produce so many blogs that it wouldn’t be cost- or time-effective for us to record our own audio versions, so a synthetic voice fills that gap. For this particular scenario we’re excited that these voices may become more realistic in time.

In comparison, when we are creating a voiceover for an internal video, authenticity and emotion are key elements of the process. Although AI voices have improved in these areas, we can achieve a higher quality voiceover by recording it in-house with members of our team who can talk confidently about the subject matter.

Equally, when we are adding a voiceover to a video or animation for our clients, emotion is often a key component in this process. It's crucial to be able to direct a voiceover artist to get the right end result to ensure it blends seamlessly with the visual content. This is something that our clients depend on us to get right to ensure the project has the right level of emotional depth and engages with their target audience.

Senior Animation Producer Jessica Barder had this to say: “More often than not a voiceover is key to leading and driving our animated content. There are many occasions when I will jump on a call to direct the voiceover artist who will have been selected by our client. Call me traditional but I do not believe for a second that you would get the same bespoke quality of delivery from AI voiceovers. Working collaboratively with the talent, you are able to achieve a great result. There is definitely a place for AI voiceovers, but the project and its purpose would need to be carefully considered.”

If you are thinking of using synthetic voices, you need to consider your project to weigh up whether they are the right fit. Although the cost and time impact of recording audio may well be a factor in your decision making, so should the importance of quality in your end result. Previously we wrote about why you should use professional voiceover artists and the many benefits that can be gained from working with trained professionals. It’s worth checking out our reasoning to help your decision making.

Are AI voices the future?

Are AI voices the future? Yes and no. There is already a place for synthetic voices, and as AI voices improve over time we expect the contexts in which they are used to grow. However, we believe there will always be a place for professional voiceover artists. First, because they are needed to help ensure AI voices continue to improve, and second, because if you need a voiceover that engages on an emotional level with your target audience, you need a human.

Here at PinPoint Media we produce data-driven content, from video and animation to voice, audio and content strategy. Visit our website to find out more about our services.
