Voice as an interface is fast becoming a major piece of the digital ecosystem.
The number of Americans using a voice assistant device is forecast to grow 129% to 36 million this year, according to market research firm eMarketer, while some analysts predict that voice will account for 200 billion searches per month by 2020.
Advances in machine learning have radically improved computers’ ability to recognise human speech. We’re not only seeing faster-than-expected consumer uptake of voice-enabled speaker devices – including the Amazon Echo, Google Home and Apple HomePod – we are also seeing the integration of voice assistants into fridges, cars, furniture and even the workplace.
Strong consumer uptake has led to a growing appetite among brands to create voice-enabled experiences that build relationships with users and drive revenues. From a brand perspective, it is therefore important to understand how the technology is likely to evolve now, next and in the future.
At Isobar we believe there are three key stages in the evolution of voice as an interface.
Establishing an ecosystem
We are currently witnessing a land grab in the voice hardware market, with Amazon, Apple, Google and others developing hardware solutions on their own platforms to lock in consumers for future services.
People who buy a Google Home speaker, which responds to the phrase “OK Google”, become part of the Google ecosystem. The search engine giant will funnel these participants to engage with YouTube, Google Maps and Assistant, for instance. The same principle applies to Amazon and Apple.
While all of them provide smart home control and integration, there are key differences between the voice services’ strengths and weaknesses.
Apple’s focus will likely be on audio quality – Apple HomePod, powered by Siri, is due to launch this December – whereas Amazon Echo leverages the efficiencies of ecommerce through Amazon and, more specifically, the customer data it harvests through Prime membership.
The strength of Google Home, meanwhile, is centred on its dominance of search and its wider ecosystem covering Gmail, calendars and maps.
These brands are intent on creating a regular set of touch points with us that enable them to bore their way into our lives, plant roots and ultimately drive revenues. The hardware – Amazon Echo, Google Home and Apple HomePod – is therefore just one part, or pillar, of the digital ecosystem these technology vanguards are seeking to create.
It’s also worth noting that Facebook has reportedly been working on a smart speaker that would use a new voice interface, although nothing has been officially confirmed.
The software battle
In the mid-term, consumers are unlikely to want multiple voice devices in their home, which suggests that voice will become more of a software battle than a hardware one. Eventually we will have one device that provides all the services we need from the different providers.
Building a walled garden of owned services – defined as controlling a stack of hardware and software – doesn’t feel like the smart strategy in the mid-term, as consumers want and expect to manage their digital lives from a single touchpoint, not multiple ones.
This is particularly true when it comes to the connected home, with one single technology provider unlikely to cover all the hardware needs of today’s consumer.
Amazon and Microsoft have already acknowledged this by announcing a partnership to better integrate their Alexa and Cortana voice assistants – Microsoft’s Bing search engine is already the default search capability on Alexa. This cross-platform initiative will allow Alexa users to access Cortana and vice versa, on several devices.
Amazon founder Jeff Bezos has also intimated greater integration with Apple and Google. “There are going to be multiple successful intelligent agents, each with access to different sets of data and with different specialized skill areas,” said Bezos in a press statement. “Together, their strengths will complement each other.”
This openness could reflect a concern within Amazon’s ranks that voice assistants from Apple and Google will become commonly used on smartphones (an area of weakness for the Bezos-led empire), namely the iPhone and Pixel ranges respectively, negating the need for a centralised voice assistant in the home: Alexa, in this instance.
It will be fascinating to see how this dynamic evolves going forward, as consumer wants clash with the commercial and brand principles of the world’s major technology organisations.
Where does voice fit in the future?
The long-term success of voice-enabled technologies depends on three factors: (1) the ability of Amazon, Apple and others to develop the technology to maturity; (2) the way in which voice will complement other emerging technologies; and (3) the appetite among brands and retailers to create new, functional or compelling experiences on the platforms.
It’s true that voice assistants need to mature to account for numerous challenges, ranging from managing the unpredictability of human behaviour to consumer trust issues and the infancy of experience design for voice technology as a whole.
Privacy and the recording of conversations are also concerns when the interface is invisible. The lack of a clear understanding of the data being captured, analysed and even shared is the elephant in the room for voice. Technology companies spearheading its development need to show greater transparency and take responsibility for data security in this regard.
Secondly, how voice assistants complement other technologies, both emerging and more established, will be crucial in determining the success of voice.
We are fast moving from an era dominated by digital screens to one where technology will be embedded in every object, without any visible user interface. This means that interfaces powered by gestures, voice, gaze and even brain activity will become more prevalent, especially as technologists seek to understand what will replace the smartphone interface.
In relation to the third point, the onus is on the retailer and brand community, supported by the likes of Amazon, Google and agency partners, to develop voice as a channel and find breakthrough use cases, like Pokémon Go from an augmented reality standpoint or Apple Pay with mobile payments.
Brands are already building voice-enabled technology to deliver all manner of experiences.
BMW, for instance, has already said it will incorporate Alexa into its cars beginning in 2018, letting drivers use voice commands to toggle the radio, get directions or check the news. Luxury apparel retailer Rebecca Minkoff, meanwhile, uses Alexa to answer key questions about its business in real time.
“How are online sales performing?”, “How’s our business doing across wholesale?” and “What’s selling well in Los Angeles?” are just some of the questions Uri Minkoff, CEO of Rebecca Minkoff, asks Alexa daily from his office.
It’s important to note that voice does not necessarily just have to be about improving convenience and reducing friction, however. There is also opportunity for brands to create compelling and entertaining experiences using this technology.
Diageo and Isobar teamed up to create one such experience at Cannes Lions in 2017, building a skill for Echo in which users were invited to select drinks designed by award-winning mixologist Rob Poulter.
Drinks included gin-based cocktails such as Elderflower Fitz and Negroni, and mocktails such as Seedlip Spring Garden and Seedlip Spiced Apple. Alexa then used taste and contextual triggers to promote discovery, aid recommendation and assist with ordering. Guests then received their drinks at the table without interruption to their experience.
For brands, voice technologies pose interesting challenges – how do we turn our tone of voice into an actual voice? How do we integrate customer services, brand communications and commerce? And of course, how do we best combine our human staff with the technology?
When it comes to building voice assistants, the emphasis should always be on teaching your skill to understand and speak to the user, rather than expecting the user to adapt to the skill.
It’s therefore key that brands and retailers map out conversations before a line of code has been written, and spend time thinking about the context in which the skill will be used. As with most experience-enabling technologies, if the application doesn’t solve a pain point or excite customers, it will not succeed.
Through building our own voice applications for Alexa, we have learnt that you can’t scrimp on testing. Every user will approach the conversation with voice-enabled devices in a different way, and it’s important that as many of these nuances as possible are captured and accounted for during development.
To date, no breakthrough use case has emerged for voice. The unexpected success of SMS when mobile phones were first launched illustrates how new technologies can quickly centre on a simple use case.
Curious brands that invest in understanding the technology early will therefore have a greater chance of finding these use cases and defining voice as a channel to market.
Alex Hamilton is head of insight at Isobar UK