Technology Google Voice Assistant

Google denies foul play in search but admits it needs to give consumers more control over voice data


By Jessica Goodfellow | Media Reporter

July 3, 2017 | 11 min read

As Google faces scrutiny for anti-competitive practices in written search, what does this mean for its big bet on voice?

Google's Jason Spero, VP of performance media, on the future of voice, search, and data

News that the European Union has fined Google a record €2.4bn for allegedly fixing search results to favour its own products is the latest in a string of blows to the internet giant aimed at creating a fairer internet ecosystem.

Google’s response when faced with criticism has been to insist that it is ‘consumer-first’. It claims to be organised purely by an algorithm that serves the best experience for the consumer.

These anti-competition issues are intensifying as it invests in voice search, where Amazon has already been called out by WPP founder Sir Martin Sorrell for serving up its own products first (Amazon has denied this).

Recognising that consumers are migrating towards voice, Google admits to The Drum that it needs to do more to give them control over their data.

In an interview with The Drum at Think with Google 2017, Jason Spero, vice president of performance media at Google, spoke at length about the opportunity afforded to brands. According to Google's own survey of 3,000 people, it has seen a significant uptick in users: nearly half (42%) of those who had started using voice in the last six months were doing so on a daily basis.

That said, there are still hurdles to overcome before voice becomes as ubiquitous as written search. Over half (57%) of respondents would use voice search more frequently if it recognised more complex commands, with a similar proportion (58%) stating they would like more detailed results when using voice search.

Spero believes that machine learning, a technology that gets smarter the more it is used, is helping to overcome many of the early bumps in the road. Google uses this as a reason to allow its voice assistant, Home, to listen constantly to conversations happening around it, even before the trigger ‘Ok, Google’ is used. The same goes for Amazon’s Echo.

Security experts have voiced concerns about this 'always on' concept and the risk associated with the private data it stores, at a time when consumers are increasingly demanding control over what data they give away. Should the same rules be applied to voice?

The answer is yes, and Spero admits it’s a simple fix.

How exactly are consumers using voice search?

The first is: for all the progress we have made in typing, voice is a more natural input. People are turning to their phones at moments when voice can get them over a hurdle, moments when they wouldn't have turned to their phone before.

The second thing that is interesting is that we are watching the query streams: what people are saying. Consumers are still discovering what natural language means, and as a result it creates this new opportunity, because they are going places we haven't gone with them before. It creates an opportunity for brands, as well as for us, to serve that behaviour better, because the machine can handle natural language; it's just that historically we haven't fed it natural language. That creates a much more natural interface.

Will there always be a space for text?

At any point where you introduce something new, people wonder whether the new paradigm will replace the old. For the foreseeable future the vast majority of how people interact will still be through text and through the interfaces we are familiar with today.

Accuracy in voice search was historically a problem with devices having trouble understanding different voices — is that no longer an issue? And, if not, when do you think we’ll reach that point?

This is a place where I am proud of Google's leadership. We invested early not just in my version of English but in the many languages spoken and all the ways people talk around the world. It is a place where machine learning is helping dramatically, and where I think Google is unparalleled. I don't want to declare victory by a long shot; I think it is asymptotic and we will continue to work on it.

I think we are well past the point of quality where we know that this is going to work. Will we have a small error rate? Sure. But I think that that is going away.

How does machine learning help with this?

The more that we see people using these devices, the more the machine learning can do its job and recognise what they want. The essence of machine learning is that you need data to train a model and to understand, at the end, whether you got it right. This problem gets easier as you have more data to work with.

There is a tipping point here: the better it gets, the more people use it; and the more people use it, the better it gets.
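The train-and-measure loop Spero describes can be sketched in a few lines of Python. This is a toy illustration, not Google's actual pipeline: it stands in for speech recognition with a trivial model (estimating a hidden value from noisy samples), but it shows the underlying point that more usage data generally means a more accurate model.

```python
import random

random.seed(42)

TRUE_VALUE = 0.7  # the "right answer" the model is trying to learn


def train(n_samples):
    """Fit a trivial model: estimate the hidden value from n noisy samples.

    Each sample stands in for one piece of usage data, such as a voice query.
    """
    samples = [TRUE_VALUE + random.gauss(0, 0.5) for _ in range(n_samples)]
    return sum(samples) / len(samples)


def error(estimate):
    """Measure how far the trained model landed from the truth."""
    return abs(estimate - TRUE_VALUE)


# More usage -> more training data -> the estimate tends to get closer
# to the truth, which is the virtuous cycle described above.
for n in (10, 1_000, 100_000):
    print(f"{n:>7} samples -> error {error(train(n)):.4f}")
```

The errors printed for the larger sample counts will generally be far smaller than for the small one, mirroring the tipping-point dynamic: better accuracy attracts more users, whose queries supply the data that improves accuracy further.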

Consumers have voiced concerns about the fact that voice devices are always listening to their conversations unless they switch the device off at source. What are the privacy issues associated with this, and will you introduce a feature that allows users to control how much of their life is recorded?

The device will recognise different voices. My device will respond to me, but it will respond to my daughter differently. That is important because there should be rules about what I get and what my daughter gets. You want to have moments when the device recognises you are engaging it, and moments when the device is not engaged because you are having a private conversation. That is an engineering challenge that is solvable, so that the choice is not as binary as unplugging the device.

We should be able to give every household confidence that the device is there for them when they want it, but is not engaged in their dialogue when they don’t want it to be.

Surely an easy way to do that is to give the consumer the option to say ‘stop listening now’?

Yeah, we should do that. I don't know the roadmap, but as a general principle this device is there to serve the consumer's need, and we should be able to give them the ability to act on that.

That is why you have the 'OK, Google' trigger moment: to give the consumer confidence that in the moments where you are having a family conversation, the device isn't part of the experience.

For consumers' peace of mind, if they go to My Account in Google, they can see in My Activity what has been registered from Google Home, that is, what they said once it was activated by 'Ok, Google'.

So how does voice factor into Google’s future revenue generation plans?

The short answer is I don't know. We are not yet focused on any paradigm other than really driving the marketing opportunities for our customers that we know today, which are: how do you use the explosion in video content to serve marketers' needs, how is search evolving across all these touchpoints, and how do we make that easier for marketers every day?

Our energy is going into driving user experiences, figuring out what people want to do on these devices and how to remove friction. We are not putting any energy into business models around any of that today.

When does Google intend to introduce paid-for search opportunities on voice-based search?

We don't know; that is not where we are putting our energy just now. I'm not going to tell you we are not going to have a search experience, but I haven't spent a lot of time contemplating it.

During your talk you used the phrase 'better experiences drive preference', but it seems voice assistants, however advanced technologically, still have issues to iron out, like error messages. Does this risk turning consumers off that technology?

The reason we are highlighting how important UX is, is that the consumer is delighted by almost everything we give them on the phone. We know the things consumers want out of these devices. We are focused on the explosion in engagement on mobile devices and the fact that most brands haven't yet built the experience that those consumers expect.

The consumers that choose new experiences have a different expectation of, and tolerance for, things going wrong. If I download software when it is still in alpha, I understand that it is going to crash. People understand that in the early days of things the experience isn't well worn, the trail may not be paved, and I think that is ok.

The Google Home devices and the assistant in the phone have got to an error rate where it doesn't interrupt my experience at all.

Google talks about page load times inhibiting UX; often this is due to too much code on a publisher site. But publishers are looking to monetise their digital propositions as much as possible to offset declines in advertising. How is AMP helping with this?

We are big believers in quality journalism and are trying, through our Digital News Initiative, to partner with the best journalists out there. The best way we can help is with monetisation, and we are constantly trying to improve monetisation for those publisher sites.

Early in the spring we launched Ads for AMP. One of the challenges of AMP was that while it improved speed, which improves how many people choose that news outlet, there wasn't an easy way to monetise it, and that was a hurdle. We always had a plan; we just had to execute on it. It is not clear that you will be better off from a monetisation standpoint, but you won't be worse off, and you will have made a dramatic improvement in speed.

Both Google and Amazon talk a lot about the user experience being first and foremost. Others argue that Google favours its own products, which goes against that, as evidenced in this week's EU fine. How do you respond to that?

What I can tell you about is the culture at Google, which is unrelenting in looking at data for what satisfies the consumer, grounded in organising the world's information and bringing it to them. So if they are looking for a pair of Levi's jeans, what is the best way, of their choosing, for them to get them? It is about providing, bar none, the best experience it can, in an algorithmically consistent way: the best shopping experience at that time, the best way to find a Taylor Swift song, or the best place to test drive a Honda car, for example.

All of that is built in an extremely mathematical way culturally. To the extent that one brand has a better experience, the algorithm will prioritise that in what is put in front of the user.

I feel very confident that we are putting the customer's needs first, and if that is what is organising our effort then I don't think we have to be too concerned. There is a simple North Star: building for the user experience. That makes everything else fairly simple.
