by Merkle|Periscopix

Forget big data: you want big analysis


You probably don’t have big data. The NHS has big data. So do Ordnance Survey and Facebook. Big data isn’t measured by entries in a database or by gigabytes, but by the number of datacentres, the complexity, and how poorly it all fits together.

If you’re a marketer, you have access to a lot of data about your website, your incoming traffic and your customers. Using it well is difficult.

Be careful what you do with the data you have
The temptation towards data mining is strong, but you need to resist it. Data mining starts with the data and looks for relationships in it. If you have plenty of variables, you can plug them into a large and (hopefully) sophisticated regression model to see how strong the relationships are. In fact, with a computer you can run every possible regression to see which relationships are strongest. It sounds great.

The danger comes from just how many relationships there are if you have a lot of data. Running a statistical test typically takes the following form: you choose your hypothesis (I believe that A and B are strongly related), then you look at your data to judge the chance of A and B showing a relationship in your sample even if they’re not really related. Basically, you’re trying to reduce the chance of false positives.

How badly can this go wrong?
By setting a very small error tolerance, you reduce the chance of false positives, but you also raise the bar that your data has to clear before even a real relationship shows up as true.

If you test 100 possible relationships with a 5% error tolerance, and none of them is real, you can expect in the region of five false positives. These can look stronger in your sample than any actual relationship. By running a data mining process in this way you risk choosing a relationship that looks very strong in your data set, but may not be real.
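
To make the "five in a hundred" figure concrete, here is a minimal simulation sketch. It is an illustration under stated assumptions, not the author's method: every pair of variables is pure random noise, each test is a Pearson correlation with an approximate p-value from the Fisher z-transformation, and the names `pearson_r`, `p_value`, `count_false_positives` and `average_false_positives` are hypothetical helpers invented for this example.

```python
import math
import random
from statistics import NormalDist


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def p_value(r, n):
    """Approximate two-sided p-value for r via the Fisher z-transformation."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def count_false_positives(rng, n_tests=100, n_rows=50, alpha=0.05):
    """Test n_tests unrelated variable pairs; count how many look 'significant'."""
    hits = 0
    for _ in range(n_tests):
        # Assumption: both variables are independent noise, so no relationship is real.
        xs = [rng.gauss(0, 1) for _ in range(n_rows)]
        ys = [rng.gauss(0, 1) for _ in range(n_rows)]
        if p_value(pearson_r(xs, ys), n_rows) < alpha:
            hits += 1
    return hits


def average_false_positives(n_runs=100, seed=0):
    """Repeat the 100-test experiment n_runs times and average the false positives."""
    rng = random.Random(seed)
    return sum(count_false_positives(rng) for _ in range(n_runs)) / n_runs


if __name__ == "__main__":
    print("Average false positives per 100 tests:", average_false_positives())
```

Any single run of 100 tests may produce more or fewer than five hits. The average over many runs should sit close to five, even though none of the simulated relationships exists.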

If your search marketer tells you they want to run some data mining, be careful. AdWords provides 60 variables at keyword level alone. That’s over one quintillion possible relationships. You’ll end up with false positives measured in the quadrillions.
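
The article does not say how the quintillion figure was reached. One reading that reproduces it, offered here as an assumption, is to count every combination of the 60 variables (each one either in or out of a model), which gives 2^60. The short sketch below works through that arithmetic with a 5% error tolerance; the variable names are illustrative only.

```python
# Assumption: a "possible relationship" is any subset of the 60 keyword-level variables.
n_variables = 60
possible_relationships = 2 ** n_variables

# Expected false positives at a 5% error tolerance, if none of the relationships is real.
# Integer arithmetic keeps the result exact.
expected_false_positives = possible_relationships * 5 // 100

quintillion = 10 ** 18
quadrillion = 10 ** 15

print(f"Possible relationships: {possible_relationships:,}")
print(f"Expected false positives at 5%: {expected_false_positives:,}")
print("More than a quintillion relationships:", possible_relationships > quintillion)
print("False positives in the quadrillions:", expected_false_positives > quadrillion)
```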

So what can you do?
Luckily, quite a lot. It’s not as bad as it seems. There’s no need to engage in data mining. Instead you can go back to the old favourite: the scientific method. Create a hypothesis, test the hypothesis, and reject it if the data doesn’t support it. By formulating the hypothesis based on logic, you avoid the danger caused by testing too many hypotheses.
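
As a rough sketch of what testing one hypothesis chosen in advance might look like, the example below compares a click-through rate before and after a single change using a two-proportion z-test. The choice of test, the function name `two_proportion_p_value` and all the numbers are assumptions made for illustration; none of them come from the article or from real campaign data.

```python
import math
from statistics import NormalDist


def two_proportion_p_value(successes_a, trials_a, successes_b, trials_b):
    """Two-sided p-value for the difference between two proportions (z-test)."""
    rate_a = successes_a / trials_a
    rate_b = successes_b / trials_b
    # Pooled rate under the assumption that the change made no difference.
    pooled = (successes_a + successes_b) / (trials_a + trials_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / trials_a + 1 / trials_b))
    z = (rate_b - rate_a) / std_err
    return 2 * (1 - NormalDist().cdf(abs(z)))


if __name__ == "__main__":
    # Hypothetical figures: clicks and impressions before and after one planned change.
    p = two_proportion_p_value(successes_a=400, trials_a=10_000,
                               successes_b=470, trials_b=10_000)
    alpha = 0.05  # error tolerance chosen before looking at the data
    if p < alpha:
        print("The data supports the hypothesis; make the next change.")
    else:
        print("The data does not support the hypothesis; reject it.")
```

Because only one hypothesis is tested, the 5% error tolerance keeps its usual meaning, which is the point of choosing the hypothesis before looking at the data.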

As a marketer, you actually know a lot about the relationships already. Google tell you, Yahoo tell you, and there are thousands of best-practice guides about email marketing, response rates and so on that all purport to tell you how your variables interact. If you bid more on AdWords, you can expect to win more auctions. That relationship is clear. So when you’re trying to interpret your own data, you can start with that framework.

Next steps?
From there, your options expand a lot. You might not have Big Data, but you probably have a lot of data. So consider your process one of feedback and response.

You’ve formulated your plan, you’ve made a change as a result, and you expect a response. If you get that response: Huzzah! You can then make your next change. By using an iterative process you learn a lot, but mostly you actually improve your marketing along the way.

This kind of real-time feedback is unique to digital. Traditionally a media planner would create the campaign, then after it was finished prepare a report about how closely it matched the plan. But now they have tools at their disposal to analyse everything (placements, demographics, context) and to work out how each performed in relation to the others, and even how they overlapped. And best of all, there’s no need to wait until the end of the campaign. After the first few days it might be possible to cut a huge chunk of wasted spend, and make the majority of the campaign more efficient.

Lots of little data
Good use of data is vital, but data mining is bad use of data. As a marketer most of the options are obvious and easy. So don’t waste your time thinking about big data and waiting for it to be useful. Use the huge amounts of little data you have, and start making things better right away.

Alistair Dent
Director of Paid Media

Tel: 020 7234 0500
Email: enquiries@periscopix.com
Web: www.periscopix.com
Twitter: @Periscopix