Cargo cult analytics

This is a talk I gave at a Hacks/Hackers meetup in Berlin on August 21, 2013.

Thanks for inviting me to speak at this Hacks/Hackers meetup. My name is Stijn Debrouwere. I’m a Knight-Mozilla OpenNews fellow and I’m loosely affiliated with the Guardian’s data science team in London. We do online audience research, build tools and run experiments that help our developers and journalists hopefully make more informed decisions about what they build and write.

I was born and raised in Belgium, but not the German speaking part of Belgium, so my thanks as well for organizing this event in the language of Britney Spears and Dan Brown instead.

Some of you have probably heard of cargo cults before, but for those who haven’t, allow me to briefly explain to you what they are.

Back in 1942, after the attack on Pearl Harbor had forced the United States into war with Japan, the US realized that they’d have to establish forward operating bases in the Pacific to defend against Japanese advances. So they set up hospitals, cleared roads and built airstrips in places like Vanuatu and the Solomon Islands.

Colonization by the French and other European nations had long ago given many Pacific islands a taste of western civilization, but nothing quite like the American war machine.

It must have been a strange sight to the islanders, to see their homes transformed into high-tech forts in a matter of weeks, depots and armories filled to the brim in preparation of a long, hard war.

The people of Vanuatu and other islands would get to partake of some of the riches, as soldiers would share food and medicine with the natives in hopes of recruiting them as guides and laborers.

After the war ended in 1945, the Americans sold various of their bits and bobs to the islanders, dumped and destroyed much beside, and then the mighty US army was gone.

The islanders didn’t know how the Americans had procured all of that stuff, and so fast too, but they’d sort of gotten used to it and wanted more. They figured, if the Americans can secure favors from the gods and get precious cargo delivered right to their doorstep, all doing little more than waving their hands at the sky, then so can we.

How hard can it be, said the islanders. All we have to do is copy these funny foreigners in khaki and the cargo will come. So on an island like Tanna you’d have people meticulously maintaining airstrips and drop zones. They’d put on uniforms and wave around makeshift signal cones, the ones you use to tell a plane where and how to land. Cultists would construct and man communications shacks and talk into radios with nobody at the other end of the line. They had gotten a taste of modern technology, medicine and entertainment and they wanted more.

But try as hard as they might, to Tanna the cargo never came.

Now let’s talk about the news industry.

The news industry has been looking for its own savior for quite a while, too. (Might it be Warren Buffett? Or Jeff Bezos?)

From our shrinking newsrooms we see all these internet startups create millions and billions in revenue out of thin air, sometimes with the most ridiculous of business models. And we get to thinking: how hard can it be? In Silicon Valley they’ve figured it all out for us, so now we don’t have to figure it out for ourselves. We can have a piece of that pie. We just need to be like them.

And so we turned to data, and we turned to analytics. Because that’s what Silicon Valley did and you can’t argue with the results.

Web analytics were pioneered by a company from Utah called Omniture in 1996 and a company from San Diego called Urchin in 1998. Omniture is now Adobe Analytics, Urchin became Google Analytics.

You’ve probably used Google Analytics before so you know what it looks like. It’s a smorgasbord of numbers. You’ve got your demographics, your pageviews, referrers, time on site, time on page and frankly not much time before your head explodes trying to figure out what everything means.

I’m a fan, though. If you’re a web developer you’re probably addicted to the “technology” tab, which shows you how many people still use Internet Explorer 6 and 7. Ugh.

When The Guardian decided it wanted to become a global brand, one of the first things we did was fire up good ol’ Omniture to figure out which English-speaking countries already had a loyal Guardian online readership. Canada? Maybe. Australia. Ding.

But that’s not how most people use Google Analytics. If you’re like most people, you don’t stray very far from the dashboard you get when you log in. You stare and squint and hope insight will magically manifest itself.

Or maybe that’s not true and maybe you know exactly what you’re looking for. Pageviews for example. Because pageviews imply ad impressions, and ad impressions imply revenue.

I used to work for a local newspaper and TV station in Cedar Rapids, Iowa. It gets really cold there during winter. Snow piles up in huge mounds alongside the road that can take weeks to melt back down. I’ve heard some of you complain about winter in Berlin. You are wimps.

Iowans love local news. They love it because they get the latest forecasts, traffic reports, a heads-up on school closings and a myriad of other information that makes your life so much easier when the world around you has frozen over.

When the weather is particularly bad, pageviews are particularly high.

But then March becomes April and April turns to May, and pageviews go down, and then go down some more.

But as of yet it’s impossible to manipulate the weather, so seeing those pageviews go down is not something you can really do anything about. You can set all the performance targets you want, but you can’t very well fire your editor for not turning rain into snow.

Pageviews is a vanity metric: something that looks really important but that we can’t act on and that tells us nothing about how well we’re actually doing, financially or otherwise.

There’s another reason why Google Analytics doesn’t always deliver on its promises. There are perhaps a couple thousand newspapers in Europe, but there are millions of businesses and stores and most of those are now online. Unsurprisingly, then, most of the advanced features in our analytics tools cater to the needs of marketeers and online stores. Funnel analysis, conversion rate tracking, those sorts of things. Features that help news organizations are an afterthought.

But we don’t care that our dashboards don’t actually help us.

There’s nothing like a dashboard full of data and graphs and trend lines to make us feel like grown ups. Like people who know what they’re doing. So even though we’re not getting any real use out of it, it’s addictive and we can’t stop doing it.

But after a while you just don’t get quite the same high from your dashboards that you used to. You’ve habituated. We still look at Google Analytics, but at this point metrics like “unique monthly visitors” bore us. They were always useless, but now they’ve stopped being fun, too.

Lucky for us, like vodka, there’s now a new flavor of analytics to try every week.

There’s enough social media analytics tools to merit listicles that helpfully introduce you to the top 8.

If you use Disqus for comments, ScribbleLive for live blogs or UserVoice for feedback, you can sleep easy knowing that all of those have built-in analytics.

All of them show graphs and list numbers that look very, very interesting, and with which you will do very, very little. (It’s okay, we’ve all been there.)

A couple of years ago live analytics was the new hot shit, pioneered by Chartbeat. Is there useful knowledge to be had? Absolutely, unequivocally, yes.

Imagine your newsroom has been pumping out articles about the papal election, yet it turns out that the article readers are clicking on is one about the civil war in Syria. Thank God, you can finally stop writing about the goddamned pope because people don’t care anyway. You can commission a new piece on Syria, perhaps even tailored to people’s exact search terms so you’re not just writing about what they care about, you’re answering their questions too. I think that’s incredibly useful information to have, information you can act on, right now.

Except I’ve never seen a news organization that has a workflow that would allow them to routinely respond to their readers’ behavior right now. Content farms seem to be the only content producers with that capability.

If the pace at which you receive new metrics outstrips the pace at which you can change your newsroom’s priorities, then what’s the point? People may care about Syria today, but tomorrow they will have moved on.

This is how live analytics are being leveraged in the newsroom:

“Hmm. It looks like the horse meat scandal’s exploded, it’s all over the web and our readers can’t get enough of it. Good to know. I’ll give a shout to the editor next week when he’s back from vacation so we can maybe do a follow-up.”

Good thing live analytics were there to save the day, aye?

Now, you can’t really blame analytics tools for the stupidity of their users. There are proper uses for all of the tools I’ve mentioned from Google Analytics to Chartbeat to Attensity.

It didn’t take long for tech startups to figure out you can actually get people to pay money for tools that don’t even measure anything but instead make the measurements from other tools look pretty and professional. Cue dashboards like Cyfe, Geckoboard and Leftronic.

You’re supposed to put these dashboards up on a wall, on a huge plasma screen. Because of course numbers are twice as persuasive if you make them twice as big.

But in our quest for wasting more money, faster, there’s a new contender and it takes the crown. Big data.

Big data is any kind of data that can’t be stored or analyzed on a single machine – sometimes because it’s too big, sometimes because it’s too complex. The human genome can be encoded in about 5 megabytes of compressed data, but it can be big data too depending on what you want to do with it.

Most of the analytics we’ve been using thus far have been aggregates or samples. Aggregates boil the behavior of many different users down to a single number: the average time on site, all of today’s pageviews. Sampling allows us to create those aggregates based on 1 or 5 or 10 percent of all our users, rather than the whole bunch. Aggregation and sampling make data smaller.
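To make that concrete, here’s a minimal sketch (with made-up visit data, not anything from a real analytics tool) of how aggregation and sampling both shrink raw event data:

```python
import random

# Hypothetical raw events: one record per visit, with time on site in seconds.
random.seed(42)
events = [{"user": i % 1000, "time_on_site": random.randint(5, 600)}
          for i in range(100_000)]

# Aggregation: boil 100,000 records down to a single number.
avg_time = sum(e["time_on_site"] for e in events) / len(events)

# Sampling: estimate the same aggregate from 5% of the records.
sample = random.sample(events, k=len(events) // 20)
sampled_avg = sum(e["time_on_site"] for e in sample) / len(sample)

print(f"full average: {avg_time:.1f}s")
print(f"5% sample:    {sampled_avg:.1f}s")
```

The sampled estimate lands within a few seconds of the full average, at a twentieth of the cost, which is why analytics vendors get away with sampling in the first place.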

But sometimes we want to look at how each individual user interacts with a product, each action they take and why they take it, what they click on and what they ignore. That’s what big data allows us to do.

As behooves a proper cargo cult, we’re not doing this big data thing just because, we’re doing this big data thing because we have seen it work to great effect in other industries.

By looking at what books and movies each and every customer buys and clustering people together who have similar tastes, Amazon is able to provide product recommendations that are pretty damn good, without any human intervention.

Many tech startups have at this point implemented some form of what is called lifecycle marketing. With the help of machine learning, statistical techniques or sometimes nothing more than a calendar, companies figure out when their customers are likely to need help, when they’re likely to spend more, when they’re likely to cancel their subscription. With that information, they send out tailored emails engineered to keep you using their product. Perhaps an extra discount or an offer to enroll you in a training course. The effects on user retention can be nothing short of miraculous.

Speaking of big data, you may have heard that story about Target, an American supermarket chain, which can figure out with reasonable certainty, based on nothing but one’s purchase history, whether or not one of their female shoppers is pregnant, even the stage of the pregnancy. Target will then tailor the advertising they send you: diapers instead of beer.

Big data is making Target and other retailers so much money it’s ridiculous.

What we forget is that Amazon and Target had very specific problems they wanted to solve. How can we make better recommendations so people buy more? How can we mail out coupons that don’t immediately get thrown into the trash?

If you want to make big data work, you have to be really specific about what you’re trying to achieve. With big data, mucking about doesn’t just mean wasted effort, it means hefty bills for computer clusters and for storing those terabytes worth of data. And hefty bills without much to show for it is exactly what I fear might happen as more and more news organizations get on the “me too” train and experiment with these techniques.

But the news industry doesn’t care much for cost-benefit analyses, I suppose.

I’ve mentioned three false starts: vanity metrics like pageviews, live analytics nobody can act on, and big data without a question to answer.

There is so much potential in data and I hope in the future we can do better.

It’s not actually that hard to do better.

Here’s one weird trick I learned from Eric Ries. No, it’s actually more like a four-step program.

  1. figure out what is important to your organization, what your goals are
  2. think of a couple of ways in which you could move the needle on one of those goals, pick a project
  3. assemble a team that will actually execute said project
  4. then, and only then, think about a good metric the team can use to see whether they’re making progress.

The metrics you’ll need will depend on your business, but as a starting point I like Dave McClure’s list of five startup metrics for pirates:

  1. acquisition: finding new users
  2. activation: getting users to give your product a try
  3. retention: making sure those users stick around
  4. referral: have your loyal users invite others
  5. revenue: hopefully you get to make some money from all this

Acquisition, activation, retention, referral, revenue, or, as an acronym: aarrr! Those are usually the big five things you have to worry about.

Good analytics focus on stable numbers you can act on.

During big events like the Olympics a lot of news organizations’ pageviews go up as people are hungry for information. In contrast, the number of return visitors probably doesn’t change much, and if anything the ratio of regular to new visitors might go down a little.

It may sound counterintuitive, but stability in the face of statistical noise makes a metric like the ratio of daily active users to monthly active users (the DAU/MAU ratio) a much nicer number to work with than pageviews. Pageviews can tell you how much money you’re making today. DAU/MAU tells you how sticky and addictive your website is, which is what determines sustainability and potential for growth in the long run.
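The DAU/MAU ratio is simple enough to compute by hand from a visit log. A sketch, using a tiny invented log rather than real traffic data:

```python
from datetime import date

# Hypothetical visit log: (user_id, date) pairs for one month.
visits = [
    ("alice", date(2013, 8, 1)), ("alice", date(2013, 8, 2)),
    ("alice", date(2013, 8, 21)),
    ("bob",   date(2013, 8, 21)),
    ("carol", date(2013, 8, 5)),
]

def dau(visits, day):
    """Distinct users active on a given day."""
    return len({user for user, d in visits if d == day})

def mau(visits):
    """Distinct users active at any point in the log's window."""
    return len({user for user, _ in visits})

ratio = dau(visits, date(2013, 8, 21)) / mau(visits)
print(f"DAU/MAU: {ratio:.2f}")  # 2 of the 3 monthly users showed up today
```

A ratio near 1 means your monthly audience comes back nearly every day; a ratio near 0 means most of your “audience” are drive-by visitors.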

More important to remember is that metrics only make sense when you’re going to do something with them. If you just have them to have them and don’t use them to gauge your progress in making your website, your application or your writing better, then you might as well not have them – even if they’re stable metrics that measure the right thing.

Metrics are for doing, not for staring.

Here’s an example.

At the Guardian, a problem for both journalists and designers is that we produce a huge amount of content. Often more than 500 articles on a weekday. A lot of it sort of gets lost. Journalists hate it because they spend lots of time writing something that few people may ever see. Designers hate it because it’s hard to showcase the Guardian’s content when there’s so much of it.

So what are we doing about it?

Well, for one, our next gen web team is working on design templates that make it easier for people to get different angles and background information about big, important news stories. Our new mobile website (and future cross-platform responsive website) makes it easier for people to know what’s on offer and what to read next with the help of story packages. It also makes it easier to browse the site and read something random: just swipe!

We hope these things will make people stick around for longer, and read a wider variety of stories. And those are exactly the metrics we use right now to evaluate and tweak these features.

Back to the four-step program:

  1. Overarching goal: more loyal readers, reading more
  2. Project: encourage readers to spend time on less prominent articles
  3. Team: multiple designers, coders and testers working together
  4. Metrics: pages per visit, pages per day, # of underperforming articles in a story package

Not only do we keep track of those metrics on an ongoing basis, many of the changes the next gen web team make are A/B tested. Those A/B tests compare a new or improved feature against the website as it currently is, by giving some people the new version and some people the old version.

A/B tests save you from having these endless discussions and meetings. Do the metrics go up? Keep the feature. Do they go down? Throw it out. Don’t know? Stop talking and try it.
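The arithmetic behind “do the metrics go up?” can be as modest as a two-proportion z-test. A sketch with invented numbers (not the Guardian’s actual tooling or data):

```python
import math

def ab_test(conversions_a, visitors_a, conversions_b, visitors_b):
    """Two-proportion z-test: is variant B's rate different from A's?"""
    p_a = conversions_a / visitors_a
    p_b = conversions_b / visitors_b
    # Pooled rate under the null hypothesis that A and B perform the same.
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (p_b - p_a) / se
    return p_a, p_b, z

# Hypothetical numbers: old homepage (A) vs a story-package variant (B),
# counting visits that clicked through to a second article.
p_a, p_b, z = ab_test(400, 10_000, 460, 10_000)
print(f"A: {p_a:.1%}  B: {p_b:.1%}  z = {z:.2f}")
# |z| > 1.96 is roughly "significant at the 5% level": keep the feature.
```

The point isn’t the statistics, it’s the discipline: agree on the metric before the meeting, then let the test end the argument.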

On the data science team, we’ve been helping these guys out a little bit. We’re mapping out when best to publish articles and video to make sure they get seen. In other words (not mine): is the website ever full?

The web provides us with an infinite amount of space. But our readers do not provide us with an infinite amount of their attention. So we’re thinking, would it help if we published more on weekends (when we traditionally publish less) or does that not really make a difference? Is there a perfect time of day? Or maybe it’s not so much about when we publish a story but about how and when we promote it on social media? What kinds of stories stay popular for long amounts of time, which ones disappear off the radar?

I don’t have the answers but I’m going to find out. But not by looking at a dashboard. If you want real answers, you need to do real work.

And we’re tying it all back to revenue, so we can answer questions like how much does it cost us to write a movie review, and how much does it pay? It sounds straightforward (and crucial) but I will eat my left sneaker if any of you work at a news organization that has basic revenue information like that.

It’s early days, still, for data science at the Guardian. We fuck up too. In fact, right behind me at the office there’s a giant TV that displays nothing but the number of pageviews and a big red or green arrow to indicate whether they’re going up or down. Yeah. Um. D’oh.

But The Guardian is also slowly but surely figuring out how to do analytics for real.

That’s why we call it data science: ask questions and draw up hypotheses, collect the data needed to test those hypotheses, and verify.

Sometimes we need big data, often we use Adobe Omniture like everyone else.

We have a live analytics platform, but our modus operandi is the long-term research project.

And I honestly can’t recall the last time I’ve looked at our pageviews. I know it wouldn’t get me anywhere.

What we are finally figuring out is that if we want to get serious about creating better news websites and doing better journalism, it’s not necessarily Silicon Valley’s latest tools and trends we need, it’s their methodology of measurement, iteration and adaptation. A scientific way of looking at what works and what doesn’t. Analytics we can act on rather than just numbers we can look at.

As an industry, we’ve historically been really, really bad at analytics. The news industry needs to grow up. Never measure just because you can. Measure to learn. Measure to fix.

Because if you continue to pretend, true insight and information you can act on will continue to escape you. In the Melanesian Pacific, they’re still waiting on cargo that never will come.

Stijn Debrouwere writes about statistics, computer code and the future of journalism. He used to work at the Guardian, Fusion and the Tow Center for Digital Journalism, and is now a data scientist for hire. Stijn is @stdbrouw on Twitter.