IA for news websites: a link dump

written by 
A link dump companion to my IA for news websites series.

I promised I’d do a link dump about information architecture for news websites, so here it is. For a real link dump you should check out my Delicious pages for tags like journalism, newspapers and ia, but I’ve compiled a list with some articles that really stand out for me.

Why do it?

My one-page bible on the what and why and how of an information-centric approach to the design of newspaper websites is still Adrian Holovaty’s 2006 post A fundamental way newspaper sites need to change. As Holovaty explains: “I’ve only met a handful of people who became journalists because they like information. And I think that helps explain why there have been some major cultural issues in the journalism world in the age of the Internet”.

Holovaty’s thinking is the result of his work over at lawrence.com and The Washington Post, which means that it’s equal parts vision and a progress report on actual work being done right now. We need to talk more about the merits of metadata, not just among ourselves, but with a lay audience in mind. Real examples like the ones Holovaty provides really bring some life to the discussion in a way a technical overview cannot.

Ben Hammersley did a little series in December ’09 and January ’10 about why media orgs need to adopt a metadata-heavy create-once-publish-everywhere strategy:

So why do everything you can to keep metadata intact? Because it’s from this information that new products can be automatically created, at a scale and rapidity that would be impossible otherwise. With every piece of metadata that you don’t throw away, you gain a factor more potential ways of slicing through your content and delivering it as a separate product, simply as a result of a database lookup.
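To make that "database lookup" idea a bit more tangible, here's a minimal sketch in Python. The stories, the field names (`beat`, `place`, `people`) and the `slice_by` helper are all made up for illustration; a real newsroom would run this kind of query against its own database. The point is only that each metadata field you keep gives you another way to cut the same archive.

```python
from collections import defaultdict

# Hypothetical archive; every title, place and field name here is invented.
STORIES = [
    {"title": "Council approves budget", "beat": "politics",
     "place": "Springfield", "people": ["Person A"]},
    {"title": "Home team wins opener", "beat": "sports",
     "place": "Springfield", "people": ["Person B"]},
    {"title": "Mayor visits school", "beat": "politics",
     "place": "Riverton", "people": ["Person A"]},
]


def slice_by(stories, field):
    """Group story titles by each value of one metadata field."""
    groups = defaultdict(list)
    for story in stories:
        values = story.get(field, [])
        # Accept both single values ("politics") and lists (["Person A"]).
        if not isinstance(values, list):
            values = [values]
        for value in values:
            groups[value].append(story["title"])
    return dict(groups)


# Each retained field yields another "product": a place page, a person page...
by_place = slice_by(STORIES, "place")
by_person = slice_by(STORIES, "people")
```

Drop the `place` field at publishing time and the place pages simply can't be generated anymore, which is Hammersley's argument in a nutshell.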

Martin Moore noted a few months ago that “On a news organization’s list of priorities, publishing articles as ‘linked data’ probably comes slightly above remembering to turn the computer monitors off in the evening and slightly below getting a new coffee machine”, and he outlines ten tangible benefits to publishing in linked, structured data.

The Basic Unit of Information

Talk about “the basic unit of information”, that is, how to structure and present news content, is actually four discussions rolled into one.

  1. A discussion about information architecture (IA): how should we structure our data, and how can we leverage that structure to provide a better experience for our readers, contextualize stories and make our content keep on giving (e.g. by making it repurposable)?
  2. A discussion about knowledge management: how can we make a platform for reporters that makes it easy to browse through past stories, build up a knowledge base about a beat, and manage contacts?
  3. A discussion about ethics: should we make available the basic ingredients of our work, meaning the facts, quotes, documents and recordings (when those are on the record) we use to write our stories?
  4. A discussion about the semantic web and linked data: an effort to make the web machine-readable.

If you’re interested in the discussion about IA, you should read, err, my series, as well as one that Daniel Jacobson did over at Programmable Web. Jacobson outlines some of the stuff they do at National Public Radio. You can also read an interview with Daniel at How Software Is Built. I also like Dan Conover’s scenario describing what the job of a reporter will look like in ten years.

Information architecture for news websites is in large part a matter of grasping and modelling the problem domain of the beats you cover. If you’re technically inclined, definitely read Domain-Driven Design by Eric Evans. It’s a huge tome of a book, but reading it pays off. Reading up on software requirements gathering, although it may seem tangential to the issues at hand, can also teach you a few techniques on how to structure your thoughts and turn them into a workable system. If you’re not a programmer, that’s okay. Read Dan Roam’s The Back of the Napkin instead.

I’ve written a little bit about the ethical dimension to BUOI myself. See some of the blogposts by Jeff Jarvis, Chuck Peters, Matt Thompson, Dave Winer and some others for more on that (usually interspersed between remarks about contextualization).

Knowledge management in journalism is something that is on a lot of people’s minds, as they think about how to improve beat coverage using wikis and crowdsourcing. But turning the newsroom into a knowledge-gathering operation has always been more of an ideal than an actual battle plan. It should be discussed more prominently, though, because knowledge management will become ever more important as we adapt to economic realities and try to do more with less. Maybe Minnesota Public Radio and its Public Insight Network will start some fires — it looks really cool. Read more about that over at Daniel Bachhuber’s blog.


Information-centric design isn’t easy. Not everything in the world is as neat as we’d like it to be, which means that structure can sometimes strangle rather than liberate our content. Structure that doesn’t do right by its problem domain won’t work. Because the world is smushy, the way we manage our content has to be a little bit smushy too.

Two good reads on structure, metadata and information architecture on the worldwide web are Ambient Findability by Peter Morville and Everything is Miscellaneous by David Weinberger. Morville and Weinberger don’t necessarily always agree with each other, which is what makes it so interesting to read both of these books together. There’s a fun back-and-forth between the two over at Peter’s blog you should check out.

I’ve done my best to keep mentions of the semantic web to a minimum in my series on information architecture, because I think IA and semweb deserve to be treated separately. Then again, they’re both about bringing order to chaos and about making some part of our content machine-readable, so some of the debates and doubts surrounding the semantic web definitely provide meaningful background info for information architects.

Jonathan Stray gave a good critique of the general trouble with some dream scenarios for structured information and relationships in news reporting. You can read his thoughts both in the comments to my first post on tagging and relationships and on his own blog.

To get more of an idea why some people still balk at mentions of the semantic web, see what Timothy Falconer has to say about the matter. Weinberger quotes him saying “it’s not the Semantic Web’s fault that some people are compulsive”. And indeed, when adopting an information-centric strategy, we should beware of zealotry and keep a firm eye on the return on investment we expect to get. Check out Tim Berners-Lee on what ontologies and relationships can and cannot do. It also pays to read a discussion over at Reddit where the developers behind The Onion argue that tags might be a bit too smushy.

Navigation and search

Martin Belam wrote an in-depth look at the navigation of nine British newspaper websites circa 2009. He has also written about keywording at the Guardian. Very insightful stuff.

After returning from BarCamp NewsInnovation in Philadelphia, Lauren Rabaino recently gave a welcome overview of some basic design patterns in how news sites present the news on their front pages. She also talks a bit about how current ways of navigating news websites don’t cut it:

We still categorize stories under sports, arts, news, opinion, etc. because this is how the print product was laid out. But is that what’s relevant to readers? I know that when I browse news, I don’t care about the topic. I care about the timeliness and its relevance to me, no matter what “section” it falls within. I don’t necessarily want to read about crime and sports, but if it’s happening within a three block radius of me, then I do care.
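Here's a rough sketch of what that kind of section-agnostic filtering could look like in Python. It assumes every story carries coordinates and a publication time; the field names, the sample stories, the 0.5 km stand-in for "three blocks" and the two-day freshness window are all my own assumptions, not anything Rabaino proposes.

```python
import math
from datetime import datetime, timedelta


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, using the haversine formula."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def nearby_recent(stories, reader_lat, reader_lon, now,
                  radius_km=0.5, max_age=timedelta(days=2)):
    """Keep stories that are close and fresh, newest first, whatever the section."""
    hits = [
        s for s in stories
        if now - s["published"] <= max_age
        and distance_km(reader_lat, reader_lon, s["lat"], s["lon"]) <= radius_km
    ]
    return sorted(hits, key=lambda s: s["published"], reverse=True)


# Hypothetical stories around a reader standing at (40.0, -75.0).
NOW = datetime(2010, 5, 1, 12, 0)
STORIES = [
    {"title": "Burglary on Elm Street", "section": "crime",
     "lat": 40.001, "lon": -75.0, "published": datetime(2010, 5, 1, 9, 0)},
    {"title": "Stadium deal approved", "section": "sports",
     "lat": 40.1, "lon": -75.0, "published": datetime(2010, 5, 1, 10, 0)},
    {"title": "Old pothole complaint", "section": "news",
     "lat": 40.002, "lon": -75.001, "published": datetime(2010, 4, 21, 8, 0)},
    {"title": "High school game goes to overtime", "section": "sports",
     "lat": 40.0, "lon": -75.002, "published": datetime(2010, 4, 30, 20, 0)},
]

local_titles = [s["title"] for s in nearby_recent(STORIES, 40.0, -75.0, NOW)]
```

Note that the `section` field never enters the filter: a crime story and a sports story both make the cut because they are nearby and recent, while the far-away and stale ones drop out.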

If you’re interested in search, be sure to read Search Patterns, a small book by Peter Morville and Jeffery Callender. You can read an interview with Morville over at Johnny Holland. They also have a great search patterns library online that contains a bunch of screenshots and diagrams that describe many different search patterns. The great part is that you’re free to use their material in your own publications, so if you think and write about search, be sure to browse through the collection.

A CMS versus a framework

When building a platform for the future of news, you’ll have to decide whether to modify an existing software package / content management system to suit your needs, or whether you’d rather build your own. Both approaches can work, but I’m slightly in favor of using a basic framework and building your own CMS on top of that. That way the entire newsroom backend will be finely tailored to your specific needs, instead of being chained to assumptions made by someone else. You can learn more about the pros and cons over at Sunlight Labs, Hacks & Hackers and Scot Hacker’s blog.

Context and the read-state

Bringing context to the news has been a huge topic in 2010, culminating in a SXSW session about The Future of Context this past March. For the general audience, the future of context is probably exemplified not by the great blogposts from Jay Rosen and Matt Thompson, but by the fuss around Google’s Living Stories. All the same, that’s fine as long as it gets people talking about these issues.

Amy Gahran sums up the (hoped-for) shift away from breaking news towards insight:

today’s journalists can — and probably should — consciously shift away from jobs that revolve around content creation (producing packaged “stories”) and toward providing layers of journalistic insight and context on top of content created by others (including public information). Finding ways to help people sort through info overload is far more valuable than providing more information.

Another very valuable post is Mobile First Strategy? No! by Paul Nus. He talks about how people are interested in different things at different times, and that we should provide readers with a measure of control over how they consume the news. The technology to make that happen should focus on being platform-agnostic rather than mobile-first, hence the title.

Finally, we should remember that context is personal, which will make it all the more difficult for us to get right. Be sure to read Josh Cohen’s thoughts about read-state — how much any given person knows about a topic and how that influences how they will perceive the news.

A postscript

As I write up this link dump, I can’t help but feel excited about the future of journalism. 2007 and 2008 were about lay-offs, about paywalls, about saving failing businesses. 2009 and 2010 are about news startups, about doing fresh things, about new business models and about putting to work all the hard thinking that we’ve collectively been doing. I wouldn’t want to be in any other line of work.


 writes about statistics, computer code and the future of journalism. Used to work at the Guardian, Fusion and the Tow Center for Digital Journalism, now a data scientist for hire. Stijn is @stdbrouw on Twitter.