The Data Diet: How Data Journalism Becomes Sustainable

By Ken Romano, Director of Product at The Associated Press

My pandemic media diet has been mostly newsletters and podcasts. I start my day with The New York Times and Morning Brew, and I end it cooking dinner while questioning whether the CIA wrote one of the great post-Cold War rock ballads.

The bar to being a content creator has never been lower, particularly in audio (Anchor), video (Instagram Live), photo (iPhone) and text (Substack).

What about data?

New tools are making it easier than ever for journalists to produce stories rooted in data. As a data journalist (or an aspiring one), you can use Google tools to clean data. Data-sharing platforms can help you catalog and distribute it. Datawrapper can help you visualize it.

But, rather than a passive experience like listening to a podcast or reading a newsletter, the reward for a consumer of data-driven journalism is…more math?

There's an extra cognitive step to interact with data. What’s the source of the data? What's the sample size of the survey? COVID cases are rising, but what's happening to the percent of positive tests? And in the same way that bad actors can edit a video to manipulate the facts, data can be selectively presented.
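To make the positive-test question concrete, here is a minimal sketch in Python. The weekly figures are invented purely for illustration and are not AP data; the function name `positivity_rate` is likewise hypothetical. It shows how a rising case count can coexist with a falling share of positive tests when testing expands.

```python
def positivity_rate(positive_tests: int, total_tests: int) -> float:
    """Return the share of tests that came back positive, as a percentage."""
    if total_tests <= 0:
        raise ValueError("total_tests must be greater than zero")
    return 100.0 * positive_tests / total_tests


# Hypothetical weekly figures for one county, made up for this example.
weeks = [
    {"week": 1, "positive": 500, "tests": 5_000},
    {"week": 2, "positive": 600, "tests": 10_000},
]

for w in weeks:
    rate = positivity_rate(w["positive"], w["tests"])
    print(f"Week {w['week']}: {w['positive']} cases, {rate:.1f}% of tests positive")
```

In this made-up example, cases rise from 500 to 600 while the positive share falls from 10.0% to 6.0%, because the number of tests doubled. Reporting only one of the two numbers would tell a different story.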

Data doesn’t have to leave you with more questions than you started with.

At AP, we believe a world in which more people know what’s happening around them is better equipped to face the challenges we share, so for the past several years, we have cleaned, vetted and distributed newsworthy data to newsrooms across the country. Using our work as a springboard, journalists can then [relatively] easily localize the data and stories to see how they affect their communities.

The payoff to the reader can be huge. Data conveys differences across communities in a fraction of the time it would take to do so through traditional reporting. It also leads to stories you couldn’t have otherwise discovered.

But storytelling and sticking to the facts are essential. Here are three lessons we've learned about distributing data journalism:


1. Remember your audience learns in different ways.

Most likely — unless you are targeting a very niche, expert audience — your readers will have varying levels of data literacy. AP uses distinct tools when we build data stories: 

  • For the beginner, we provide text using natural language generation technology that turns a dataset into a few paragraphs of text, customized and localized for your community. 
  • For the casual user or the visual learner, we provide maps, lookup tables and other interactive elements to bring the data to life in a browseable way.
  • For the expert, we provide the raw data itself to be analyzed.  
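As a rough illustration of the first tier, the sketch below turns one row of a dataset into a localized sentence using a simple template. This is not AP's natural language generation system, which the article does not describe in detail; the field names (`county`, `life_expectancy`), the function `localize` and the numbers are all assumptions made up for the example.

```python
def localize(row: dict, state_value: float) -> str:
    """Turn one row of a (hypothetical) dataset into a localized sentence."""
    county = row["county"]
    local = row["life_expectancy"]
    # Round the gap so the wording does not hinge on floating-point noise.
    diff = round(local - state_value, 1)

    if diff == 0:
        return (
            f"In {county}, life expectancy is {local} years, "
            f"matching the statewide figure."
        )

    direction = "above" if diff > 0 else "below"
    return (
        f"In {county}, life expectancy is {local} years, "
        f"{abs(diff)} years {direction} the statewide figure of {state_value} years."
    )


# Invented values, for illustration only.
row = {"county": "Example County", "life_expectancy": 76.5}
print(localize(row, state_value=78.0))
```

Running the same function over every row would give each community its own sentence from a single dataset, which is the basic idea behind customized, localized text.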


2. Make it news you can use.

We have found the stories that perform best affect readers personally: 

  • What is the life expectancy in my neighborhood, and how does it compare to the rest of the city? 
  • Do the students in my school district have appropriate access to computers?
  • Are the hospitals in my area close to capacity? 

Even when AP works with financial firms, we try to get to the heart of the trading decisions they need to make based on AP data. 


3. Be transparent.

AP's data team goes to extraordinary lengths to ensure our data cannot be misinterpreted. We understand how detrimental false information can be, and at AP, our mission is to advance the power of facts. When we provide data to other news organizations, it comes with an extensive data dictionary to ensure no field is misunderstood. We call out specific caveats on what the data won't tell you, and we identify any data left out because it didn't reach our standards. When we publish an investigation that relies on intricate data, we include sidebars that explain the methodology.
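For readers who have not worked with a data dictionary, here is a minimal sketch of the idea in Python. The structure, field names and caveats are hypothetical and do not reflect the format AP actually ships; the point is only that every field in a dataset gets a description and any caveats, and that an undocumented field can be caught before distribution.

```python
# Hypothetical data dictionary: one entry per field, with a description and caveat.
DATA_DICTIONARY = {
    "county_fips": {
        "description": "Five-digit county FIPS code",
        "caveat": "Stored as text so leading zeros are preserved",
    },
    "beds_total": {
        "description": "Total staffed hospital beds",
        "caveat": "Does not indicate how many beds are occupied",
    },
}


def undocumented_fields(rows: list[dict], dictionary: dict) -> list[str]:
    """Return the field names that appear in rows but have no dictionary entry."""
    seen = set()
    for row in rows:
        seen.update(row.keys())
    return sorted(seen - set(dictionary))


# Invented rows, for illustration only.
rows = [{"county_fips": "01001", "beds_total": 85, "notes": "preliminary"}]
print(undocumented_fields(rows, DATA_DICTIONARY))
```

Here the `notes` field would be flagged, since it has no entry explaining what it means or how it should not be used.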

Data is an emerging storytelling tool. And the biggest change is actually no change at all. It's simply applying the same journalistic rigor, standards and storytelling to a new format.


Ken Romano is director of product at The Associated Press.

Ken will be speaking on "Data Science and Fact-Based Journalism During Times of Crisis" at our DSS Applying AI & Machine Learning to Media, Advertising & Entertainment event.
