An industry changing as quickly as the journalism and media industry must develop new ways of doing business in order to survive. Open data may be a very significant piece of that puzzle.
CityCamp NC is an annual civic tech event bringing together businesses, citizens, policy makers, and researchers to discuss innovation to improve life in their communities. CityCamp NC 2017 begins on Thursday, September 28 at 5:30 pm and ends on Saturday, September 30 at 4:00 pm.
Ryan spoke with us ahead of his talk, “Using Open Data to Build News Tools and Products,” about utilizing data in three areas: court records, police incident reports, and voting and election data.
What sparked your interest in this topic?
The modern use of data in journalism was really started by former UNC professor Phil Meyer, whose book Precision Journalism has long been the first stop for anyone who wants to learn about the topic. Data is the most important public record format for improving government transparency and understanding. Modern journalists must understand how to acquire, clean, analyze and present public data.
The economics of acquiring these data skills, though, right now don’t make sense. If a person has the ability to understand and communicate insights about data, they can make a lot more money in fields other than local journalism — although few are as important and personally meaningful.
That’s why I’m interested in developing useful products that citizens can use not just to understand problems in their communities, but also to find personally relevant solutions.
How do you sort through the thousands of pieces of data to find compelling data that lead to answers to current problems?
The most important thing is to understand the problems that the people you’re serving are trying to solve. To do that, you have to be constantly observing and interviewing your potential customers. There are infinite tidbits of news that might come out of any dataset, but these are often entertaining at best and distracting at worst.
Of course, the data can also provide insights. And you find those by interviewing the data much as you would interview a human source. You have to find out what normal is, and then you look for outliers.
Of course, we always have to be on alert for what the data isn’t telling us — either because it isn’t complete or because we are misunderstanding how and why it was collected. Data is just another source that needs to be triangulated.
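A minimal sketch of that "find out what normal is, then look for outliers" step, assuming a hypothetical set of monthly police incident counts. The numbers, the `find_outliers` name and the 3.5 cutoff are illustrative assumptions, not anything from Reese News Lab. It uses the median as "normal" and flags values that sit far from it relative to the typical spread.

```python
from statistics import median


def find_outliers(counts, threshold=3.5):
    """Return (label, value) pairs that sit unusually far from the median.

    Uses the modified z-score: distance from the median, scaled by the
    median absolute deviation (MAD). The 3.5 cutoff is a common rule of
    thumb, not a fixed standard.
    """
    values = list(counts.values())
    normal = median(values)  # what "normal" looks like
    mad = median(abs(v - normal) for v in values)  # typical spread
    if mad == 0:
        # At least half the values are identical; this score cannot rank them.
        return []
    outliers = []
    for label, value in counts.items():
        score = 0.6745 * (value - normal) / mad
        if abs(score) > threshold:
            outliers.append((label, value))
    return outliers


# Hypothetical monthly incident counts, invented for illustration only.
monthly_incidents = {
    "Jan": 41, "Feb": 38, "Mar": 44, "Apr": 40, "May": 97, "Jun": 42,
}

for month, count in find_outliers(monthly_incidents):
    print(f"{month}: {count} incidents is far from the usual level")
```

A flagged month is a question to ask, not a finding. The spike could reflect a real change, a change in how records were kept, or missing data in the other months, which is why it still needs to be checked against other sources.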
What is necessary to take this from an exclusive “computer nerd” platform to helping journalists hold public officials accountable and inform the public?
I’m going to come back to audience again. You have to understand your audience even better than they understand themselves so that you can develop satisfying experiences and surprising stories. Ask yourself: What do I know that, first, the audience doesn’t know and, second, would be useful for them to know?
As often as I see reporters who don’t know how to use data, I see statisticians who don’t know how to tell stories. The more we can mix those two in the lab, the more likely we are to have an innovative development.
What is a viable next step to bringing data to local newsrooms? What infrastructure needs to be in place to utilize data journalism in small newsrooms with fewer resources?
Most newsrooms right now have access to more public data than they know what to do with. Although, paradoxically, some of the public data that would be most useful to small newsrooms isn’t readily accessible.
But to answer that question in more satisfying detail, we’re trying to practice what we preach about knowing our audience. Reese News Lab fellow Melody Kramer has been interviewing journalists and others to better understand where we might use data to solve their problems. She’s going to have a report coming out in the next few weeks that we’re looking forward to sharing.
I think it’s important here not to think about our job as narrowly as just supporting small newsrooms with fewer resources. My interest in media product development and computational journalism is in building tools and techniques that empower communities with information.
Community newspapers have been one of the most powerful avenues for doing that, but we might find that in fact librarians or local advocacy groups or other kinds of businesses are also vehicles through which public data can be used to fulfill all the traditional missions of journalism: holding powerful people accountable, shining light in dark places, explaining an increasingly complex and interconnected world, and connecting diverse skills and perspectives to solve shared challenges.
Where do you (and Reese News Lab) go from here?
It’s critical that we get as many diverse prototypes in front of our target audiences as fast as possible. Our first two key audiences are professional journalists — broadly defined — in North Carolina and curious citizens. We know what is feasible with a wide variety of low-hanging public datasets in North Carolina. Now we have to figure out what’s desirable and financially viable.