Since August, Reese Innovation Lab Fellow Melody Kramer has interviewed journalists across North Carolina and compiled research for the Carolina Data Desk. Open data is a resource currently underutilized by community newsrooms because it requires time and extra resources — two things that are difficult to obtain for local media.
How can journalists use open data to improve reporting while also performing the daily duties of their often understaffed newsrooms in a relentless daily news cycle?
We talked with Kramer about her findings and the path forward for journalists collaborating with data.
The following is a lightly edited transcript of Kramer’s interview.
00:49: What interested you in the Reese Innovation Lab and their projects in open data?
I started my career in journalism working in small newsrooms — which meant everyone wore many hats and was expected to do a lot with not a lot of resources — and so I have always been aware of the constraints of some of the shiny new journalism tools in our arsenal. This isn’t just data-driven journalism, it’s video, it’s AR, it’s social, it’s engagement. How do you decide what’s worth doing, and how do you fit that into your daily routine? It’s hard to add new things to a newsroom without dropping another task — everyone is pretty much working at capacity.
That’s a very long way of saying that when Ryan said he was interested in thinking through the future of the Carolina Data Desk, and he wanted me to reach out to data journalists and civic hackers across the state to learn about their needs and challenges, I was all aboard. For this work, I interviewed about 20 people from both inside and outside of North Carolina, but mainly inside. This ranged from someone working in a 2-person newsroom to someone working on a team of data journalists inside The New York Times. My goals were to learn about what kinds of data these reporters were using, what they might want to use, what barriers or challenges were in their way, and how they normally received training or connected with colleagues. That’s another thing I’m really interested in — when you work in a small newsroom, you can’t necessarily go to conferences because then no one is doing your job. So you have to figure out how to train in different ways.
I took all of this information and am now pulling out high-level findings. There’s a hunger for audience-driven data and insights because that tells us more about the people we serve. But in many newsrooms, the people looking at the audience data don’t necessarily sit with the people who work in the newsroom. So there’s an opportunity there.
03:18: You looked at similar research done in New Jersey and California. How did that research inform what you learned about helping journalists in North Carolina?
As part of this project, I interviewed journalists both inside and outside of North Carolina and I looked at other groups of journalists in the country and world who were exploring how to find, collect, clean and use data for their newsrooms. In California, London and New Jersey, there are journalists who have already completed the work to share resources for data journalism. It’s really time-consuming to find data, to write a FOIA request, to make a public records request to receive that data, clean that data and then do something with it. A lot of newsrooms are doing all that work simultaneously, so there’s a big opportunity for newsrooms to pool resources, particularly in that area.
There is a group of journalists in New Jersey, and one in California, where they say, “Okay, we’re going to go after the data together and we’re going to figure out who is getting what.” So every newsroom gets a piece of the pie as opposed to recreating a new pie. Once they have that data, the newsrooms complete their own stories.
I think there’s a real opportunity there, particularly for smaller newsrooms. This is a resource-intensive initiative. When you use data, you have to take the time to find out where it is and how to obtain it, which can take anywhere from a few minutes to a few days to a few weeks depending on what the data is. And then it depends on how that data might come back for you to use.
I once did a story for National Geographic on water slides. I did an OSHA request and I was really curious about how many people had been injured on water slides over a number of years in the United States. I received the data back, but it was on pieces of paper. It wasn’t very useful. In that case, you might need to reenter the data or scan the data to make use of it. That’s all time-intensive, and in a two-person newsroom, you might not have the capacity to take that on. But there may be ways to share the data so you can do your data entry and reporting.
05:59: It seems like efficiency and time are the two biggest factors in helping journalists come up with this kind of data.
Yeah, and I think there’s also the factor of what to do with data. Sometimes it’s hard to visualize data and there are a number of ways to use it. Often journalists have to think about: for this data source, what is the right way to tell that story? Every journalist has to consider that and there are lots of trainings for that. But often I see journalists asking each other: “I want to tell this story with this data source. How would you recommend doing that?”
06:44: The concept of data journalism may be new to some professionals. Do you think there are any misconceptions about data that need to be addressed?
Yeah, I’ll quote this piece from Jonathan Gray in The Guardian on what data cannot do. It’s really important we don’t think of data as a force unto itself. Databases do not knock on doors. They don’t make phone calls. Data doesn’t push for institutional reform. It doesn’t educate the masses. Data is a tool that we can use but humans have to figure out what to do with that tool. Data also does not give us a perfect picture of the world. It’s a dataset compiled at a specific time and place and it might not be overly comprehensive, and it’s important to let the readers know that.
Data does not speak for itself. This is also something Jonathan pointed out. We need to learn to interpret that data and that’s where a journalist can come in and say here’s what’s absent, here’s what’s present in the data, and here are the implications of this data — and that’s reporting. These are not just numbers — this is what it means. And that’s also time-intensive and a skill set that isn’t necessarily taught in intro-level courses in journalism schools. It’s taught in specialized courses, conferences and workshops, but if you’re a small newsroom and you cannot attend those, then we have to think how do we acquire that skill set? How do you look at a story that did really well at a local newspaper in New Jersey — can we replicate that experience for our readers in North Carolina?
Also, I think it’s important to recognize that data is a tool, and it’s just one tool in your arsenal. It can help you make sense of a topic — and it can be useful for looking at the bigger picture, but then you need someone in your newsroom to translate the data in an understandable way. The reason we don’t list numbers in a newspaper or on a website is because no one can make sense of that.
What’s really neat about these data-driven stories is that they combine a lot of information on a topic with a reporter who can assess: What is the real story here, and how does this affect my audience?
09:15: What happens next for the Reese Innovation Lab and what are some of the implications of your research going forward?
There are many opportunities here, I think, particularly for collaboration. I heard repeatedly from reporters that they were open to collaborating with other newsrooms — which is really great. Collecting data is time-consuming, so if there’s a way to pool resources together for that part, reporters can then go and do their own thing.
There’s also a lot of opportunity for working with the civic hacking community. I interviewed several leaders in the civic hacking world across North Carolina, and found that they love working with data, but they’re missing the storytelling components. So there’s an opportunity for them to work more closely with journalists.
Lastly, I think there are several pathways for the Carolina Data Desk and Reese Innovation Lab. Could students serve as consultants to newsrooms across the state to help answer questions and teach? Could there be partnerships with others across the University of North Carolina who are working with data, like researchers in the medical school or school of public health? Could the Data Desk help newsrooms make sense of audience data?
One project I loved in Philadelphia looked at data for every zip code in the city. That’s not only compelling for a person who lives in that zip code and wants to find out more about where they live; it’s also compelling for a newsroom to find out more about its audience and how they’re using the site.
Here in North Carolina, I talked to Tyler Dukes at WRAL. He told me that data informs every one of the stories his team does. One example he pointed me to was some work he did recently on redistricting in the state. Anyone can enter their address to find out whether their state House and Senate districts, and their potential incumbents for the 2018 election, may have changed. That’s a great example of using data and really thinking about what people would want to know. In this case, they want to know: “How does this new rule affect me?” So you’re typing in your address and asking if your districts changed. WRAL can also use the data they’re collecting to see where the users who took the survey live across the state. What does that say about them? Are there ways we can build a revenue source around these people?
Collecting the data is one step, but then being able to take a step back and analyze the data is often the missing second step. We are constantly dealing with a news cycle of moving forward, and where I see a role for UNC and the Carolina Data Desk is helping newsrooms take a step back and reflect.
13:07: Kramer reflects on her personal experience working on this project.
I loved doing these interviews across the state. It was inspiring to hear that the common goal across all of these newsrooms is to inform the public. I was really impressed with the amount of work that someone in a two-person office in Asheville could do on a daily basis. That is hard! When you see the range of how people are working in news and the challenges that everyone who works in the news industry is dealing with on a daily basis (cuts, advertising models shifting, and just the constant influx of news), it’s really hard to take a step back and think: What are we doing that’s working? What can we change? What can we learn from other newsrooms? Then I see places like the Reese Innovation Lab really helping with that. It’s hard to reflect when you’re in a daily news cycle, and there’s a need for institutions and organizations to help people do that.
The more that we’re able to say, “Okay, you can work on your daily reporting and you can take these three things from the Reese Innovation Lab that will help you or reduce your workload in this way,” the stronger I think we are as a system.