MFG Archive

What We Talk About When We Talk About Data

The world of nonprofit data is full of buzzwords and jargon that get tossed around a lot, often indiscriminately. Some of those words have specific definitions that are subtly (and sometimes completely) different from how they’re being used, while others are so vague that they become almost meaningless.

What do we mean when we talk about measuring “impact” or using “Big Data”? Both of those terms have clear definitions but are often used ambiguously. And we often describe data as “unstructured,” or talk about a data tool having an “API” (an Application Programming Interface), as if those terms mean something specific and clear, when in reality both are vague enough to leave a wide range of possibilities about the intent behind their use.

If anything, the problem is getting worse with time, as if we all know we should be thinking about these big ideas and throw around important-sounding terms without knowing exactly what they mean. If definitions are used inconsistently, what one person hears is not necessarily the same thing the other person said. The fact is, there are already so many obstacles for nonprofits trying to use data more effectively that we don’t need another one—especially one as potentially damaging as a communication gap. It’s critical to know what terminology actually means when you use it, because it might mean something else to your audience.

I’ve come up with a list of “usual suspects,” words used incorrectly so often that they’ve almost lost their meaning. At this point I think we’re better off avoiding these terms and adopting new, more-specific ones, as I’ve outlined below.

Impact. People throw the word “impact” around all the time. Sometimes it’s a synonym for “results,” or “important outcomes.” Other times it means “something measurable,” or even more vaguely, “some important piece of information about programs that’s more than just how many people showed up.” But the word has an actual meaning—it’s the change you see when you do something, like an experiment or an intervention, over what you would have seen if you had not taken that action.

“Impact” is quite straightforward to define. If you have a control group, the difference between the results you see and the results from the control group is your impact. It’s just incredibly difficult to measure in any meaningful way in the nonprofit world. Why? Because to measure any real impact, you have to take into account all the other things that could also have had an effect and make sure that you’re not just measuring those things instead.
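To make the definition concrete, here is a minimal sketch in Python. It assumes you have one outcome number per person for both program participants and a comparable control group; the function name and the figures are hypothetical, invented purely for illustration, and a real evaluation would also need to check that the two groups are genuinely comparable.

```python
from statistics import mean


def attributable_impact(program_outcomes, control_outcomes):
    """Average outcome for participants minus average outcome for the control group.

    Hypothetical helper: it only does the subtraction. It cannot tell you
    whether the control group is a fair comparison.
    """
    return mean(program_outcomes) - mean(control_outcomes)


# Made-up numbers: improvement in a test score over one year.
program = [12, 15, 9, 14]   # people who took part in the program
control = [8, 10, 7, 11]    # similar people who did not

impact = attributable_impact(program, control)
print(f"Average change among participants: {mean(program)}")
print(f"Average change in control group:   {mean(control)}")
print(f"Impact attributable to program:    {impact}")
```

In this made-up case, participants improved by 12.5 points on average, but the control group improved by 9 points without the program, so only the remaining 3.5 points could reasonably be called impact.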

At Idealware, we’ve replaced the word “impact” with the term “attributable impact.” While it’s somewhat redundant, the term by definition points out the importance of being able to actually know that your program was the primary cause of the change you want to see. It stresses the importance of being able to correctly attribute that impact to your program rather than measuring changes caused by other factors.

In fact, “impact” is often used in such a fuzzy way that I would say it’s no longer a useful term. You’re better off specifying exactly what you mean with a more specific term.

Big Data. This term is often used to mean “a lot of data,” but that was neither the original nor intended definition. In fact, “Big Data” is not synonymous at all with the idea of gathering a lot of data and mining it for information. For people who deal with data for a living, the term actually has a very specific meaning: “Data sets that are too large and complex to manipulate or interrogate with standard methods or tools,” according to Google.

In the corporate world, this refers to data that is hundreds of terabytes in size, or bigger, with interrelations so complex that new types of systems had to be invented just a few years ago to store and query it. A good example is the massive marketing efforts run by major retailers. Think about the big online catalog companies and all the information they track about visitors to their sites: their purchases and interests, their social media interactions, the keywords and emails they respond to, their spending habits, and more. That’s a ton of data, with extremely complex relationships, stored all over the web.

It’s really unlikely that any nonprofit would use Big Data in the official sense of the definition, but the term gets used a lot just the same to mean different things, including:

  • More data than we’re used to dealing with
  • More data than can fit onto our computer or into our database
  • Data that’s not structured in a way to make it easy to analyze
  • Data we can mine for insights about our programs
  • Data we can buy about particular individuals (such as their buying or donation habits)
  • Public data we could pull into our systems to add insight to our data

While people often use Big Data in any of these ways, it’s just not accurate, and it’s likely to confuse people who know the term’s true definition—or who just think it means one of the other things in the list.

Unstructured. A vendor might boast that its solution can even support “unstructured” data, or as a sector, we might seek to use “unstructured” data more rigorously. But what, exactly, does that mean? Technically, it doesn’t mean a thing. You could call data unstructured because it’s stored in a database in a suboptimal way, or you could be referring to the more scattered nature of Twitter or Facebook posts. A narrative in a Word document is also unstructured, as are handwritten interview notes, or even a videotape of a TV interview. Without more information about the specific type of unstructured data you want to manage, the term itself is meaningless.

API. Vendors, nonprofits, and funders will often talk about the desirability of having APIs to access data. APIs, or Application Programming Interfaces, provide methods of automatically interacting with an application, including its data. Some software packages provide an enormously powerful set of APIs that allow you to pull data in or out in very flexible ways, which can transform your ability to connect up one database to another.

But the term “API” itself refers to only a single procedure that allows you to do one thing. It’s a strong and flexible collection of APIs that is useful. The confusion begins when people assume that having any API at all is the end goal when it comes to flexible data sharing, when in fact a single API is just the beginning. A vendor can say, with perfect truth, that a tool has an API when all it allows you to do is retrieve the first name of a client when you submit a customer ID. It’s an API, but it’s not useful, and certainly not if that’s the only API the vendor offers. Similarly, creating useful APIs for a central metrics collection project is not as simple as bolting on an obvious piece of functionality. It’s actually a complex project to define precisely what APIs will be useful and why, how to design and architect them, and how to build them so that others will be able to actually use them effectively. The term “API” in and of itself means virtually nothing without clarifying what the API does.
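The gap between “has an API” and “has a useful set of APIs” can be sketched in a few lines of Python. Everything here is hypothetical: the class names, method names, and client records are made up for illustration and do not describe any real vendor’s product.

```python
class SingleCallAPI:
    """Hypothetical tool whose entire "API" is one lookup."""

    def __init__(self, clients):
        # clients: dict mapping a client ID to a dict of fields (illustrative only)
        self._clients = clients

    def get_first_name(self, client_id):
        # The only thing this tool lets outside software do.
        return self._clients[client_id]["first_name"]


class FullerAPI(SingleCallAPI):
    """Hypothetical tool offering a broader collection of calls."""

    def get_client(self, client_id):
        # Return a copy so callers cannot alter stored data by accident.
        return dict(self._clients[client_id])

    def find_clients(self, **criteria):
        # Return the IDs of clients whose fields match every criterion.
        return [
            cid for cid, fields in self._clients.items()
            if all(fields.get(key) == value for key, value in criteria.items())
        ]

    def update_client(self, client_id, **changes):
        # Write data back in, not just read it out.
        self._clients[client_id].update(changes)


clients = {
    "c1": {"first_name": "Ana", "city": "Portland"},
    "c2": {"first_name": "Ben", "city": "Boston"},
}
limited = SingleCallAPI(clients)
fuller = FullerAPI(clients)

print(limited.get_first_name("c1"))        # all the first tool can do
print(fuller.find_clients(city="Boston"))  # searching needs the fuller set
```

Both tools can truthfully claim to “have an API,” but only the second lets you search, read whole records, or push updates, which is what connecting one database to another usually requires.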

These are just a few terms to get the conversation started—I’m confident there are others, and as we become more and more data-centric as a sector, there are bound to be more. We need to close the communications gap before it gets too wide so that we can focus on the meaning of the data we’re gathering rather than the meaning of the words we use to talk about it.

 


Thank you, Laura, for putting us straight and onto the same page. Readers, be sure to continue this conversation with Laura and her team at Idealware on Twitter. Let us know below or at Markets For Good what you think.