Working with data and being skeptical of it is no contradiction. Quite the opposite.

“I’m a data scientist who is skeptical about data,” writes Andrea Jones-Rooy at Quartz. There is a lot worth quoting:

Whether it’s curing cancer, solving workplace inequality, or winning elections, data is now perceived as being the Rosetta stone for cracking the code of pretty much all of human existence.

But in the frenzy, we’ve conflated data with truth. And this has dangerous implications for our ability to understand, explain, and improve the things we care about.


“What does the data say?”

Data doesn’t say anything. Humans say things. They say what they notice or look for in data—data that only exists in the first place because humans chose to collect it, and they collected it using human-made tools.

Data can’t say anything about an issue any more than a hammer can build a house or almond meal can make a macaron. Data is a necessary ingredient in discovery, but you need a human to select it, shape it, and then turn it into an insight.


Data is an imperfect approximation of some aspect of the world at a certain time and place.

Companies—and my students—are so obsessed with being on the cutting edge of methodologies that they’re skipping the deeper question: Why are we measuring this in this way in the first place? Is there another way we could more thoroughly understand people? And, given the data we have, how can we adjust our filters to reduce some of this bias?


This doesn’t mean throw out data. It means that when we include evidence in our analysis, we should think about the biases that have affected their reliability. We should not just ask “what does it say?” but ask, “who collected it, how did they do it, and how did those decisions affect the results?”

We need to question data rather than assuming that just because we’ve assigned a number to something that it’s suddenly the cold, hard Truth. When you encounter a study or dataset, I urge you to ask: What might be missing from this picture? What’s another way to consider what happened? And what does this particular measure rule in, rule out, or incentivize?
