The Data Addiction

  • Conrad Wai
  • June 27, 2022

The biggest problem with the proliferation of data is the belief that more data leads to more informed decision making. Smart leaders know that too much data, or the wrong kind, leads to information overload and obscures the things that really matter. Curating the right data and using it to actively learn are the keys to getting the most out of your investment.

Data is good. This is seen as self-evident in business today. Making data-driven decisions is imperative. Massive data sets are a badge of pride. With everything moving to digital, sensors proliferating, storage getting cheaper, and processing power continuing to explode, the mantra has become, “collect it now and we’ll make sense of it later.” After all, more data leads to better answers, right? Not so fast!

When it comes to data, collection, storage, and even processing are not where the real challenge lies. We should not make the mistake of seeing data as a technical issue. It’s a synthesis problem. That’s because information is not the scarce resource. Attention is. For this reason, we need to prioritize and curate the data we seek out. What’s more, we have to realize that some decisions can’t be based primarily on data, and embrace informed intuition alongside being data-driven.

The spaghetti factory

Companies often argue that these days, we can just collect a mass of data and tease out associations later. Hypotheses aren’t necessary, and, worse, don’t allow for surprising correlations the way data mining does. This line of reasoning is flawed. Collecting and analyzing data without framing the problem is simply not an efficient use of scarce resources. At best, the “throw spaghetti against the wall and see what sticks” approach to using data results in an extraordinary waste of spaghetti. Smart researchers realize that the magnitude of potential data, even for relatively mundane tasks like driving a car, can quickly overwhelm our most advanced technology. That’s why the key insight that allowed Sebastian Thrun and his team at Stanford to win DARPA’s $2M Grand Challenge for self-driving cars was to teach their vehicle to discard information it had previously accepted.

Even so, squandering precious resources scrutinizing every shred of data is not the biggest problem. A 2010 study by MIT Sloan Management Review and IBM found that 60% of respondents believed their organization “has more data than it knows how to use effectively.” The low signal-to-noise problem is not just excessively resource-intensive; having all this data actually makes it more likely that you’ll analyze the wrong things and miss the “signal” entirely. For instance, every large retailer now collects volumes of detailed data that would have been unthinkable even a decade ago. But most don’t know what to do with all of it. Some amass mounds of customer data through their loyalty cards, but admit to doing little of anything with it. Others get so busy trying to unearth unusual correlations in their sales data that they lose sight of what’s important, and can’t actually answer fundamental questions like what their customers think of the products on the shelves.
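For the statistically inclined, the trap here is a multiple-comparisons effect, and a toy simulation (all numbers invented for illustration, plain Python) makes it vivid: generate a thousand random “metrics” that have no relationship whatsoever to a random “outcome,” and the best-looking correlation among them will still appear impressively strong.

```python
import random
import statistics

def pearson(xs, ys):
    """Plain Pearson correlation for two equal-length lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

random.seed(0)
n_samples, n_metrics = 50, 1000

# The outcome and every metric are pure noise -- nothing is truly related.
outcome = [random.gauss(0, 1) for _ in range(n_samples)]
metrics = [[random.gauss(0, 1) for _ in range(n_samples)]
           for _ in range(n_metrics)]

# Mine all 1,000 metrics for the strongest apparent relationship.
best = max(abs(pearson(m, outcome)) for m in metrics)
print(f"strongest 'signal' found in pure noise: r = {best:.2f}")
```

The more metrics you mine, the more convincing the best spurious correlation looks, which is exactly why undirected data-combing tends to surface the wrong things.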

Part of the problem with collecting all this data willy-nilly stems from a mistaken belief that more data means more informed decisions. Clearly, the thinking goes, our case gets stronger the more data we have, the way forward more apparent. That’s looking at things the wrong way. Even if all of the data is rich, it can become overwhelming. Though data analysis is meant to simplify decision-making, too often it merely throws into relief how challenging it is to make a good choice. This only becomes more dramatic when the signals conflict across market data, consumer research, online behavior tracking, and operational metrics.

Deliberate data collection

Rather than collecting data for data’s sake, it’s better to design experiments deliberately. Taking the time to scope your efforts defines the problem, signals to colleagues what’s important, and shapes what they should seek to learn. The first step is to figure out what you are trying to test. Believe it or not, that actually makes it more likely that you will learn something valuable with the precious resources you spend.

Next, the key is to go in open to being surprised – not to force surprise by going in without any forethought. Too often, data gathering is an exercise in corroborating existing viewpoints. I’ve seen companies big and small ask people how they would use a new feature they’re testing. When the reply comes back, “But I wouldn’t use this feature,” things quickly get uncomfortable as the researchers press on: “Ok, but let’s say you were to use it…” That misses the bigger point. The best organizations gather data with an understanding of what they’re trying to test, and are open to being surprised. To paraphrase Marcel Proust, it’s about seeing the world with “new eyes”. This approach ensures that you’re out to learn, not just confirm.

Going even further, select companies build in mechanisms for capturing serendipitous discoveries. Remember: chance breakthroughs from penicillin to Post-its came out of well-planned endeavors, and weren’t just haphazard data-combing exercises. What makes these innovations special is that their inventors gave themselves the room to accept surprising results, recognize promise and make the lateral connections that result in big opportunities.

The limits of data

You can collect too much data. At least as perniciously, you can rely too much on data. In the business world, we have bred a culture where data is prized. That’s great, but it also carries downsides. Some decisions can’t easily be data-driven. In such situations, an overreliance on data can paralyze an organization. In particular, this comes up in a pair of crucial situations: when the problem is ambiguous, and when the future might not extrapolate directly from the past and present. No amount of data will tell you whether to propose marriage, or which new venture to start or fund. Such decisions are as much art as science. Nor will the data do you much good when the future is not a smooth continuation from the present, as is often the case in new markets. McKinsey and Co. famously extrapolated from current trend data in the early 1980s and concluded that there would be about 900,000 cell phones in use by the year 2000. As it turns out, there were close to 400 million mobile phone subscribers by that time, and 900,000 new users were joining the ranks every three days.
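The extrapolation failure behind forecasts like that one can be made concrete with a back-of-the-envelope calculation (all numbers invented for illustration): fit a straight-line trend to early observations of a process that is actually doubling each period, and within a few periods the forecast is off by orders of magnitude.

```python
# Toy illustration: adoption that doubles every period,
# vs. a naive straight-line forecast fit to the early data.
early = [1, 2, 4]              # observed early adoption (units arbitrary)
slope = early[-1] - early[-2]  # linear trend implied by the last step

periods_ahead = 10
linear_forecast = early[-1] + slope * periods_ahead  # 4 + 2*10 = 24
exponential_actual = early[-1] * 2 ** periods_ahead  # 4 * 1024 = 4096

print(f"linear forecast: {linear_forecast}, actual: {exponential_actual}")
# The straight line undershoots reality by a factor of ~170.
```

No amount of historical data fixes this: the model, not the measurement, is what’s wrong when growth is compounding.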

As the remark often attributed to Einstein goes, “Not everything that can be counted counts, and not everything that counts can be counted.” What’s needed when data is not enough is interpretation: triangulation from multiple vantage points, reading between the lines, incorporating qualitative factors, making intuitive leaps, and having a gut sense of what’s right. This, in the end, is the antidote to an overreliance on data.

Data, when combined with sound judgment, can be a great ally in strategic definition and direction. Too often, it’s mistaken as a substitute. Great leaders and organizations stand out by remembering that data is not the same thing as wisdom.