Making big data deliver

Investment in data science can reap great rewards. If it doesn’t yield what you hope for, ask where you’re going wrong


As the behavioural economist Dan Ariely once tweeted, “Big data is like teenage sex: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it.”

He made a good point. Grasping the potential of big data, recognising its limitations and seeing how it can be harnessed are not easy. Maybe your company has been exploring the use of big data for a while and now you’re wondering why it hasn’t revolutionised the way you interact with your customers or transformed your internal processes. Maybe you even hoped it would revolutionise your whole business model, and yet it hasn’t. You would be in good company. There is a gap between companies’ expectations of data science and what data science can actually do.

It’s not that big data itself is failing. Big data is just a set of methods (see box below). But how can you set up your data science team to harness the power of big data for you? The success or failure of your data science team doesn’t depend on information systems or computational power. Data and infrastructure help but, in our experience, they are rarely the reason why data scientists fail to instigate fundamental change and create value. More often it’s because their skills aren’t being properly used. Here are four common mistakes.

1. You give your data scientists the wrong problem to work on. Your organisation is asking questions that are impossible to answer; or you are demanding a solution to a problem that’s too complex to solve with clever data analysis alone. What’s at fault here is the starting point, the input to the data science “process” – so don’t be surprised if you don’t get the output you want. You’re wasting resources, missing opportunities and undermining trust in the data science team’s ability to improve business performance.

Learning the lesson: An online retailer asked an analyst in my team to provide evidence to support a business hunch that shorter delivery times would increase the conversion rate (the proportion of people who bought the product after viewing it). The analyst couldn’t find supportive evidence but, instead of accepting that delivery times were not the most important conversion factor, the client asked him to approach it in different ways to try to come up with the evidence it wanted.
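For readers who want to see the metric itself, the sketch below shows how a conversion rate might be calculated and compared across two delivery-time groups. It is only an illustration: the function name and all the figures are hypothetical, not taken from the retailer's data, and a real analysis would also need to test whether any difference is statistically meaningful.

```python
def conversion_rate(viewers: int, buyers: int) -> float:
    """Share of people who bought the product after viewing it."""
    if viewers <= 0:
        raise ValueError("viewers must be a positive number")
    if not 0 <= buyers <= viewers:
        raise ValueError("buyers must be between 0 and viewers")
    return buyers / viewers


if __name__ == "__main__":
    # Hypothetical figures, for illustration only.
    next_day = conversion_rate(viewers=2000, buyers=90)
    three_day = conversion_rate(viewers=2000, buyers=84)
    print(f"Next-day delivery:  {next_day:.1%}")
    print(f"Three-day delivery: {three_day:.1%}")
    print(f"Difference:         {next_day - three_day:.1%}")
```

A small gap like the one in these made-up numbers may well be noise, which is why an honest answer to the hunch can be "the data does not support it".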

2. The data scientists are working on the right problem but in the wrong way. Often data scientists get excited about finding novel solutions that push the boundaries of what’s possible. When they work in a corporate environment, they can become carried away and end up engineering solutions that are unnecessarily complex or too sophisticated for the problem at hand. This makes the solutions expensive and more difficult to implement – and as a result they have less impact. Choosing the right solutions – solutions that are sufficiently simple but not simplistic - requires solid understanding of the business context.

Learning the lesson: I worked on a project that tried to bring together data from several different places into one place. After a year, and with the support of those supplying the multiple databases, the completely redesigned architecture was still too unstable to use for customer-facing applications. We had been too ambitious. Our solution was over-engineered. Simpler relational databases would have allowed us to use existing skills and capabilities without compromising the customer experience.

Learning the lesson: When working for a large UK retailer, my team built a data-driven simulation of a new distribution centre. We were asked to develop complex error-correcting algorithms to account for inaccuracies in historical data and arrive at better precision. However, correcting historical errors would not have yielded greater accuracy as there were other, more material, ways in which our data was wrong. For example, at that time we were not even sure what products the distribution centre would stock.

3. Solutions are offered – but ignored. The rest of the organisation consistently ignores the work of the data scientists, failing to follow their recommendations or fully implement their algorithms. Sometimes this is because internal politics get in the way. It may be that internal stakeholders are too busy firefighting to listen to what the data scientists are saying. The data scientists can be at fault here too. Perhaps they’ve alienated their non-technical colleagues by emphasising the algorithm’s complexity and cleverness instead of how it can help make their daily lives easier. Perhaps some stakeholders even feel threatened, as if the data science team is designing solutions that are taking away an important part of what they do for the firm.

Learning the lesson: My team had created a price-optimisation algorithm for general merchandise products such as small domestic appliances. In our tests, our algorithm improved profits significantly compared with the outcome when prices were set by the buyers. Nevertheless, we found significant resistance to rolling out the algorithm across categories. Many buyers simply did not want to use it. Those who did use it tended to be the most time-constrained. They saw the algorithm as something that would save them time and allow them to devote their efforts to other aspects of their job and therefore deliver more value to the organisation.

Learning the lesson: Working as a consultant, I developed a model to predict how many registration desks a hospital would need after it switched to a new information system. I delivered a sophisticated and interactive model that allowed the end user to change a number of inputs and assumptions. But the management of the hospital didn’t have the skills or the time to understand and use the sophisticated model. I realised this only when they asked me to give them a single-page document that explained how many registration desks they needed and where.

4. There are no effective feedback loops. The data scientists’ work doesn’t end when they deliver their solution to the client or to the implementation team for roll-out across the organisation. Feedback loops are essential if you want to see continuous improvement. Feedback loops allow your team to learn as the project is in progress and continue making small tweaks that will make it more effective. Feedback loops also enable you to take lessons from that specific project to improve the next one. Sadly, most data-science projects have no process through which feedback is communicated to the data scientists. Even if such processes do exist, they may still be useless because they provide feedback on irrelevant metrics or, more simply, nobody acts on the information provided.

Learning the lesson: My team built a predictive model which was tested and found to be significantly better than what had been in place before. It was rolled out and care was taken to monitor and audit its predictive accuracy. After a while, we found that the accuracy of the model had deteriorated. No great surprise there: this often happens as the world changes and assumptions that worked well in the past are no longer valid. Nevertheless, the data science team did not have time, or indeed any incentive, to go back and tweak the model because it had moved on to other, more pressing projects. Despite the fact that feedback was available, there was no system in place to promote continuous improvement.
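A feedback loop of this kind can be made concrete with very little code. The sketch below is a minimal illustration, not the system my team used: the class name, the baseline, the tolerance and the window size are all assumptions chosen for the example. It tracks a model's accuracy over its most recent predictions and raises a flag when accuracy drops too far below the level measured at launch. The flag is only useful if someone has the time and the incentive to act on it.

```python
from collections import deque
from typing import Optional


class AccuracyMonitor:
    """Tracks rolling accuracy and flags deterioration (hypothetical helper)."""

    def __init__(self, baseline: float, tolerance: float = 0.05, window: int = 100):
        self.baseline = baseline    # accuracy measured when the model was rolled out
        self.tolerance = tolerance  # how far accuracy may fall before we react
        self.outcomes = deque(maxlen=window)  # True/False for recent predictions

    def record(self, predicted, actual) -> None:
        """Store whether the latest prediction turned out to be correct."""
        self.outcomes.append(predicted == actual)

    def rolling_accuracy(self) -> Optional[float]:
        """Accuracy over the recent window, or None if nothing is recorded yet."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_review(self) -> bool:
        """True when recent accuracy is below baseline minus tolerance."""
        accuracy = self.rolling_accuracy()
        return accuracy is not None and accuracy < self.baseline - self.tolerance


if __name__ == "__main__":
    monitor = AccuracyMonitor(baseline=0.90, tolerance=0.05, window=10)
    # Hypothetical stream of (predicted, actual) pairs.
    for predicted, actual in [(1, 1), (0, 0), (1, 0), (1, 0), (0, 1)]:
        monitor.record(predicted, actual)
    if monitor.needs_review():
        print("Accuracy has deteriorated: schedule time to revisit the model.")
```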

So how can you build a winning data science team? The most successful teams are often built from the bottom up. Top-down tends to be supply-driven: you’ve built the data science team so you’ve got to find them something to do. It is always better to identify problems and start building solutions that are geared towards providing quick and meaningful results. Then you nurture the team to solve progressively larger problems; you then see it gain credibility and visibility within the organisation and generate enough value in terms of savings or new business opportunities to justify its existence.


How to harness the power of data science

1. Make sure you’re culturally ready. Are your people able to ask the right questions? Do they know which data matters? Can they interpret different metrics and are these well-defined and aligned across teams? Do individuals understand what metrics they can influence and how data science can help them improve their performance?

2. Democratise your data. Invite your data scientists to put tools in place that make it easy for everyone in the company to find, access and interpret the data they need for their job. Help them understand why data is their friend, not something to fear. People are often defensive: they think their jobs are going to become obsolete because of data science: that’s very rarely, if ever, the case. The reality is that this is going to help them become more effective at their job – but only if they understand how to use it. That requires education.

3. Take an iterative approach. View your data projects as experiments, not one-off transactions. Use agile methodologies and keep the focus on continuous improvement so you can keep refining the process to maximise its impact. What your data science team delivers the first time is rarely spot-on. It will fail in certain ways. Not taking the time to understand how it failed is a missed opportunity. Working fast to optimise the solution as you learn more leads to faster implementation and continuous improvement. It’s no different from when a company designs a new product. Imagine if Apple had stopped at the very first iPhone.

4. Hire a translator. You need someone in your business to liaise between those in the existing business organisation and the data science team. This person speaks the language of both and is able to act as a “translator”. They need to understand both the business imperatives and what data science can do - and cannot do. They must command the respect of both the business and the data scientists. Their main role is to help these two groups work effectively together to identify business opportunities, pursue them in a way that can be implemented, and foster a culture of continuous improvement and feedback. Your translator-manager is an expensive hire because there aren’t very many individuals who bring two separate skillsets with little overlap.

If it’s set up well, the data science team can only help your organisation compete in the fast-changing and increasingly competitive landscape. It can also become the incubator for future leaders.

Why today’s the day for big data

Big data, machine learning and data science all use large amounts of data to do three things:

1. Provide detailed and evidence-based insights on what has happened in the past;
2. Make statistically reliable predictions for the future; and
3. Make suggestions about what to do now to achieve a certain set of objectives.

As customers expect ever-faster, cheaper and more personalised experiences, these methods are vital in delivering them. Furthermore, the existence of large datasets, cheaper computational power and better methods to analyse large datasets makes these methods much easier to implement at scale now than ever before. And if you don’t, your competitors and disruptors will.

For a fuller discussion of the limitations and reliability of algorithmic decision-making for leaders, see
