Experimentation - the hidden layer in the data science pyramid


Why experimentation is required for a good return on your data investment.

When you think about data science, what springs to mind? Graphs and charts in a presentation? A data scientist building models? Perhaps companies such as Google, Facebook or Netflix that have built their businesses with “data science inside”?

These are the obvious faces of data science, but within businesses built on a foundation of data, there are multiple levels of capability that must exist to drive the delivery of compelling digital products and services.

There are many ways to describe this, but here I’ll break this down into five major levels that support each other. Each layer has aspects that are supported by technology, people and culture. One of those layers, experimentation, is continually overlooked, but is key to driving positive outcomes from the rest.

Modelling and Analytics

Let's start with the obvious - modelling and analytics - the purpose of which is to understand or predict the behaviour of systems or people. Capability in this area consists of a combination of people and tooling. Building a strong data science team, with a broad range of maths, computer science and business skills, is key. Tooling can be acquired or built to support these teams, to make them more efficient and reduce the labour-intensive parts of their work. Tooling cannot replace them, however, as a key part of their function is understanding customers and communicating with the business, and there are important parts of the value chain for analytics that cannot yet be automated, as will be discussed below.

Data Enablement

The ability to do modelling and analytics relies on a solid foundation of data, so data enablement sits as the layer below modelling in our pyramid. This is an enormous area with great complexities, but the key concerns for the support of analytics are:

  • data quality – the famous garbage in = garbage out principle
  • data and metadata availability – first for discovery, then reliable and performant access
  • management – appropriate processes that protect customers, employees and the company, while maintaining efficiency

Production Deployment

If these first two layers are in place, then businesses can create models and do reliable analytics, but cannot yet deploy them for effective customer or business interventions. To do this, they need production deployment capabilities which can take the outputs of modelling and analytics and make them reliable and capable of being delivered at scale. These capabilities include moving models and analytics from the lab into production, data transport, model versioning, scalable deployment, monitoring, and delivery. This area has undergone a modest revolution in the last few years, with many new vendors recognising the importance of the capability (H2O, DataRobot, etc.), and also a large number of open-source technologies being driven by the major digital players on top of Kubernetes and Docker – examples are Kubeflow, MLflow, Istio, and Seldon, which form the basis of modern MLOps approaches.

ML & AI Ethics

There has also been increasing concern over the last few years about the ethics of artificial intelligence and machine learning. This has been driven by a desire to stop the undesirable outcomes that have resulted from some machine learning systems, and to ensure that the widespread adoption of AI in industry and government doesn’t lead to damage to society more broadly. As we use AI in more and more applications, we need to ask the question “should we do this?” more often than “can we do this?”. Once we have decided we should do something, we then need to ask “how do we do this safely?”. Practically, this has led to companies adopting AI ethics frameworks to guide decision makers and practitioners in understanding the questions of “should” and “how”. The development of a culture of ethical AI is a fundamental requirement for the data-driven companies of the future.

Ambiata’s parent company, IAG, has sponsored the Gradient Institute, a not-for-profit research company focusing on practical solutions for AI ethics, and we work closely with them to ensure that our systems adhere to the ethical standards we choose.

The Hidden Layer - Controlled Experimentation

Even if you have all the other elements of the data science stack in place, how do you learn about your customers? How do you know if what you are doing is working? This is where experimentation comes in. There are many types of experiments - here we are talking about “controlled experiments”, where the outcomes for one set of people receiving an experience or intervention are compared numerically with the outcomes for another set receiving a different experience, exactly as has been practised for years in medical randomised controlled trials.

Digital natives have been doing this at scale for years and have very established procedures and methodologies for experimentation - companies boast that they run hundreds or thousands of experiments concurrently and optimise their customer experience directly based on the feedback from those experiments.

There are three ways that experimentation interacts with the use of data science in business.

Learning about customers

It is now easy to build a model to predict certain aspects of customer behaviour. However, these models can be driven by correlation rather than causation - things that have a common cause, or just randomly happen together in the historical data, are used as a signal for the model in prediction. Within an experimental framework, cause and effect can be isolated from spurious correlation - this means the models represent an understanding of how the system works, and can be interpreted with respect to cause and effect with high reliability.
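To make the difference concrete, here is a minimal simulation sketch. Everything in it is hypothetical: the "engagement" trait, the offer, and all of the probabilities are invented for illustration. Engaged customers are both more likely to have been sent an offer historically and more likely to purchase, while the offer itself has no effect at all. The historical data suggests the offer works; the randomised experiment shows it does not.

```python
import random

def simulate(n, randomised, seed=0):
    """Return the purchase-rate gap (offer minus no offer) over n simulated customers."""
    rng = random.Random(seed)
    bought = {True: 0, False: 0}
    count = {True: 0, False: 0}
    for _ in range(n):
        # Hidden common cause (hypothetical): half of customers are "engaged".
        engaged = rng.random() < 0.5
        if randomised:
            # Controlled experiment: a coin flip, independent of engagement.
            offer = rng.random() < 0.5
        else:
            # Historical targeting: engaged customers mostly received the offer.
            offer = rng.random() < (0.8 if engaged else 0.2)
        # Purchase depends only on engagement; the offer has no true effect.
        purchase = rng.random() < (0.3 if engaged else 0.1)
        count[offer] += 1
        bought[offer] += purchase
    return bought[True] / count[True] - bought[False] / count[False]

observational_gap = simulate(100_000, randomised=False)
experimental_gap = simulate(100_000, randomised=True)
print(f"gap in historical data: {observational_gap:.3f}")
print(f"gap under randomisation: {experimental_gap:.3f}")
```

With these made-up numbers the historical gap sits around twelve percentage points, purely because of who was targeted, while the randomised gap hovers around zero. A model trained on the historical data would happily use the offer as a predictive signal.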

When can you say you have learnt something about your customers? It is when you can predict the overall outcomes of something you’ve never tried on them before. The ability to generalise is a key aspect of understanding - if your data science results can’t be used to generalise, you haven’t really learnt anything at all. All new business ideas are generated by a spark of creativity, driven to production through innovation, then live and die through the response of the market. Those creative sparks are generated from an internal understanding of how the system works - what is the value being provided, and how can it be realised. These assumptions must be checked to see if they are true generalisations or false ones.

Does my AI and machine learning provide value?

The most common form of testing in a digital company is A/B testing - the process of comparing two approaches by assigning customers to each at random. A/B tests allow organisations to determine when one approach is better than another, and to understand what is driving outcomes. Machine learning and AI systems also need to be tested in this way. For instance, an AI system that prompts a customer or user to do something they would have done anyway is not providing value. In other cases, an AI system may provide far more value than a rule-based system - but without an experiment, it will be unclear whether the improved value is really due to the AI or was due to a change in the market.

If you really want to know what works - run an experiment.
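As a rough illustration of what the analysis of a simple A/B test can look like, the sketch below applies a standard two-proportion z-test to conversion counts from two randomly assigned groups. The function name and the counts are hypothetical, and a real experimentation platform would also deal with sample-size planning, multiple metrics and repeated looks at the data.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of group A (control) and group B (treatment).

    Returns (lift, z, p_value), where lift is rate_b - rate_a and p_value
    is two-sided under the normal approximation.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both groups convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided tail probability of a standard normal: erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_b - rate_a, z, p_value

# Hypothetical counts: rule-based system (A) versus AI system (B).
lift, z, p_value = two_proportion_z_test(conv_a=200, n_a=2000, conv_b=260, n_b=2000)
print(f"lift = {lift:.3f}, z = {z:.2f}, p = {p_value:.4f}")
```

Because both groups are exposed to the same market over the same period, a difference of this kind can be attributed to the system being tested rather than to a change in conditions.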

Am I doing what I think I’m doing?

Finally, experimentation is tightly linked with ethics. Consider the advanced state of ethics related to medical trials - there are elaborate structures to govern the ethical use of experiments in medicine, because the inappropriate use of experimental methods can lead to great harm. If the treatment in a trial proves to be particularly good or particularly bad, then the trial needs to be halted - either to give a good treatment to more people, or to stop a bad treatment being given to anyone. Similar techniques must be adopted in experimentation in digital domains, particularly when there is potential harm for end users.

Proceeding without experimentation in digital business is as crazy as proceeding without experimentation in medicine would be. Don’t do it.


We’ve broken down the effective use of AI and machine learning in an organisation into five layers:

  • Data enablement
  • Modelling and Analytics
  • Production Deployment
  • Controlled Experimentation
  • AI and ML ethics

Every AI strategy needs these elements, but often the need for an experimentation framework is overlooked within a business. This is ironic, as it is the piece of the picture that represents the “science” in data science and has driven a huge increase in value in the largest digital businesses.

More recently, major digital companies such as Facebook, Google, LinkedIn, and Netflix have been exploring the convergence of experimentation and personalisation. These systems close the loop on human behaviour - retraining AI and machine learning systems so that they learn and adapt from every interaction they have. Within these systems, constant experiments are running - every potential action is explored with a fraction of the population. This has enormous advantages when it comes to ensuring that your experiments are efficient and to maximising the user’s experience of the service being provided.
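One of the simplest ways to picture "every potential action is explored with a fraction of the population" is an epsilon-greedy strategy, sketched below. This is only a toy illustration and not a description of how any of the companies above, or Atmosphere, actually work; the function name, the response rates and the 10% exploration fraction are all assumptions made for the example.

```python
import random

def epsilon_greedy(true_rates, rounds, epsilon=0.1, seed=0):
    """Simulate choosing between actions whose response rates are unknown.

    true_rates: hypothetical response rate of each action (hidden from the learner).
    Returns (shows, successes): per-action counts after all rounds.
    """
    rng = random.Random(seed)
    shows = [0] * len(true_rates)
    successes = [0] * len(true_rates)
    for _ in range(rounds):
        if rng.random() < epsilon or 0 in shows:
            # Explore: a fraction of customers receive a randomly chosen action.
            action = rng.randrange(len(true_rates))
        else:
            # Exploit: everyone else receives the best action observed so far.
            observed = [s / n for s, n in zip(successes, shows)]
            action = observed.index(max(observed))
        shows[action] += 1
        # Simulated customer response, drawn from the hidden true rate.
        successes[action] += rng.random() < true_rates[action]
    return shows, successes

shows, successes = epsilon_greedy([0.05, 0.10, 0.20], rounds=20_000)
print("times each action was shown:", shows)
```

The experiment never stops: a small random slice keeps measuring every action, while most customers get whatever currently looks best, so the cost of running the experiment shrinks as the evidence accumulates.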

This convergence of personalisation and experimentation through closed-loop machine learning, such as that enabled by our product Atmosphere, will be the subject of a future blog post.