Why We Limit Split-Testing Slots

20th Oct 2017

Testing a few ideas at a time, and doing so repeatedly, is a more efficient way to compare multiple, mutually exclusive ideas than testing them all at once. Tests conclude faster, so you can bank successive marginal gains rather than waiting for one big pay-off while losing money on underperforming variants that stay live for too long. Furthermore, by iterating through shorter tests and reflecting on your results more often, you can come up with better ideas.

What type of testing are we talking about here?

Amigo uses A/B testing to inform marketers about the performance of their campaigns. Equal traffic is sent to a control and a variant (or a champion and a challenger, as they are sometimes called). A winner is declared once the results have converged sufficiently for us to be confident of the answer. We define ‘sufficiently’ using Bayesian statistical inference and the expected value of the loss function. This is the fastest reliable way to determine whether a new idea is better than the status quo.
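To make the stopping rule more concrete, here is a minimal sketch of how an expected-loss check might look. It is not Amigo's implementation: the uniform Beta(1, 1) prior, the Monte Carlo estimate, the function names and the 0.0005 threshold are all assumptions chosen for illustration.

```python
import random


def expected_loss(conv_a, n_a, conv_b, n_b, draws=20000, seed=0):
    """Estimate, by Monte Carlo, the expected loss of choosing each arm.

    Assumes a uniform Beta(1, 1) prior, so each conversion rate has a
    Beta(1 + conversions, 1 + non-conversions) posterior.
    """
    rng = random.Random(seed)
    loss_a = loss_b = 0.0
    for _ in range(draws):
        rate_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        rate_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        loss_a += max(rate_b - rate_a, 0.0)  # conversion rate given up by picking A
        loss_b += max(rate_a - rate_b, 0.0)  # conversion rate given up by picking B
    return loss_a / draws, loss_b / draws


def decide(conv_a, n_a, conv_b, n_b, threshold=0.0005):
    """Stop once the better-looking arm's expected loss is below the threshold.

    The threshold is a hypothetical tolerance, not a recommended value.
    """
    loss_a, loss_b = expected_loss(conv_a, n_a, conv_b, n_b)
    if min(loss_a, loss_b) >= threshold:
        return "keep testing"
    return "control" if loss_a <= loss_b else "variant"
```

With little data, both expected losses stay above the threshold and the test keeps running. As data accumulates, the expected loss of the better arm shrinks towards zero and the test can end. This sketch does not report ties separately.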

Amigo also conducts A/B/n tests, where a control is compared to more than one variant. However, we strongly advise testing only a limited number of variants at once. While marketing teams may come up with tens, even hundreds, of ideas, they ought to be tested in sequential pairs or triplets rather than all at the same time.

Why test like this?

Bayesian A/B testing is all about confidence. The more you know about something, the more confident you can be about it. The more data you have about a variant in your split test, the more confident you can be about its conversion rate. As your confidence about the conversion rates grows, so will your confidence about the difference between them. Eventually, you can confidently declare a winner (or a tie) and end the test.
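As a rough illustration of how confidence grows with data, the sketch below computes the standard deviation of a Beta posterior over a conversion rate. The uniform prior and the function name are assumptions made for the example; the formula itself is the standard variance of a Beta distribution.

```python
from math import sqrt


def posterior_std(conversions, visitors):
    """Standard deviation of the Beta posterior over a conversion rate.

    Assumes a uniform Beta(1, 1) prior (an illustrative choice).
    """
    a = 1 + conversions
    b = 1 + visitors - conversions
    return sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))


# The same observed 5% conversion rate, at three different sample sizes.
for visitors in (100, 1_000, 10_000):
    conversions = visitors // 20
    print(visitors, round(posterior_std(conversions, visitors), 4))
```

The observed rate is the same in each case, but the uncertainty around it narrows as the number of visitors increases.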

Why test fewer variants?

Split tests converge faster when they contain fewer variants. When you send a higher proportion of your traffic to each variant, you collect data about each one more quickly, which means you become confident about its performance sooner.

A/B/n tests with too many variants take too long to pass. Each variant receives less traffic and therefore generates less data. You will eventually collect enough data to end the test, but you will have kept underperforming variants in play for an unnecessarily long time. On the other hand, by running a sequence of tests on pairs or triples of variants you can make marginal gains as you go about searching for the best variant.
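The traffic arithmetic can be sketched as follows. This is a deliberate simplification: it assumes each arm needs a fixed number of visitors before a test can end, whereas a Bayesian test actually stops on expected loss. The function name and the figures are hypothetical.

```python
def days_to_fill(arms, daily_visitors, visitors_needed_per_arm):
    """Days until every arm has seen the required number of visitors.

    Assumes traffic is split equally between arms; rounds up to whole days.
    """
    total_needed = visitors_needed_per_arm * arms
    return (total_needed + daily_visitors - 1) // daily_visitors


# Hypothetical figures: 1,000 visitors a day, 5,000 visitors needed per arm.
ab_test_days = days_to_fill(2, 1000, 5000)     # control plus one variant
abn_test_days = days_to_fill(11, 1000, 5000)   # control plus ten variants
print(ab_test_days, abn_test_days)
```

Under these assumptions the two-arm test can end, and its winner can start earning, long before the eleven-arm test has gathered enough data on any of its variants.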

By testing a handful of variants each time, you will discover the best variant at least as quickly as if you had tried to test all your ideas at once. In fact, depending on how many variants you want to try, you are very likely to discover the best one much more quickly.

Why is it better to work like this?

Iterating through smaller tests like this also promotes a better way of working. By engaging with the data more frequently, you can learn from your past hypotheses and generate better ones. The ideas you come up with after a few tests are very likely to be better ideas than the ones you come up with at the start of your testing process.

This is one reason why we opt for A/B and A/B/n testing ahead of other options. The problem of how to test your marketing ideas in a way that minimises losses is often understood as a tension between exploration and exploitation. You can balance these two aims with A/B testing but there are other approaches, such as an epsilon-greedy bandit algorithm.

(This set of algorithms is referred to as ‘bandit algorithms’ because the exploration/exploitation problem was first explained by a statistician by analogy with a gambler facing multiple fruit machines, each with a different probability of paying out.)

Sophisticated bandit algorithms can be too complex for most marketing experiments, but they also suffer from another, perhaps more important, flaw. The advantage of an epsilon-greedy bandit algorithm, for example, is that it will allocate traffic dynamically to the variant that it currently assumes to be the best, while reserving a small percentage of traffic for testing the others. This means marketers can “set it and forget it.”
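For readers unfamiliar with the approach, a minimal epsilon-greedy sketch might look like the following. The function name, the default epsilon of 0.1 and the choice to explore uniformly across all variants are illustrative assumptions, not a description of any particular product.

```python
import random


def choose_variant(conversions, visitors, epsilon=0.1, rng=random):
    """Pick the index of the variant to show to the next visitor.

    With probability epsilon, explore by picking any variant at random;
    otherwise exploit the variant with the best observed conversion rate.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(visitors))
    rates = [c / v if v else 0.0 for c, v in zip(conversions, visitors)]
    return max(range(len(rates)), key=rates.__getitem__)


# Hypothetical running totals for three variants.
conversions = [5, 9, 2]
visitors = [100, 100, 100]
print(choose_variant(conversions, visitors))
```

Most visitors are sent to whichever variant currently looks best, and the list of variants never has to change, which is exactly what makes the approach so hands-off.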

This is the opposite of agile marketing. We don’t want to encourage marketers to come up with a list of ideas, throw them all into the algorithm, and see what sticks. We want marketers to come up with better ideas, by testing the best ideas they currently have and honing their thinking according to the results. An agile method requires iteration, and it is preferable because the ideas you come up with ten tests down the line are very likely to be better than any you could have come up with at the beginning.

Find out more about how Amigo enables marketers to increase profits faster by being agile and closing the marketing execution gap.
