
17 ways to F**k-up your AB Testing

By Expert commentator, 16 Sep 2014

Learning to avoid Dumb AB testing mistakes

Back in 2005, I started my first forays into split testing emails, landing pages and funnels whilst working at LOVEFiLM. I'm nearly at my 10 year anniversary, and after seeing over 40 million visitors go through split tests, I've become rather battle-hardened to all the curious, strange and just plain stupid ways I've managed to break my own experiments.

I'll be doing a talk this Wednesday at the Smart Insights Digital Marketing conference on how to improve structured testing to boost profits, so to introduce it, I wanted to write a little about one critical way we can all improve our AB and Multivariate testing.  So, in this article, I'll be exploring device and browser compatibility or 'Getting with what your customers have at their fingertips'.

AB testing your developers instead of your customers

Firstly, let me use an analogy and an explanation.  I once interviewed a developer and asked a question as part of the process - 'What's the best way to code your sites to get maximum reach and compatibility with what customers actually use?' After a long and rambling explanation, he finally finished with the statement 'Well if people want to use Internet Explorer 8, then they're just effing stupid'.

I then looked at his CV and asked a question, as I knew he had two young children - 'So if your wife and you take your double buggy for a shopping expedition and it won't fit through shop doorways - is that because you were too effing stupid to buy the right one then? Nothing to do with the retailer? All the customer's fault then?' Suffice to say we didn't hire him, but it's a good example of customer-unfriendly bias towards what people use.

What happens at most companies is that there is an assumption that is never put to the test: 'We test all the popular browsers on our website'. For many teams, this testing work is left to the developers, yet they often lack the key knowledge to make the right call on what should be tested.

In a future article with Smart Insights, I'll be showing you (at least in Google Analytics) how to draw up a testing list which mirrors real customer devices and browsers. For now, let's assume somebody has a list at your company. Who drew this up? When was it last updated? Has it ever been updated? How much of your customer base does your testing actually reach? Who is responsible for this whole thing? Sadly, this is often something that falls through the cracks in an organisation, at least in terms of someone taking clear ownership and responsibility.

Why should I care about AB testing?

To be blunt, because this is costing you money (all of you and yes, you, reading this too).  And why can I make this claim?  Well let's look at some anecdotal evidence - in the last week, I've seen the following:

  • A parking permit and ticket site that failed to work in the most popular browser, Chrome!
  • An e-commerce fashion site which broke in Internet Explorer 8
  • An add to basket for an electronic cigarette company which failed in Safari
  • A picture carousel on a product page which didn't work in Chrome or Safari

These are pretty big issues, aren't they? At least one of these hits a big chunk of customers and simply prevents them from giving you money! So what, precisely, has this got to do with AB testing?

Well - it's simply down to the fact that if you have bugs like these on your site, they're probably biasing or breaking your AB testing too!

One company I worked with got 1.5% of their revenue each month from Internet Explorer 8.  They couldn't be bothered to do the testing or fixing I advised until I quantified the loss.

I explained that if they could make Internet Explorer 8 convert to checkout at anywhere near the rate of Internet Explorer 9, 10 or 11, this would rise to nearly 6% share of revenue.  For two days of work, 8 million pounds per year was the payback.
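As a back-of-the-envelope sanity check, the payback arithmetic looks something like the sketch below. The total revenue figure is an illustrative assumption on my part (the article only gives the percentages and the £8m payback), not the company's actual number:

```python
# Illustrative, hedged estimate: the total revenue is an assumed figure,
# chosen only to show how the percentages turn into a payback number.

annual_revenue = 180_000_000   # assumed total annual revenue in pounds
ie8_share_now = 0.015          # IE8 currently delivers 1.5% of revenue
ie8_share_fixed = 0.06         # ~6% share if IE8 converted like IE9/10/11

# The uplift is simply the extra share of revenue, applied to the total.
uplift = annual_revenue * (ie8_share_fixed - ie8_share_now)
print(f"Estimated annual uplift: £{uplift:,.0f}")  # → £8,100,000
```

On those assumptions, two days of compatibility work against a seven-figure annual return is an easy business case to make.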

Why does it break things so badly?

Well - if you were working in a retail store and you could see people struggling to get through a doorway, you'd do something about it.  Let's assume you've got a bug on your site though that hits Safari users during checkout. This is dampening your sales but you haven't noticed it because customers haven't complained or your call centre hasn't passed this on.  It simply becomes part of the background average conversion on your site - because unlike the retail store owner - you might not be seeing the problem.

Clearly, if you have these holes, they're a massive opportunity if you can find them and fix them. However, these holes aren't necessarily making you change your digital strategy in major ways - they're just you punching below your weight.

When you start doing AB testing though, these browser and device issues become magnified.

What happens during testing?

So unlike a sitewide browser or device compatibility issue, let's assume you're now running an AB test between two checkout systems you're trying out.  You put the test together, run some basic checking in your company standard browser and then start it going.

In the original checkout, it was just a really long form on one page, which was putting customers off.  You decided to test a new checkout, which nicely chunks each step into a separate page.  Your developers put this together with some really clever JavaScript and it looks slick.  You're confident the new design will win but it fails badly.  What happened?

During the test, the new version didn't work and you all scratched your heads.  'The new design should have won!' 'I can't believe the new one isn't beating the old one' and 'Well that's what the testing tells us - so we have to go with that'.

You then decide as a company to stop building the new design because, after all, the testing told you it didn't work.  But what if you were wrong?  What if something was broken in the test which biased the result?
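To see how badly a broken build can skew a result, here's a minimal sketch with made-up numbers: the new checkout is genuinely better (6% vs 5% conversion), but a bug zeroes out conversions for the 20% of visitors on one browser:

```python
# Hypothetical numbers throughout -- this just shows the mechanism by which
# a browser bug in one variant flips an AB test's conclusion.

def observed_rate(true_rate, broken_share):
    """Blended conversion rate when a share of traffic sees a broken page."""
    return true_rate * (1 - broken_share)

control = 0.05          # old checkout converts at 5% in every browser
variant_true = 0.06     # new checkout would convert at 6% if it worked
variant_observed = observed_rate(variant_true, broken_share=0.20)

print(f"Control: {control:.3f}, variant as measured: {variant_observed:.3f}")
# The variant measures 4.8% -- the genuinely better design now appears
# to LOSE to the control, purely because of the compatibility bug.
```

The test didn't measure your design at all; it measured your bug.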

I work with some clever developers and product people so I get to see a lot of AB tests being assembled and checked.  Roughly 40% of all the builds I've looked at in the last two years have had one or more bugs in particular browsers.

Now that's a fairly high failure rate and I like to think we're pretty experienced at doing this testing lark, so that convinced me I should always be doing QA on every experiment, without fail.

It simply means that if you aren't checking your AB tests work in all the devices (tablets, mobile, desktop, laptop) and browsers (Chrome, Internet Explorer, Firefox, Safari) that your customers use, your test is probably a pile of dog poo.

And it's not just the failure of the test that hits you.  You get limited takeoff slots for tests to run so making efficient use of every test is important.  Having a broken test is going to slow you right down and give you the wrong conclusion.

This is the biggest problem though - when you reach the wrong conclusion, you'll change your business strategy around that insight, even if it was completely wrong.  And that's a risk you shouldn't be taking.

So how do I fix this?

There are three things you can do:

  • 1. Draw up a test list (I'll cover this in a later article)
  • 2. Test your devices and browsers
  • 3. Monitor your results

Making your test list

This is the subject of a longer article to come but in summary, you need to draw up a list of mobile devices, tablets and desktop/laptop browsers for your testing.  You may not be able to test everything that your customers use but you should be hitting 80+% of the devices and browsers that hit your site.
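The coverage calculation itself is simple: rank browsers and devices by traffic share and work down the list until you pass your target. Here's a sketch with placeholder session counts (pull the real figures from your analytics tool):

```python
# Sketch of drawing up a test list from analytics data. The session counts
# below are made-up placeholders, not real traffic figures.

from operator import itemgetter

sessions_by_browser = {
    "Chrome 37": 41000, "Safari 7 (iOS)": 22000, "IE 11": 12000,
    "Firefox 32": 9000, "IE 8": 6000, "Android Browser": 5000,
    "Opera 24": 1500, "IE 7": 900,
}

def build_test_list(sessions, target=0.8):
    """Return the smallest set of browsers covering `target` of all sessions."""
    total = sum(sessions.values())
    ranked = sorted(sessions.items(), key=itemgetter(1), reverse=True)
    covered, test_list = 0, []
    for browser, count in ranked:
        if covered / total >= target:
            break
        test_list.append(browser)
        covered += count
    return test_list, covered / total

browsers, coverage = build_test_list(sessions_by_browser)
print(browsers, f"{coverage:.1%}")
```

With these placeholder figures, testing just the top four browsers already covers over 86% of sessions - the long tail costs far more effort per customer reached.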

Testing mobile and tablet devices

For testing tablet and mobile devices, you should have a list of the key handsets (and versions) that are arriving at your site.  Most of these will be using the pre-installed browser they came with although you should always check the figures.

You have five choices when it comes to testing mobiles and tablets: real devices, cloud devices, an open device lab, friends and family, or simulators.

Real devices

This is the best option.  If you can buy the mobile handsets or tablets that represent large chunks of your customer base, you can run tests in the office.  You can source unlocked handsets pretty cheaply on Amazon if you want to build your own device lab internally.

Cloud devices

This is the next best option. Appthwack and Deviceanywhere both offer a similar service. They install phones in a server cabinet and connect their screens and keyboards remotely to any location in the world. You can basically 'remote control' a real phone and see what happens on the screen.

With these services, you can install apps, send text messages, browse the internet or test your AB test - using a real device on a real network.  It's the closest thing to having a mobile or tablet device in your hand.

Open device lab

These are located all round the world - generous people and companies donate their handsets so that anyone can book a device lab for testing or exploration. You call up, book a time and then go use their devices for testing. It costs nothing and is a good way to find a heap of devices in one place.

Friends and Family

If you're stuck, you can often find friends, family or colleagues who have handsets or tablets that represent big chunks of your customer base.  If you can't find all the handsets you need then testing something is better than nothing.


Simulators

I've used simulators with developers for testing before and it simply doesn't work. When you're testing a site with lots of JavaScript, you really need the real device, as you don't hit the same problems in a simulator. It's fine to use one to see how a page might render, and that might be reasonably successful. However, it's not useful for checking all the interactions on a page and expecting these to work flawlessly in the real world.

Testing desktop browsers

There are some excellent services for testing desktop browsers (like Internet Explorer, Safari, Firefox and Google Chrome) and they're very reasonably priced.

Why am I bothering to test this stuff?

If you don't test your site or your AB tests (before they go live) using real devices or tools - then you're going to break something.  At the very least, your testing or analytics insights will be skewed and quite possibly, you'll make entirely the WRONG business or strategy decision from following a broken test result.

Fixing these holes is also a huge money opportunity - making your site work with what your customers actually use - but when it comes to split testing, leaving them in place could undermine the work of the entire team and make people lose trust in your testing programme.

Key Takeaways on AB Testing

If you want to get insightful AB tests or discover easy ways to squeeze extra money out of your site, then start doing device and browser testing.

The first step you might take today is to look at the conversion rate for your top browsers and devices coming to your site.  You might start asking questions about why there are such big differences and you might find a hole in your site or AB test too.
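That first look can be as simple as the sketch below: compare each browser's conversion rate against the site average and flag anything well below it. The traffic figures are hypothetical; a big gap is often a compatibility bug rather than a genuine audience difference:

```python
# Quick first check with hypothetical figures -- replace the dictionary
# with your own analytics export of sessions and orders per browser.

traffic = {
    # browser: (sessions, orders)
    "Chrome":  (50000, 1500),
    "Safari":  (20000, 240),   # suspiciously low: 1.2% vs ~2.6%+ elsewhere
    "IE 11":   (15000, 420),
    "Firefox": (10000, 280),
}

site_rate = sum(o for _, o in traffic.values()) / sum(s for s, _ in traffic.values())

# Flag any browser converting at less than half the site average.
flagged = [
    browser for browser, (sessions, orders) in traffic.items()
    if orders / sessions < 0.5 * site_rate
]
print(f"Site average: {site_rate:.2%}, investigate: {flagged}")
```

Anything that gets flagged is worth a manual walk through your checkout in that exact browser before you blame the audience.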

Hopefully this article has made you curious to find out if you're actually testing real stuff or just your developers’ ability to break things for customers.

Craig will explain more ways to avoid these mistakes at the Smart Insights conference on Wednesday 17th September.



This is a post we've invited from a digital marketing specialist who has agreed to share their expertise, opinions and case studies. Their details are given at the end of the article.
