What data science taught us about 211 email subject line phrases
Once you click the send button, your email subject line is the most important part of your campaign. This is a fact.
Think about your own experience in the inbox. Mine is something like this:
- Look at inbox
- Look at sender name
- Look at subject line
- Make a decision whether or not to open it and read the content
Your sender name is important as it conveys your brand value. But it’s not something you can change easily.
Your subject lines are important because testing and optimising them drives response rates like nothing else.
But, there’s a common problem with subject lines.
Dotmailer research reveals that 26% of email marketers' time is spent on email creative, 21% on deployment, and only 8% on testing. And over 70% of email marketers don't split test their subject lines very often.
Here’s the thing: if people don’t open your email, there’s a 0% chance they’ll see your creative.
It’s clear that more attention needs to be paid to subject lines. However, subject lines as a discipline lacks rigorous scientific and statistical investigation.
Research to reveal the importance of 'subject lines' in emails
I believe that there is a science to subject lines. So I did the research to prove it.
I took a random, anonymised sample of 700 million emails from phrasee.co, the subject line tester and generator. The data is from a collection of online retailers spanning numerous industries. It is mostly from the UK and the USA, with small amounts of data from other Anglophonic countries.
Simply looking at past data is myopic and risky. It assumes that tomorrow will be the same as yesterday. This is where most subject line analysis falls short.
I then applied some advanced statistics. I’ll spare you the gory details but, in a nutshell, I created a time-decayed Bayesian probability model. Then, I ran billions of Markov chain Monte Carlo simulations to generate a huge amount of predictive data. This allowed me to take a data set and extrapolate it to a much larger sample of outcomes. It also controlled for random variance, which makes the end results more robust, and it took into account the interdependence of phrases, not just binary outcomes.
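To make the idea concrete, here is a minimal sketch. It is not the actual Phrasee model. It assumes a simple Beta-Binomial model of open rates in which older sends are down-weighted by an exponential half-life, and it uses plain Monte Carlo draws from that posterior rather than Markov chain Monte Carlo. The function names, the 30-day half-life and the sample numbers are all hypothetical.

```python
import random


def decayed_posterior(sends, half_life_days=30.0, prior_a=1.0, prior_b=1.0):
    """Return Beta(a, b) parameters for an open rate, down-weighting old sends.

    sends: iterable of (age_days, opens, delivered) tuples (hypothetical format).
    """
    a, b = prior_a, prior_b
    for age_days, opens, delivered in sends:
        # Exponential time decay: a send one half-life old counts half as much.
        weight = 0.5 ** (age_days / half_life_days)
        a += weight * opens
        b += weight * (delivered - opens)
    return a, b


def simulate_open_rates(a, b, n_draws=10_000, seed=0):
    """Draw plausible open rates from the Beta(a, b) posterior."""
    rng = random.Random(seed)
    return [rng.betavariate(a, b) for _ in range(n_draws)]


def prob_beats(draws_x, draws_y):
    """Share of paired draws in which x has the higher simulated open rate."""
    wins = sum(x > y for x, y in zip(draws_x, draws_y))
    return wins / len(draws_x)


# Made-up send histories for two hypothetical phrases.
phrase_a = [(2, 220, 1000), (40, 150, 1000)]
phrase_b = [(3, 150, 1000), (45, 260, 1000)]

a_draws = simulate_open_rates(*decayed_posterior(phrase_a), seed=1)
b_draws = simulate_open_rates(*decayed_posterior(phrase_b), seed=2)
print(f"P(phrase A beats phrase B) is roughly {prob_beats(a_draws, b_draws):.2f}")
```

In this toy data, phrase B did well 45 days ago and poorly last week. Because of the decay, the recent result counts for more, which is the point of not assuming that tomorrow will look like yesterday.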
From here, I normalised the results and calculated a quality score for individual phrases. The quality score is derived from a combination of response metrics, time-decayed results, and external factors. It’s called the Phrasee Score™.
What is a Phrasee Score™?
Phrasee Scores™ are statistics that tell you how well (or poorly) a given phrase will perform in an email subject line. They are based upon our training data and predictive algorithms.
Scores range from 1 to 100. The higher the score the more reliably a phrase drives response.
Bear this in mind though: a high Phrasee Score™ doesn’t tell you that the given phrase will always deliver solid results for you. What it does tell you is that a phrase has above-average results with low variability. A phrase with a low score may have good results at times, but its variability is higher. It’s therefore less likely to be the causal variable of increased response.
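One way to picture "above-average results with low variability" is a score that rewards a phrase's average lift and penalises its spread. The sketch below is an illustration under assumptions, not the Phrasee Score™ formula. It takes the mean lift minus one standard deviation as a raw reliability value, then min-max scales the raw values onto 1 to 100. The function names, the penalty weight and the sample lifts are hypothetical.

```python
from statistics import mean, pstdev


def raw_reliability(lifts, penalty=1.0):
    """Mean lift minus a penalty for variability (population std deviation)."""
    return mean(lifts) - penalty * pstdev(lifts)


def scale_to_1_100(raw_by_phrase):
    """Min-max scale raw values so the worst phrase gets 1 and the best gets 100."""
    lo = min(raw_by_phrase.values())
    hi = max(raw_by_phrase.values())
    if hi == lo:
        # All phrases are equally reliable; put them mid-scale.
        return {phrase: 50 for phrase in raw_by_phrase}
    return {
        phrase: round(1 + 99 * (raw - lo) / (hi - lo))
        for phrase, raw in raw_by_phrase.items()
    }


# Made-up lifts over baseline open rate for three hypothetical phrases.
lifts = {
    "steady": [0.04, 0.05, 0.06, 0.05],
    "volatile": [0.30, -0.20, 0.25, -0.15],
    "weak": [-0.05, -0.04, -0.06, -0.05],
}

raw = {phrase: raw_reliability(values) for phrase, values in lifts.items()}
scores = scale_to_1_100(raw)
print(scores)
```

Here "steady" and "volatile" have the same average lift, yet "steady" lands at the top of the scale and "volatile" at the bottom, because the volatile phrase's occasional wins come with equally large losses.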
Phrasee Scores™ are an indication of quality. They will help you decide what to use in your subject lines. But they’re not silver bullets: there’s a lot more to subject line science than that. To understand more about how Phrasee uses this score to generate subject lines, read on.
Results of the analysis
I tested out 211 common phrases used in email subject lines that are focused on selling.
For the sake of keeping this blog post under a million words, I’ll abridge the results. You can download the email subject lines report.
Key findings from the subject line analysis report
I’ll break it up into a few subsections to aid readability and comprehension. I’ll show the top 5 and bottom 5 from a selection of the categories I analysed.
These are call-to-action phrases that are intended to elicit a specific behaviour.
These are subject lines formed as a question, checking specific inquisitive structures.
These are phrases that relate to a specific offer, discount or sale.
These are noun or verb modifiers that elicit an emotional response from email recipients.
These are phrases that use time-defined or stock-defined scarcity as an action driver. Clearly, anything to do with 'midnight' isn’t working too well.
I analysed much more as well: in total, I analysed 211 phrases. Beyond the Phrasee Score™, I also calculated open, click and click-to-open means and quartile metrics. You can download the full email subject line report.
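If you want to run the same kind of summary on your own campaigns, the sketch below shows one way to compute means and quartiles for open, click and click-to-open rates. It is only an illustration: the function names and campaign numbers are made up, and it assumes click-to-open means clicks divided by opens.

```python
from statistics import mean, quantiles


def campaign_rates(delivered, opens, clicks):
    """Open, click and click-to-open rates for one campaign."""
    return {
        "open": opens / delivered,
        "click": clicks / delivered,
        # Assumed definition: clicks as a share of opens.
        "click_to_open": clicks / opens if opens else 0.0,
    }


def rate_summary(rates):
    """Mean plus quartiles (Q1, median, Q3) for a list of rates."""
    q1, median, q3 = quantiles(rates, n=4, method="inclusive")
    return {"mean": mean(rates), "q1": q1, "median": median, "q3": q3}


# Made-up campaigns for one hypothetical phrase: (delivered, opens, clicks).
campaigns = [
    (1000, 200, 40),
    (1000, 250, 50),
    (1000, 150, 15),
    (1000, 300, 90),
    (1000, 100, 20),
]

per_campaign = [campaign_rates(*c) for c in campaigns]
summary = {
    metric: rate_summary([rates[metric] for rates in per_campaign])
    for metric in ("open", "click", "click_to_open")
}

for metric, stats in summary.items():
    print(metric, {name: round(value, 3) for name, value in stats.items()})
```

Quartiles are worth having alongside the mean because a single runaway campaign can pull the mean up while the median barely moves.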
Key email marketing takeaways
From doing this analysis, I learned the following things:
- There is a science to subject lines
- Testing subject lines is key to learning what works
- Advanced statistics can teach you a lot about subject lines
- Don’t use the word 'midnight'
What are your experiences?
Phrasee uses machine learning to optimise your email subject lines. Take a free trial of phrasee.co and benefit from subject line science in action.
Thanks to Parry Malm for sharing his advice and opinions in this post. Parry is CEO of Phrasee, and President of agency Howling Mad. He has worked with countless brands and media outlets to help them optimise their online results. He’s one of the world’s leading experts on email marketing. He started his career coding middleware for CRM software, then sent out millions of emails for global brands, before running the strategy department for an ESP. He holds a BBA (1st) in Marketing & Statistics and can probably beat you in an Excel-off. On weekends, he helps wayward youths see the error of their ways through the magic of interpretive dance. You can follow the company on Twitter or LinkedIn.