Which words in Email Subject lines drive the best response?

By Expert commentator, 02 May, 2017

What data science taught us about 211 email subject line phrases

Once you click the send button, your email subject line is the most important part of your campaign. This is a fact.

Think about your own experience in the inbox. Mine is something like this:

  1. Look at inbox
  2. Look at sender name
  3. Look at subject line
  4. Make a decision whether or not to open it and read the content

Your sender name is important as it conveys your brand value. But it’s not something you can change easily.

Your subject line is important because testing and optimising it drives response rates like nothing else.

But, there’s a common problem with subject lines.

Dotmailer research reveals that 26% of email marketers' time is spent on email creative, 21% on deployment, and only 8% on testing. And over 70% of email marketers don't split test their subject lines very often.

Here’s the thing: if people don’t open your email, there’s a 0% chance they’ll see your creative.

It’s clear that more attention needs to be paid to subject lines. However, subject lines as a discipline lacks rigorous scientific and statistical investigation.

Research to reveal the importance of 'subject lines' in emails

I believe that there is a science to subject lines. So I did the research to prove it.


I took a random, anonymised sample of 700 million emails from phrasee.co, the subject line tester and generator. The data is from a collection of online retailers spanning numerous industries. It is mostly from the UK and the USA, with small amounts of data from other Anglophonic countries.

Simply looking at past data is myopic and risky. It assumes that tomorrow will be the same as yesterday. This is where most subject line analysis falls short.

I then applied some advanced statistics. I’ll spare you the gory details but, in a nutshell, I created a time-decayed Bayesian probability model. Then, I ran billions of Markov chain Monte Carlo simulations to generate a huge amount of predictive data. This allowed me to take a data set and extrapolate it to a much larger sample of outcomes. It also controlled for random variance, which makes the end results more robust, and it took into account the interdependence of phrases, not just binary outcomes.
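To make the idea of a time-decayed Bayesian model a little more concrete, here is a minimal sketch. It is not the model described above: it swaps Markov chain Monte Carlo for plain Monte Carlo sampling from a Beta posterior, it ignores interdependence between phrases, and the 90-day half-life, the flat prior, the function names and the campaign numbers are all assumptions made up for illustration.

```python
import random

def decayed_counts(campaigns, half_life_days=90.0):
    """Sum opens and non-opens, halving a campaign's weight every half_life_days."""
    opens = non_opens = 0.0
    for age_days, sent, opened in campaigns:
        weight = 0.5 ** (age_days / half_life_days)  # assumed decay rule
        opens += weight * opened
        non_opens += weight * (sent - opened)
    return opens, non_opens

def simulate_open_rates(campaigns, draws=10_000, seed=42):
    """Draw plausible open rates from a Beta posterior with a flat Beta(1, 1) prior."""
    opens, non_opens = decayed_counts(campaigns)
    rng = random.Random(seed)
    return [rng.betavariate(1.0 + opens, 1.0 + non_opens) for _ in range(draws)]

# Hypothetical data: (age in days, emails sent, emails opened).
with_phrase = [(10, 1000, 220), (200, 1000, 150)]
without_phrase = [(12, 1000, 180), (190, 1000, 170)]

a = simulate_open_rates(with_phrase, seed=1)
b = simulate_open_rates(without_phrase, seed=2)

# Share of simulated outcomes in which the phrase beats the baseline.
prob_better = sum(x > y for x, y in zip(a, b)) / len(a)
print(f"P(phrase lifts open rate) ~ {prob_better:.2f}")
```

Recent campaigns dominate the estimate because older ones are down-weighted, and the simulated draws give a spread of plausible outcomes rather than a single historical average.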

From here, I normalised the results and calculated a quality score for individual phrases. The quality score is derived from a combination of response metrics, time-decayed results, and external factors. It’s called the Phrasee Score™.

What is a Phrasee Score™?

Phrasee Scores™ are statistics that tell you how well (or poorly) a given phrase will perform in an email subject line. They are based upon our training data and predictive algorithms.

Scores range from 1 to 100. The higher the score the more reliably a phrase drives response.

Bear this in mind though: a high Phrasee Score™ doesn’t tell you that the given phrase will always deliver solid results for you. What it does tell you is that a phrase has above-average results with low variability. A low score may have good results at times, but the variability is higher. It’s therefore less likely to be the causal variable of increased response.
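The trade-off between average result and variability can be illustrated with a toy score. This is emphatically not the Phrasee Score™ formula, which is not disclosed here; the function name, the logistic squashing and the sample lifts are assumptions chosen only to show how a steady phrase can outscore an erratic one with a higher average.

```python
import math
import statistics

def reliability_score(lifts):
    """Toy 1-100 score: mean lift divided by its spread, squashed with a logistic.

    `lifts` holds per-campaign differences from the average response rate.
    Illustrative only; not the Phrasee Score formula.
    """
    mean_lift = statistics.mean(lifts)
    spread = statistics.stdev(lifts)
    ratio = mean_lift / max(spread, 1e-9)   # avoid division by zero
    ratio = max(-10.0, min(10.0, ratio))    # keep math.exp in a safe range
    squashed = 1.0 / (1.0 + math.exp(-ratio))
    return round(1 + 99 * squashed)

# Hypothetical lifts over the average open rate, one value per campaign.
steady = [0.02, 0.03, 0.025, 0.02, 0.03]     # modest but consistent
erratic = [0.15, -0.10, 0.12, -0.08, 0.06]   # higher average, wild swings

print("steady:", reliability_score(steady))
print("erratic:", reliability_score(erratic))
```

In this sketch the erratic phrase has the better average lift, yet the steady phrase scores higher because its results vary far less from campaign to campaign.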

Phrasee Scores™ are an indication of quality. They will help you decide what to use in your subject lines. But they’re not silver bullets: there’s a lot more to subject line science than that. To understand more about how Phrasee uses this score to generate subject lines, read on.

Results of the analysis

I tested out 211 common phrases used in email subject lines that are focused on selling.

For the sake of keeping this blog post under a million words, I’ll abridge the results. You can download the email subject lines report.

Key findings from the subject line analysis report

I’ll break it up into a few subsections to aid readability and comprehension. I’ll show the top 5 and bottom 5 from a selection of the categories I analysed.

  • Action words

These are call-to-action phrases that are intended to elicit a specific behaviour.

[Image: CTA words]

  • Questions

These are subject lines formed as a question, checking specific inquisitive structures.

[Image: subject lines as questions, scored research]

  • Sale phrases

These are phrases that relate to a specific offer, discount or sale.

[Image: sale phrases, scored research]

  • Superlatives

These are noun or verb modifiers that elicit an emotional response from email recipients.

[Image: superlatives, scored research]

  • Urgency

These are phrases that use time- or stock-defined scarcity as an action driver. Clearly, anything to do with 'midnight' isn’t working too well.

[Image: urgency words, scored research]

I analysed much more as well: in total, I analysed 211 phrases. Beyond the Phrasee Score™, I also calculated open, click and click-to-open means and quartile metrics. You can download the full email subject line report.
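If you want to produce the same kind of summary for your own campaigns, the sketch below computes open, click and click-to-open rates and then their means and quartiles. The campaign figures and function names are made up for illustration, and click-to-open is assumed to mean clicks divided by opens.

```python
import statistics

def rates(sent, opened, clicked):
    """Return (open rate, click rate, click-to-open rate) for one campaign."""
    open_rate = opened / sent
    click_rate = clicked / sent
    click_to_open = clicked / opened if opened else 0.0
    return open_rate, click_rate, click_to_open

def summarise(values):
    """Mean plus the three quartile cut points of a list of rates."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {"mean": statistics.mean(values), "q1": q1, "median": median, "q3": q3}

# Hypothetical campaigns that used one phrase: (sent, opened, clicked).
campaigns = [
    (1000, 200, 40),
    (1000, 250, 30),
    (1000, 180, 45),
    (1000, 220, 22),
    (1000, 150, 15),
]

open_rates, click_rates, cto_rates = zip(*(rates(*c) for c in campaigns))
for name, values in [("open", open_rates), ("click", click_rates),
                     ("click-to-open", cto_rates)]:
    print(name, summarise(values))
```

Looking at quartiles alongside the mean shows how widely a phrase's results are spread, which a single average hides.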

Key email marketing takeaways

From doing this analysis, I learned the following things:

  1. There is a science to subject lines
  2. Testing subject lines is key to learning what works
  3. Advanced statistics can teach you a lot about subject lines
  4. Don’t use the word 'midnight'

What are your experiences?

Phrasee uses machine learning to optimise your email subject lines. Take a free trial of phrasee.co and benefit from subject line science in action.

This is a post we've invited from a digital marketing specialist who has agreed to share their expertise, opinions and case studies. Their details are given at the end of the article.