Semantic search is a data retrieval method where user intent and resources are represented in a semantic model. So what do you need to know about it?
Semantic search is not a new concept in the SEO industry but many digital marketers are still confused about what it is and what to do about it.
What is semantic search (also referred to as "semantic SEO") and how can it help improve organic search visibility?
Here's what digital marketers should know about semantic search.
What is semantic search?
Semantic search is a data retrieval method where user intent and resources are represented in a semantic model, i.e. a hierarchy among concepts as well as relationships (connectivity) among concepts and entities (Source: Peter Mika, Senior Research Scientist at Yahoo Labs).
Put simply, semantic search analysis aims at determining the intent of the user (i.e. what does this searcher mean to find?) and contextual meaning of the query.
Unlike lexical search (that matches web pages to a keyword string), semantic search is about matching pages based on the meaning and the context.
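As a rough illustration of that difference, the sketch below contrasts a lexical matcher with a toy concept-based one. The `CONCEPTS` table, the function names and the sample page are all hypothetical and hand-written for this example; a real search engine learns such relationships at scale rather than reading them from a lookup table.

```python
# Toy mapping from surface words to concept labels (an assumption made for
# illustration only; not how any real search engine stores meaning).
CONCEPTS = {
    "cat": "pet", "cats": "pet", "kitten": "pet", "pet": "pet", "pets": "pet",
    "missing": "lost", "lost": "lost",
}


def lexical_match(query, page):
    """Match only if the exact keyword string appears in the page."""
    return query.lower() in page.lower()


def to_concepts(text):
    """Map each known word in the text to its concept label."""
    words = text.lower().replace(",", " ").replace(".", " ").split()
    return {CONCEPTS[w] for w in words if w in CONCEPTS}


def semantic_match(query, page):
    """Match if every concept found in the query is also found in the page."""
    query_concepts = to_concepts(query)
    return bool(query_concepts) and query_concepts <= to_concepts(page)


page = "What to do when your kitten is lost."
print(lexical_match("cat missing", page))   # False: the string is not on the page
print(semantic_match("cat missing", page))  # True: both concepts are covered
```

The page never contains the string "cat missing", so the lexical matcher rejects it, while the concept-based matcher accepts it because "kitten" and "lost" map to the same concepts as the query.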
Semantic search examples
Here’s what a semantic-analysis-based search engine (i.e. Google) knows about "Game of Thrones":
- It is an entity (drama series, fantasy series, television series, etc.).
- It is based on George R.R. Martin's book.
- The book is called "A Song of Ice and Fire".
- The show writers are George R. R. Martin, David Benioff, D. B. Weiss, Bryan Cogman, Vanessa Taylor, Jane Espenson, Dave Hill.
- It is done by HBO (a television network).
- The cast includes Emilia Clarke, Kit Harington, ...
- Characters include Tyrion Lannister, Daenerys Targaryen, ...
- It is about winter, kingdoms, ice, fire, bastards, ...
Obviously, Google knows much more, but you get the idea. Besides, there's more to topic modeling than simply clustering the topic. There's also a hierarchy that helps Google understand how all those concepts and entities relate to one another.
All of these entities (names, locations, etc.) and related concepts (drama, episodes, TV series, winter, etc.) constitute Google's model of the topic and allow it to evaluate any page based on how much useful information is included.
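To make the idea of a topic model more concrete, here is a minimal sketch that stores a few of the facts listed above as (subject, relation, object) triples and measures how many of the topic's terms a page mentions. The relation labels, the `coverage` function and the sample page are assumptions made for this example; this is not how Google actually evaluates pages.

```python
# Hypothetical (subject, relation, object) facts based on the list above.
FACTS = [
    ("Game of Thrones", "is a", "television series"),
    ("Game of Thrones", "based on", "A Song of Ice and Fire"),
    ("A Song of Ice and Fire", "written by", "George R. R. Martin"),
    ("Game of Thrones", "produced by", "HBO"),
    ("Game of Thrones", "cast member", "Emilia Clarke"),
    ("Game of Thrones", "cast member", "Kit Harington"),
    ("Game of Thrones", "character", "Tyrion Lannister"),
    ("Game of Thrones", "character", "Daenerys Targaryen"),
]


def topic_terms(facts):
    """Collect every entity and concept named in the facts."""
    terms = set()
    for subject, _relation, obj in facts:
        terms.add(subject)
        terms.add(obj)
    return terms


def coverage(page_text, facts):
    """Return the share of topic terms that the page mentions (0.0 to 1.0)."""
    terms = topic_terms(facts)
    text = page_text.lower()
    mentioned = {term for term in terms if term.lower() in text}
    return len(mentioned) / len(terms)


page = "Game of Thrones is an HBO television series based on A Song of Ice and Fire."
print(round(coverage(page, FACTS), 2))
```

The triples capture the relationships between entities, not just the entities themselves, which is the part a flat keyword list would miss. The coverage score is deliberately naive: it only checks whether a term appears on the page, not whether the page says anything useful about it.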
Thanks to that topic modeling, Google is also able to guess entities even when you type seemingly generic terms into the search box.
To see semantic search in action, here's another example: if you were to search for [cat missing], Google would use semantic analysis to include things like:
- Related concepts (e.g. "pets", not just "cats"; "lost", not just "missing"; as well as "pet shelters" and "animal control centers").
- Related areas.
- Other possible ways to help you out.
Some of those included concepts are synonyms, some of them are closely related terms, and a few of them are additional resources that may be helpful too (based on the searcher's intent).
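The [cat missing] example can be sketched as a simple query expansion step. The `RELATED` and `RESOURCES` tables below are hand-written assumptions that mirror the concepts named above; they only illustrate the principle and say nothing about how Google builds or applies such relationships.

```python
# Hand-written expansion tables (hypothetical; for illustration only).
RELATED = {
    "cat": ["pets"],
    "missing": ["lost"],
}
RESOURCES = {
    ("cat", "missing"): ["pet shelters", "animal control centers"],
}


def expand_query(query):
    """Return the original terms plus related concepts and helpful resources."""
    terms = query.lower().split()
    expanded = list(terms)
    # Add closely related terms for each word in the query.
    for term in terms:
        for related in RELATED.get(term, []):
            if related not in expanded:
                expanded.append(related)
    # Add resources that only make sense for the combination of terms.
    for trigger, resources in RESOURCES.items():
        if all(t in terms for t in trigger):
            expanded.extend(resources)
    return expanded


print(expand_query("cat missing"))
# ['cat', 'missing', 'pets', 'lost', 'pet shelters', 'animal control centers']
```

Note that the resources are tied to the whole query rather than to a single word: "pet shelters" is helpful for a missing cat, but not for every query that mentions a cat.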
Semantic search and Google
Google has a long history of trying to understand relevancy. Their initial way of doing that (based on external citations) was OK at the time (better than anything else in the search industry, at least), but it failed in the long run.
This method proved easy to manipulate, and Google has been struggling with fake signals ever since.
The advances in machine learning and natural language processing technologies have helped Google come up with a new solution, i.e. semantic search that includes topic modeling and semantic connectivity.
Google Hummingbird was announced in 2013, marking a new era of search where Google understands "things" (entities and concepts) and not "strings" (exact keyword sequences).
That's also the time Google really improved their Knowledge Graph (something they had been working on for years) to reflect a better "entity search structure" (a search that includes a name of a brand, location or an organization).
The launch was quite subtle, but years later we see this "things, not strings" concept any time we search. Search for something like [where was it filmed] and you'll see the exact query (i.e. the "string") nowhere in the above-the-fold area of the search results page.
Instead, Google knows "It" is a movie, and based on that knowledge the search engine even suggests questions that broaden the search beyond the initial query to help the user discover more movies in the same location (chances are, if you go to that place, you may be interested in more filming locations!).
In 2016, at SMX West, Google engineer Paul Haahr presented on how Google works, and his slide perfectly reflects the semantic analysis component.
Semantic search had been around long before Google Hummingbird. Back in 2003, Ramanathan Guha, the future "Google Fellow", creator of Google Custom Search and Google Schema project, co-authored a paper called Semantic Search.
The paper introduced a lot of concepts we would be talking about a lot in later SEO days, including navigational searches and "pages versus real-world objects", claiming that the semantic web is:
"Web of relations between resources denoting real world objects, i.e., objects such as people, places and events."
The paper also introduces an early version of Google's Knowledge Graph, way before we saw it in action.
Moreover, the document provides a glimpse into the actual semantic model any topic can be represented by.
What isn't semantic search?
Contrary to what many people think, semantic search is not about using synonyms in the page copy (which may nonetheless be useful for creating solid content... unless you do it deliberately to please Google, which seldom results in good copy).
Semantic SEO is not "Latent Semantic Indexing (LSI)" either, because that concept goes back to the late 1980s and is somewhat old-school now. All those tools that use math to calculate whether you are LSI-optimized will do more harm than good. It's not the way to write good content.
Machine learning technologies have long moved far beyond that concept, so let's just leave it behind.
How to optimize for semantic search?
Well, I am not a fan of the "semantic search optimization" term because you don't really "optimize for semantic search". You use "semantic analysis" to write better content. However, our industry tends to "optimize" for everything, so I'll just let it be.
When it comes to semantic analysis, your own outlook and common sense are your first resources, of course. Simply write down what you already know about the topic and what it entails. To expand and structure your knowledge, here are a few useful tools:
1. Google: Search Results, Knowledge Graph and Google Suggest
Google is your best friend when it comes to any type of research, including semantic research. It gives you multiple clues as to what it knows about a topic and what concepts and entities it applies to any search.
When searching Google for your target topic, look out for the following clues:
- Is there a Knowledge Graph included and, if so, how is it structured? The Knowledge Graph usually includes basic knowledge about an entity as well as related entities (similar businesses, actors, etc.).
- What terms appear in bold in search results? These are usually closely related concepts that can do as good a job answering the query as the query itself.
- What is included in the "People Also Ask" box? Try clicking those questions to expand more and more answers. This helps in understanding related questions and Google's model of the initial query.
- What shows up in Google Suggest results both before and after you perform the search? As a rule, prior to the search, Google will complete your query. Once you click "search", Google Suggest will try to help you expand your query beyond your string, which gives you a lot of clues as to how Google understands it.
2. Text Optimizer
Text Optimizer is a semantic analysis tool that uses Google's search snippets for your target query as the analysis source. The idea is that Google generates search snippets based on what it thinks serves the query best.
Hence, taking this context and using semantic analysis to cluster it into concepts and entities is a productive way to quickly understand the topic better. Here's what I got for the [Game of Thrones] query using Text Optimizer.
Clicking any term allows you to see its immediate context.
Use the tips and the clues to make your editorial decisions and expand your context based on the tool's suggestions.
3. Dandelion
Dandelion is an entity extraction tool that also helps you better understand how Google does this. Whenever you are doing your initial topic research, it is a good idea to copy-paste a comprehensive article on the topic (e.g. a Wikipedia page) and run the tool on it to extract and categorize its entities.
Here's how the tool structures the "Game of Thrones" article. Notice how entities are categorized into "people", "organizations", "places" and more.
Semantic search is only part of the puzzle. There are many more important elements to Google search and rankings, including personalization, external citations, trust signals, and more.
Furthermore, semantic search is not about pleasing a machine or including related terms to trick it into thinking your content is high-quality. Let's put these early-day SEO concepts of fake signals behind us, once and for all.
Semantic analysis should be your incentive to provide better, more comprehensive and diverse content which will provide value and deserve to rank higher.