Google reveals what factors influence your site's 'crawl budget' and what you can do to boost it
Importance: [rating=4] For webmasters managing large sites (1000+ URLs)
Recommended Source: Google Webmasters blog
Your site has to be crawled by Google if it's going to rank for anything: that's SEO 101. If areas of your site are not being crawled, they won't appear in any SERPs, which makes them worthless from an SEO perspective.
If you have a small site with a simple URL structure, you don't need to worry about crawl budget at all. But if your site has several thousand URLs, or if it auto-generates pages based on URL parameters, then you may want to look at how you can increase your crawl budget. This post outlines what crawl budget is and how you can improve it, which Google has just explained for the first time.
What is Crawl Budget?
Google has defined crawl budget as 'the number of URLs Googlebot can and wants to crawl'. If you don't know much about crawling and Googlebot, this definition isn't very helpful! So let's break it down into 'can' and 'want'. The 'can' dimension of crawl budget is known as the 'crawl rate limit', and it's fairly self-explanatory.
Crawl rate limit
Essentially, the crawl rate limit is how fast Googlebot will crawl your pages, and it is determined by how quickly your site responds while being crawled. If the site is slow or returns server errors, Googlebot crawls less. If the site responds quickly, Googlebot will use more connections to crawl it, and pages will be crawled faster. So to boost your crawl rate limit, you simply need to make sure your site responds quickly. This is the 'can' element of Google's definition of crawl budget.
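Google hasn't published a formula for the crawl rate limit, but the relationship between response time and crawl speed can be sketched with some illustrative arithmetic (the function and numbers below are assumptions for illustration, not anything Google has stated):

```python
def crawl_throughput(connections, avg_response_seconds):
    """Rough estimate of pages crawled per second: the number of parallel
    connections a crawler opens divided by the average time each request
    takes to complete. Purely illustrative -- not Google's actual model."""
    return connections / avg_response_seconds

# Ten parallel connections against pages that respond in 0.2 seconds work
# through roughly 50 pages per second; the same ten connections against
# 2-second responses manage only about 5 pages per second.
fast_site = crawl_throughput(10, 0.2)
slow_site = crawl_throughput(10, 2.0)
```

The point of the sketch is simply that halving your response time roughly doubles how many pages Googlebot can get through in the same crawl window, and slow responses cut crawling in the same proportion.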
Crawl demand
Crawl demand is how much Google 'wants' to crawl your pages. Google has more incentive to keep popular pages that get lots of visitors up to date in the SERPs, so Googlebot will prioritise URLs that are getting lots of traffic. Google also doesn't want any pages to be wildly out of date, so it has announced 'staleness' as a factor in crawl demand: pages that have not been crawled in a long time will be prioritised for crawling.
Other factors affecting crawl budget
Google has also identified a series of other factors that will affect your site's crawl budget. It has said that having many 'low value' URLs will slow down crawling and indexing, and it has identified the following types of URL as 'low value':
- Faceted navigation and session identifiers
- On-site duplicate content
- Soft error pages
- Hacked pages
- Infinite spaces and proxies
- Low quality and spam content
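To see why faceted navigation and session identifiers eat into crawl budget, consider how many distinct URLs can point at the same page. Here is a minimal sketch of collapsing such duplicates by stripping low-value query parameters (the parameter names and URLs are made up for illustration, and Google's own parameter handling is more sophisticated than this):

```python
from urllib.parse import urlparse, parse_qsl, urlencode, urlunparse

# Parameters that typically create duplicate or low-value URLs; these
# names are illustrative -- adjust the set for your own site.
LOW_VALUE_PARAMS = {"sessionid", "sort", "color", "size", "utm_source"}

def canonicalize(url):
    """Strip low-value query parameters so duplicate URLs collapse together."""
    parts = urlparse(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query)
            if k.lower() not in LOW_VALUE_PARAMS]
    return urlunparse(parts._replace(query=urlencode(kept)))

urls = [
    "https://shop.example.com/shoes?color=red&sessionid=abc123",
    "https://shop.example.com/shoes?sessionid=xyz789&color=blue",
    "https://shop.example.com/shoes",
]
# All three collapse to the same canonical URL, so a crawler visiting each
# variant spends three crawls on what is effectively one page.
canonical = {canonicalize(u) for u in urls}
print(len(canonical))  # 1
```

Running a check like this over your server logs or a site crawl can show how much of Googlebot's time is being spent on parameter variants of the same content.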
Will boosting my crawl budget make me rank higher?
Crawl rate doesn't affect rankings, so provided all your pages are currently being crawled without issue, a boost in your crawl budget won't change your position in SERPs. However, if parts of your site aren't being crawled regularly because of crawl budget issues, those parts won't appear in SERPs at all. By raising your crawl budget through fixing or removing the 'low value' pages identified above, more of your content will appear in SERPs, and you should see your traffic increase.