Google Explains the Mysterious ‘Crawl Budget’

Most SEOs and many digital marketing professionals will have come across the term ‘crawl budget’. If you have, you’ll know it is a poorly understood label.

In the latest episode of its SEO Mythbusting video series, Google explains what ‘crawl budget’ actually means, and it’s very insightful.

The video is linked below, but in this blog, I’ll break down my key takeaways and timestamps to save you some time.

Google’s Martin Splitt answers over a dozen questions that are frequently asked regarding ‘crawl budget’…

1 – What is crawl budget and demand? (1:15)

Crawl budget is the balance between crawling as much of a site’s content as quickly as possible and not overwhelming its servers.

One side of the budget is the crawl rate: the number of requests Googlebot can make at the same time without overwhelming the server.

The other side is crawl demand, which refers to how often Google wants to crawl a site based on its subject matter.

For example, a breaking news site will likely have a higher crawl demand compared to a recipe site.

2 – How does Googlebot make its decisions? (2:44)

Google determines how often to crawl a page based on how frequently its content changes. Content that rarely changes will not be crawled as often as content that changes frequently.

3 – How often content is crawled (3:43)

Google uses signals such as ETags, other HTTP headers, and last-modified dates to determine how often content should be crawled.

An ETag is a caching header that contains a fingerprint of the content to detect changes over time.
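
To make the fingerprint idea concrete, here is a minimal sketch in Python of how a server might derive an ETag from a page's content and answer a repeat request. The function names `make_etag` and `respond` are hypothetical, the choice of SHA-256 is an assumption, and the comparison is simplified to a single exact match; this is not a description of how Googlebot or any particular server implements it.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Return a quoted hex digest of the body to use as an ETag value.

    Assumption: SHA-256 is used here for illustration; servers are free
    to build the fingerprint however they like.
    """
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def respond(
    body: bytes, if_none_match: Optional[str] = None
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a GET request.

    If the client sends back the ETag it saw last time (If-None-Match)
    and the content has not changed, reply 304 with an empty payload.
    Simplification: only a single, exact ETag match is handled.
    """
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match == etag:
        return 304, headers, b""  # unchanged: nothing to re-download
    return 200, headers, body  # new or changed content
```

A crawler that stored the ETag from its first visit can send it back on the next one. If the page is unchanged, the server replies `304 Not Modified` and the content does not need to be fetched again.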

4 – Servers and crawl budgets (5:00)

Crawl budget is frequently cited as a problem for site owners when the underlying issue is usually server setup or quality of content.

5 – Quality of content (6:18)

Being crawled more frequently is not an indication that content is high quality. Likewise, being crawled less frequently is not an indication that content is low quality.

6 – Getting your site crawled accurately during a site migration (8:18)

Martin Splitt recommends progressively updating your sitemap, noting what has changed and when. Beyond that, try to ensure both servers are running as smoothly as possible.
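
As a rough illustration of noting what has changed and when, the sketch below builds a sitemap in Python where each URL carries a `<lastmod>` date. The helper name `build_sitemap` and the example URLs and dates are made up for illustration; the `urlset`, `url`, `loc` and `lastmod` elements come from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (url, last_modified) pairs.

    `last_modified` is expected as a 'YYYY-MM-DD' string. Regenerating
    the file as pages move lets each entry record when it changed.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages moved during a migration, with the date each changed.
migrated = [
    ("https://example.com/products/", "2024-01-15"),
    ("https://example.com/blog/", "2024-01-22"),
]
sitemap_xml = build_sitemap(migrated)
```

Re-running this as each batch of pages moves keeps the dates in the sitemap in step with the migration.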

7 – Caching of resources (11:46)

Google caches resources as aggressively as possible to avoid having to recrawl them every time.

8 – Crawl budget and specific industries (13:34)

E-commerce sites and large publishers should be most concerned about crawl budget.

9 – Can you get Googlebot to crawl your site more often? (17:40)

No, this cannot be done. Site owners can limit how often Googlebot crawls, but there’s no way to trigger Googlebot to crawl more frequently.

See the full video here:
