Technical SEO

Crawl Budget

Crawl budget is the number of pages Google will crawl on your site in a given period. It's determined by crawl rate limit and crawl demand.

Key Takeaway

Crawl budget caps how many URLs Googlebot fetches from your site in a given period. On large sites, spending it on low-value URLs delays crawling of the pages that matter.

Why crawl budget matters for SaaS

Large sites with many pages can hit crawl budget limits, which means important pages might not get crawled and indexed. Understanding crawl budget helps you prioritize which pages need canonicals or consolidation.

How tracerHQ measures crawl budget

tracerHQ analyzes your crawl patterns and identifies pages that might be wasting crawl budget: thin content, near-duplicates, or low-value URLs that could be consolidated.

Crawl Budget in depth

Crawl budget is the number of URLs Googlebot will fetch from your site in a given time window. Two factors determine it: crawl rate limit (how fast Googlebot can fetch pages without degrading your server's responses) and crawl demand (how much Google wants to crawl your content, based on importance, freshness, and change frequency).

Crawl budget only matters for large sites (typically 10k+ URLs); small sites will generally be crawled in full regardless. On a large site, crawl budget wasted on low-value URLs (faceted navigation, session IDs, infinite scroll variants) means important pages get crawled less often and new content takes longer to appear in the index. The solution is usually a combination of canonicalization, robots.txt disallows, and noindex directives.
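As a rough illustration of how low-value URLs might be spotted, the sketch below labels a URL by its query parameters. The parameter names in FACET_PARAMS and SESSION_PARAMS are assumptions made for this example, not a standard list; a real site would substitute the parameters it actually emits.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical parameter names; replace with the ones your site really uses.
FACET_PARAMS = {"color", "size", "sort", "filter"}
SESSION_PARAMS = {"sessionid", "sid", "phpsessid"}


def classify_url(url: str) -> str:
    """Return a rough label for a URL: 'session', 'facet', or 'clean'."""
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    names = {name.lower() for name in query}
    if names & SESSION_PARAMS:
        return "session"  # session IDs create endless duplicate URLs
    if names & FACET_PARAMS:
        return "facet"    # filter and sort combinations of one listing
    return "clean"


if __name__ == "__main__":
    for url in (
        "https://example.com/shoes",
        "https://example.com/shoes?color=red&size=9",
        "https://example.com/shoes?sid=abc123",
    ):
        print(classify_url(url), url)
```

Running a crawl export or log sample through a function like this gives a first estimate of how much crawl activity goes to URLs that are candidates for canonicalization, a robots.txt disallow, or noindex.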

Examples in practice

An ecommerce site with 500k faceted URLs sees Googlebot spending 80% of its crawl budget on filter combinations that all canonicalize to the main category page. Blocking the filter URLs in robots.txt shifts crawl activity toward product pages.
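A robots.txt change like the one in the ecommerce example can be sanity-checked before it ships. The sketch below uses Python's standard urllib.robotparser; the /shop/filter/ path and the example.com URLs are assumptions for illustration. This parser matches rules as plain path prefixes, so the rule here avoids wildcard patterns.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule set: block every URL under the filter path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /shop/filter/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the rules above allow user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/shop/filter/color-red",
        "https://example.com/shop/product/123",
    ):
        print(is_crawlable(url), url)
```

Checking a handful of filter URLs and product URLs this way helps confirm that the disallow covers the intended pattern without blocking pages that should stay crawlable.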

A SaaS company notices new blog posts take 3 weeks to index. Log analysis shows Googlebot repeatedly crawling thousands of old paginated archive URLs, which consume most of the daily crawl budget.
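The kind of log analysis described here can be approximated with a short script. This is a minimal sketch that assumes access logs in the common "combined" format and groups Googlebot requests by the first path segment; the sample lines are invented. It matches on the user-agent string only, which can be spoofed, so a production check would also verify the requesting host.

```python
import re
from collections import Counter

# Matches the request, status, size, referrer and user-agent fields of a
# combined-format access log line (assumed format).
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) HTTP/[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_hits_by_section(lines):
    """Count Googlebot requests per top-level path section, e.g. '/blog'."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match or "Googlebot" not in match.group("agent"):
            continue
        path = match.group("path").split("?", 1)[0]
        segments = [segment for segment in path.split("/") if segment]
        section = "/" + segments[0] if segments else "/"
        counts[section] += 1
    return counts


if __name__ == "__main__":
    googlebot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    sample = [
        f'66.249.66.1 - - [10/Oct/2024:13:55:36 +0000] "GET /archive/page/42 HTTP/1.1" 200 5120 "-" "{googlebot}"',
        f'66.249.66.1 - - [10/Oct/2024:13:55:40 +0000] "GET /archive/page/43 HTTP/1.1" 200 5080 "-" "{googlebot}"',
        f'66.249.66.1 - - [10/Oct/2024:13:55:44 +0000] "GET /blog/new-post HTTP/1.1" 200 8300 "-" "{googlebot}"',
    ]
    for section, hits in googlebot_hits_by_section(sample).most_common():
        print(section, hits)
```

A section that dominates the counts while holding little valuable content, such as the paginated archive in this example, is a likely source of wasted crawl budget.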

An agency audits a site with 1M URLs in the sitemap and only 120k indexed; the fix is to remove 800k low-value tag pages from the sitemap and add noindex to them.

Common mistakes

Worrying about crawl budget on a site with fewer than 10,000 URLs.
Blocking pages in robots.txt when the goal is to remove them from the index. Googlebot never fetches a blocked page, so it cannot see a noindex directive on it.
Assuming every URL in the sitemap gets crawled equally; Google prioritizes by importance.
Not monitoring server response times; slow responses reduce the crawl rate limit and compound the problem.
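The robots.txt and noindex conflict from the list above can be detected mechanically. The sketch below is one possible check, using Python's standard urllib.robotparser; the /tag/ rule, the URLs, and the shape of the pages mapping (URL to whether the page carries noindex) are assumptions made for this example.

```python
from urllib.robotparser import RobotFileParser


def find_hidden_noindex(robots_lines, pages, user_agent="Googlebot"):
    """Return URLs that carry noindex but are blocked by robots.txt.

    pages maps each URL to True if the page has a noindex directive.
    A crawler that obeys robots.txt never fetches these URLs, so the
    noindex on them is never seen.
    """
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return [
        url
        for url, has_noindex in pages.items()
        if has_noindex and not parser.can_fetch(user_agent, url)
    ]


if __name__ == "__main__":
    robots_lines = ["User-agent: *", "Disallow: /tag/"]
    pages = {
        "https://example.com/tag/seo": True,     # noindex, but blocked
        "https://example.com/blog/post": True,   # noindex and crawlable
        "https://example.com/tag/misc": False,   # blocked, no noindex
    }
    print(find_hidden_noindex(robots_lines, pages))
```

Any URL this returns needs one of the two signals removed: lift the disallow so the noindex can be read, or drop the noindex and rely on the disallow to stop crawling.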

Related terms

Indexation, Canonical Tag, Sitemap

Track crawl budget in your dashboard

Connect Google Search Console and start seeing your metrics by keyword.
