Practical guide

How to Conduct a Technical SEO Audit: Step-by-Step Guide

The most expensive SEO mistake: auditing badly

Six months ago, a European retail site with 12,000 products commissioned an SEO audit. The report ran 47 pages. It flagged 312 errors. It included Screaming Frog screenshots, PageSpeed charts, and a colour-coded priority matrix. The team implemented all 312 fixes in the order they appeared, spending three months of development time on low-priority warnings while ignoring the single issue that actually mattered: 4,200 product pages with canonicals pointing to session-parameter URLs, creating massive duplicate content. Organic traffic kept falling.

This pattern repeats across businesses of every size. SEO audits do not fail because of insufficient tools or limited technical knowledge. They fail because of missing methodology. Running Screaming Frog and exporting a CSV of errors is not an audit; it is an inventory. The gap between the two is precisely what separates sites that improve after an audit from those that remain stuck.

A proper technical SEO audit follows a five-phase process where each phase feeds the next. The goal is not to find every possible error but to find the errors that matter, understand their business impact, and prioritise fixes based on a criterion that no tool can automate: judgement about what most affects the site’s organic performance.

This guide walks through that process phase by phase, with the specific tools you need at each stage, the data you should extract, and how to turn technical findings into an action plan your development team can execute in order of impact.

Why you need a technical SEO audit (and when to do one)

The most common reason for postponing an audit is that “the site works fine.” In many cases, that is true: pages load, Google indexes them, and organic traffic has not dropped dramatically. But a site’s technical health is not binary. It does not switch from “working” to “broken” overnight. It deteriorates progressively, silently, until an algorithm update or an accumulation of technical debt causes a decline that seems sudden but has been building for months.

The symptoms that justify an immediate technical audit are specific. If Google Search Console shows more than 15% of your URLs in “Discovered - currently not indexed” status, you have a crawl budget or perceived quality problem. If your field Core Web Vitals (CrUX data) show more than 25% of experiences rated “poor” for LCP or INP, you are losing positions to competitors who have optimised them. If after a CMS migration or redesign you have noticed a traffic drop greater than 10% that has not recovered within 4 weeks, there are migration errors that went undetected.

According to Ahrefs data published in 2025, 66% of websites have at least one technical issue preventing correct indexation of part of their content. The study analysed over 100,000 domains and found the most frequent problems were: pages with server response times above 1 second (42%), redirect chains with more than 2 hops (38%), and orphan pages with no internal link pointing to them (27%).

Beyond the numbers, an audit serves a function that no automated monitoring can replace: it provides a complete snapshot of the site’s technical state at a given point, with the relationships between problems made visible. An isolated canonical error is minor. That same canonical error multiplied across 4,000 product pages, combined with a sitemap including non-canonical URLs and internal linking pointing to the version without a trailing slash, is an architectural problem visible only when you cross-reference data from all five audit phases.

The recommended frequency is quarterly for sites with active publishing and semi-annually for more static sites. After any migration, the audit should happen in the first week, not when symptoms appear.

The minimum tools you need to audit your site

Before diving into the audit phases, a critical point: you do not need every tool on the market. You need the right tools for each phase and a clear understanding of exactly what data to extract from each one. Analysis paralysis from too many tools is as damaging as having too few.

The minimum viable stack for a complete technical audit has three layers. The first layer is free and covers 60% of the diagnosis: Google Search Console provides real indexation data, coverage reports, search performance, and field Core Web Vitals. There is no substitute for this data because it comes directly from Google. PageSpeed Insights complements it with page-by-page performance analysis, using both lab data (Lighthouse) and field data (Chrome User Experience Report).

The second layer is the crawler, and the de facto standard tool is Screaming Frog SEO Spider. Its free version crawls up to 500 URLs and is sufficient for small sites. The paid version (GBP 209/year) removes that limitation and adds critical features such as Google Analytics integration, custom extraction, and structured data validation. Alternatives like Sitebulb offer more visual reports but with less granular control.

The third layer is optional but valuable for complex sites: an analysis platform that integrates crawl data with ranking data. Semrush Site Audit, Ahrefs Site Audit, and Moz Pro offer this combination. Their advantage lies not in crawling itself (Screaming Frog is superior for manual auditing) but in continuous monitoring and automatic alerts between audits.

A common mistake is relying exclusively on continuous monitoring platforms and skipping the manual crawler. All-in-one platforms crawl at limited depth and frequency by design. For a real technical audit, a controlled crawl with Screaming Frog — configuring user-agent, speed, exclusion rules, and custom extraction — is irreplaceable.

For a detailed comparison of these tools, see our resource on SEO audit tools.

Phase 1: Complete site crawl

The crawl is the foundation on which the entire audit is built. Without a thorough, well-configured crawl, any subsequent analysis starts from incomplete data. The crawling phase is not simply “clicking Start in Screaming Frog”; it requires prior configuration that determines the quality of results.

Crawl configuration

Before starting, set the user-agent to Googlebot Desktop (to detect potential cloaking or conditional content issues), establish a crawl speed that will not overload the server (2-5 URLs per second for shared hosting, 10-20 for dedicated servers), and define exclusion rules for URLs that add no value to the analysis (UTM tracking parameters, staging URLs, static files).
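
The exclusion rules can be prototyped outside the crawler before you commit to them. The sketch below is only an illustration in Python, not Screaming Frog configuration: the `staging.` host prefix, the list of static extensions, and the function name `should_exclude` are assumptions to adapt to your own site.

```python
import re
from urllib.parse import urlsplit, parse_qsl

# Hypothetical list of static file extensions; extend it for your site.
STATIC_EXT = re.compile(r"\.(css|js|png|jpe?g|gif|svg|webp|woff2?|ico|pdf)$", re.IGNORECASE)

def should_exclude(url: str) -> bool:
    """Return True if the URL adds no value to the crawl analysis."""
    parts = urlsplit(url)
    # Assumption: staging lives on a "staging." subdomain.
    if parts.hostname and parts.hostname.startswith("staging."):
        return True
    # Static assets are identified by file extension.
    if STATIC_EXT.search(parts.path):
        return True
    # Any utm_* query parameter marks a tracking variant of a real page.
    query = parse_qsl(parts.query, keep_blank_values=True)
    return any(key.lower().startswith("utm_") for key, _ in query)
```

Running a sample of known URLs through a function like this is a cheap way to confirm that the rules do not accidentally exclude pages you want audited, such as paginated category URLs.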

Data to extract from the crawl

The crawl should produce a complete URL inventory with HTTP response codes, title and meta description, relevant HTTP headers (canonical, hreflang, X-Robots-Tag), crawl depth (clicks from the homepage), HTML size, and server response time (TTFB). This raw data is the material you will work with in subsequent phases.

Problem detection in this phase

The crawl directly reveals 4xx and 5xx errors, redirects (301, 302) and redirect chains, pages with slow responses (TTFB above 500 ms), duplicate URLs with identical or near-identical content, and the actual depth of the site (how many clicks separate the deepest pages from the homepage). According to Moz, 90% of crawlability issues are detected in this first phase when the crawl is properly configured.
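
Once the crawl is exported, these checks are simple to script. The following is a minimal sketch, assuming the export has been loaded into a dictionary keyed by URL; the field names `status`, `redirect_to`, and `ttfb_ms` are hypothetical and will differ from the column names in your crawler's export.

```python
def redirect_chain(url, crawl, max_hops=10):
    """Follow 3xx responses starting at url; return the list of URLs visited."""
    chain = [url]
    while len(chain) <= max_hops:
        row = crawl.get(chain[-1])
        if row is None or not 300 <= row["status"] < 400 or not row.get("redirect_to"):
            break
        target = row["redirect_to"]
        chain.append(target)
        if target in chain[:-1]:  # redirect loop: stop after recording it
            break
    return chain

def flag_problems(crawl, ttfb_limit_ms=500):
    """Return {url: [flags]} for errors, redirect chains, and slow responses."""
    issues = {}
    for url, row in crawl.items():
        flags = []
        if row["status"] >= 400:
            flags.append("client_error" if row["status"] < 500 else "server_error")
        hops = len(redirect_chain(url, crawl)) - 1
        if hops >= 2:  # a single redirect is not a chain
            flags.append(f"redirect_chain_{hops}_hops")
        if row.get("ttfb_ms", 0) > ttfb_limit_ms:
            flags.append("slow_ttfb")
        if flags:
            issues[url] = flags
    return issues
```

The 500 ms limit mirrors the threshold mentioned above; it is a parameter so you can tighten it for sites that already respond quickly.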

Cross-referencing with server logs

If you have access to server logs, cross-reference the URLs crawled by Screaming Frog with the URLs Googlebot has actually visited in the past 90 days. This comparison reveals two types of problems: pages that Screaming Frog finds but Googlebot does not visit (possible prioritisation issues) and pages that Googlebot visits repeatedly despite not appearing in the main navigation (possible crawl traps from parameterised URLs).
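
As a rough illustration, the comparison comes down to two set operations. The sketch assumes you have already filtered the logs down to verified Googlebot requests and extracted the requested URLs into a list with one entry per hit; the `min_hits` threshold of 5 is an arbitrary assumption.

```python
from collections import Counter

def compare_crawl_with_logs(crawled_urls, googlebot_hits, min_hits=5):
    """Return (crawled but never visited, visited often but not crawled)."""
    crawled = set(crawled_urls)
    hits = Counter(googlebot_hits)
    # Pages your crawler reaches that Googlebot has not requested at all.
    never_visited = sorted(crawled - set(hits))
    # URLs Googlebot requests repeatedly that your crawl never discovered.
    heavy_uncrawled = sorted(
        url for url, count in hits.items()
        if count >= min_hits and url not in crawled
    )
    return never_visited, heavy_uncrawled
```

The second list is where parameterised crawl traps usually surface: many near-identical URLs, each with a high hit count.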

The output of this phase should be a structured file (Excel or CSV) with all URLs, their response codes, and key metrics. This file becomes the database for the entire audit and will be referenced in every subsequent phase.

Phase 2: Indexation and coverage analysis

With the crawl complete, the second phase cross-references crawl data with Google Search Console data to understand what Google actually sees of your site. The gap between what your crawler finds and what Google has indexed is frequently where the most serious problems hide.

Google Search Console data

The Coverage report (now called “Pages”, or Page indexing, in the updated interface) groups your URLs into two top-level states, indexed and not indexed, and lists a specific reason for every URL that is not indexed. (The legacy Coverage report used four states: error, valid with warnings, valid, and excluded.) The most relevant reasons for the audit are “Discovered - currently not indexed” (Google knows the URL but has not crawled it), “Crawled - currently not indexed” (Google crawled it but decided not to index it), and “Duplicate without user-selected canonical” (Google detected duplicates and chose its own canonical).

Data cross-reference

Compare the total URLs crawled by Screaming Frog with the total indexed URLs according to GSC. If there is a difference greater than 20%, investigate the URLs present in your crawl but absent from the index. Filter by type: are they product pages? Category pages? Blog posts? The pattern will indicate whether the problem is crawl budget, content quality, or contradictory technical signals.
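
A minimal sketch of that cross-reference follows. It measures the gap as the share of crawled URLs missing from the index and groups the missing URLs by page type. Classifying by the first path segment (`by_first_segment`) is an assumption that only works if your URL structure encodes the page type; substitute your own classifier otherwise.

```python
from collections import Counter

def by_first_segment(url_path):
    """Hypothetical classifier: use the first path segment as the page type."""
    segments = [s for s in url_path.split("/") if s]
    return segments[0] if segments else "home"

def index_gap(crawled, indexed, classify=by_first_segment):
    """Return (percentage of crawled URLs not indexed, counts per page type)."""
    crawled, indexed = set(crawled), set(indexed)
    missing = crawled - indexed
    gap_pct = 100 * len(missing) / len(crawled) if crawled else 0.0
    return gap_pct, Counter(classify(url) for url in missing)
```

If the gap exceeds the 20% mark and one page type dominates the counts, that type is where the investigation should start.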

Canonicals and duplicates

This phase is where canonical problems become visible. Export from Screaming Frog all URLs with their declared canonical and compare: does the canonical URL match the actual URL? Are there pages whose canonical points to a parameterised URL? Are there canonicals pointing to pages that return 404? Each of these discrepancies creates confusion for Googlebot and dilutes the authority of the correct page.
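
These canonical questions translate directly into checks on the crawl export. The sketch below assumes a dictionary keyed by URL with hypothetical `status` and `canonical` fields; the issue labels are illustrative, not a standard taxonomy. A canonical that differs from the page URL is not always wrong, so treat that label as a prompt to review rather than an error.

```python
from urllib.parse import urlsplit

def canonical_issues(pages):
    """pages maps URL -> {"status": int, "canonical": str or None}."""
    issues = []
    for url, row in pages.items():
        if row["status"] != 200:
            continue  # only indexable pages are checked
        canonical = row.get("canonical")
        if not canonical:
            issues.append((url, "missing_canonical"))
            continue
        if urlsplit(canonical).query:
            issues.append((url, "canonical_has_parameters"))
        target = pages.get(canonical)
        if target is not None and target["status"] >= 400:
            issues.append((url, "canonical_target_error"))
        elif canonical != url:
            issues.append((url, "canonicalised_to_other_url"))
    return issues
```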

Sitemaps vs. reality

Download your XML sitemap and compare its URLs with those from the crawl and Google’s index. A well-maintained sitemap should only include URLs that return 200, have self-referencing canonicals, and that you want Google to index. The most common discrepancies are: URLs in the sitemap returning 301 (should be updated to the final destination), URLs in the sitemap with canonicals pointing to another page (contradictory signal), and indexed URLs not in the sitemap (possible orphan pages that Google found through other means).
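
The sitemap comparison can be sketched with the Python standard library. This example assumes a single `urlset` sitemap (not a sitemap index) already downloaded as text, and a crawl dictionary with hypothetical `status` and `canonical` fields. It treats a missing canonical as acceptable, which is a lenient assumption you may want to tighten.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def sitemap_urls(xml_text):
    """Extract every <loc> value from a sitemap document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.findall(".//sm:loc", SITEMAP_NS) if loc.text]

def audit_sitemap(xml_text, crawl):
    """Return (url, problem) pairs for sitemap entries that contradict the crawl."""
    problems = []
    for url in sitemap_urls(xml_text):
        row = crawl.get(url)
        if row is None:
            problems.append((url, "not_in_crawl"))
        elif row["status"] != 200:
            problems.append((url, f"status_{row['status']}"))
        elif row.get("canonical") not in (None, url):
            problems.append((url, "canonical_points_elsewhere"))
    return problems
```

Entries flagged `not_in_crawl` deserve a second look: they are either excluded by your crawl rules or unreachable through internal links.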

The output of this phase is a clear diagnosis of the gap between the site you have and the site Google knows, with causes identified for each discrepancy.

Phase 3: Core Web Vitals and performance evaluation

The third phase focuses on how users (and Google) experience the site’s speed and stability. Core Web Vitals have been confirmed ranking signals since 2021, and data shows their impact extends beyond positioning: according to Web.dev, improving LCP from 4 seconds to under 2.5 seconds can increase conversions by 15-25%.

Field data vs. lab data

This distinction is fundamental, and many audits overlook it. Field data (Chrome User Experience Report, accessible via GSC and PageSpeed Insights) reflects the real experience of users with real connections and devices. Lab data (Lighthouse) consists of simulations under controlled conditions. Field data is what Google uses for ranking, so it sets the audit's targets; lab data is useful for diagnosing specific causes.

LCP (Largest Contentful Paint)

Analyse the pages with the worst LCP in field data. The most common causes of slow LCP are: unoptimised hero images (format, size, incorrect lazy loading of the LCP element), render-blocking web fonts (missing font-display: swap or preload), render-blocking CSS and JavaScript (large files without code splitting), and slow server response time (high TTFB from uncached database queries).

INP (Interaction to Next Paint)

INP replaced FID in March 2024 and measures the latency of all user interactions, not just the first one. INP problems typically originate from heavy JavaScript blocking the main thread: event handlers with complex logic, client-side framework hydration, and third-party scripts (analytics, chat widgets, advertising) competing for the main thread.

CLS (Cumulative Layout Shift)

CLS measures how much visible content shifts while the page loads. The most frequent causes are: images and videos without explicit dimensions (width/height in HTML), dynamically inserted ads or banners that push content down, web fonts causing a reflow when replacing the system font, and JavaScript-injected content above the viewport.

Prioritisation by impact

Not all pages deserve the same level of optimisation. Prioritise pages with the most organic traffic, most conversions, or greatest ranking potential. A product page with 10,000 monthly visits and a 5-second LCP is more urgent than a privacy policy page with a 4-second LCP.
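
One way to make this prioritisation repeatable is to rate each page against Google's published Core Web Vitals thresholds (LCP good up to 2.5 s and poor above 4 s; INP good up to 200 ms and poor above 500 ms; CLS good up to 0.1 and poor above 0.25) and then sort the flagged pages by traffic. The sketch below assumes hypothetical field names (`visits`, `lcp_s`, `inp_ms`, `cls`) and uses monthly organic visits as the only ordering signal; conversions or ranking potential could be added to the sort key.

```python
# (good, poor) boundaries per metric, as published by Google.
THRESHOLDS = {"lcp_s": (2.5, 4.0), "inp_ms": (200, 500), "cls": (0.1, 0.25)}

def rate(metric, value):
    """Classify a field value as good, needs improvement, or poor."""
    good, poor = THRESHOLDS[metric]
    if value <= good:
        return "good"
    return "poor" if value > poor else "needs improvement"

def prioritise(pages):
    """Keep pages with at least one non-good metric, highest traffic first."""
    flagged = []
    for page in pages:
        ratings = {metric: rate(metric, page[metric]) for metric in THRESHOLDS}
        if any(r != "good" for r in ratings.values()):
            flagged.append({**page, "ratings": ratings})
    return sorted(flagged, key=lambda p: p["visits"], reverse=True)
```

With the two pages from the example above, the product page (5-second LCP, rated poor) comes first and the privacy policy page (4-second LCP, rated needs improvement) follows.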

Phase 4: Architecture and internal linking

The fourth phase evaluates the site’s structure as an interconnected system. While previous phases analyse individual pages or groups of pages, this phase focuses on the relationships between them: how internal linking flows, how authority distributes, and whether the URL structure is consistent.

Crawl depth

Using the crawl data from Phase 1, analyse the depth distribution of your pages. The general rule is that no strategic page should be more than 3 clicks from the homepage. In practice, e-commerce sites with multiple category levels often have products at 5-6 clicks of depth, which significantly reduces the likelihood that Googlebot will crawl them frequently and that they will accumulate internal PageRank.
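
Crawl depth is the shortest click path from the homepage, which a breadth-first search over the internal link graph computes directly. The sketch below assumes the links have been exported as a dictionary mapping each page to the pages it links to; that structure and the function name are illustrative.

```python
from collections import deque

def click_depths(links, home):
    """Return {url: minimum number of clicks from home} via breadth-first search."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest path
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths
```

Filtering the result for strategic pages with a depth above 3 gives the list of URLs that need a shorter path, for example through category hubs or links from high-traffic content. Pages missing from the result are unreachable through internal links.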

Orphan pages

Identify pages that receive no internal links. These pages can only be discovered by Google through the sitemap or external links, making them vulnerable to indexation problems. Screaming Frog allows filtering pages with zero internal inlinks. Cross-reference this list with GSC performance data: if an orphan page has impressions or clicks, it is being found through other means but would be more effective with internal linking.
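
The same check can be reproduced from raw data when you want to combine several URL sources. This is a minimal sketch under stated assumptions: `known_urls` is the union of sitemap, GSC, and crawl URLs, `links` maps each page to its outgoing internal links, and `gsc_clicks` is a hypothetical dictionary of clicks per URL exported from GSC.

```python
def orphan_pages(known_urls, links, home, gsc_clicks=None):
    """Return URLs with no internal inlinks, those with the most clicks first."""
    gsc_clicks = gsc_clicks or {}
    # Every URL that is the target of a link from a different page.
    linked = {t for src, targets in links.items() for t in targets if t != src}
    orphans = [u for u in set(known_urls) if u not in linked and u != home]
    return sorted(orphans, key=lambda u: (-gsc_clicks.get(u, 0), u))
```

Sorting by clicks puts the orphan pages that already earn traffic at the top, which is where adding internal links is most likely to pay off.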

Internal link distribution

Analyse which pages receive the most internal links and whether that distribution aligns with your business priorities. It is common to find that legal pages (terms of service, privacy policy) receive more internal links than service or product pages because they appear in the footer across the entire site. This is not necessarily a problem, but it does indicate that strategic pages need additional internal linking from relevant content.

URL consistency

This verification, which Google’s John Mueller described as “the biggest factor in technical SEO,” requires checking that the same URL appears identically in four locations: internal links, declared canonical, XML sitemap, and structured data. Export from Screaming Frog the list of URLs with their canonical and compare with sitemap URLs. The most frequent discrepancies are: inconsistent trailing slash (with and without /), inconsistent protocol (http vs https in residual internal links), and parameters added by the CMS or analytics tools.
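
A simple way to surface these discrepancies is to reduce every URL to a normalised key and report the keys that appear in more than one written form. The sketch below assumes the four locations have been exported as lists of URLs. Note that the `normalise` function drops the protocol, the trailing slash, and the whole query string, which is an assumption: on sites where parameters select genuinely different content (such as pagination), it will merge pages that should stay separate.

```python
from collections import defaultdict
from urllib.parse import urlsplit

def normalise(url):
    """Reduce a URL to host + path, ignoring protocol, trailing slash, and query."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    path = parts.path.rstrip("/") or "/"
    return f"{host}{path}"

def inconsistent_urls(sources):
    """sources maps a location name to the URLs found there.

    Returns {normalised key: {written form: set of locations}} for every
    page that appears in more than one written form.
    """
    variants = defaultdict(dict)
    for location, urls in sources.items():
        for url in urls:
            variants[normalise(url)].setdefault(url, set()).add(location)
    return {key: forms for key, forms in variants.items() if len(forms) > 1}
```

Each entry in the result shows which location uses which form, so the fix can be assigned to the right template, sitemap generator, or CMS setting.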

URL structure

Evaluate whether URLs follow a consistent convention, are descriptive, and do not contain unnecessary parameters. SEO-optimal URLs are short, contain content keywords, use hyphens as separators, and maintain a logical hierarchy reflecting the site architecture.

How to produce a prioritised final report

The final phase is where the audit becomes an actionable document. A report listing 300 errors without prioritisation is useless. A report identifying the 15 issues that most impact organic traffic and ordering them by effort-to-impact ratio is what enables a development team to act effectively.

Prioritisation matrix

Each audit finding should be classified on two axes: SEO impact (high, medium, low) and implementation effort (high, medium, low). High-impact, low-effort problems are implemented first. High-impact, high-effort problems are scheduled for the next development sprint. Low-impact problems, regardless of effort, are documented but not prioritised.
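
The matrix can be expressed as a small sorting rule. This is a sketch, not a fixed scoring model: the field names are hypothetical, and the "backlog" bucket for combinations the rules above do not mention (such as medium impact) is an assumption.

```python
LEVEL = {"high": 3, "medium": 2, "low": 1}

def bucket(finding):
    """Map an (impact, effort) pair to the action described in the matrix."""
    impact, effort = finding["impact"], finding["effort"]
    if impact == "high" and effort == "low":
        return "implement first"
    if impact == "high" and effort == "high":
        return "next sprint"
    if impact == "low":
        return "document only"
    return "backlog"  # assumption: combinations the matrix does not cover

def prioritise_findings(findings):
    """Order findings by impact (highest first), then by effort (lowest first)."""
    return sorted(findings, key=lambda f: (-LEVEL[f["impact"]], LEVEL[f["effort"]]))
```

The ratings themselves still come from judgement; the code only guarantees that the same ratings always produce the same order.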

Impact criteria

Impact is not measured by the theoretical severity of the error but by its actual effect on traffic and conversions. A canonical error affecting 4,000 product pages with active organic traffic has more impact than 200 broken links on URLs that never received traffic. To determine impact, cross-reference technical findings with GSC and Google Analytics performance data: do the affected pages generate traffic? Do they have ranking potential for relevant keywords?

Report structure

An effective audit report has five sections. The executive summary (1 page) with the 3-5 most critical findings and their estimated traffic impact. The detailed diagnosis organised by audit phase, with screenshots and data supporting each finding. The prioritisation matrix with all errors classified. The action plan with specific tasks, the person responsible, and the estimated timeline. And the tracking KPIs: the metrics that will be measured 30, 60, and 90 days after implementing corrections.

Communicating to non-technical teams

If the report is aimed at leadership or marketing, translate technical findings into business impact. “4,200 pages with incorrect canonicals” does not communicate urgency. “4,200 product pages competing against each other in Google, diluting organic traffic for the category generating 40% of revenue” does. According to Moz, audits that translate findings into business impact are 3x more likely to have their recommendations implemented.

Post-audit follow-up

The audit does not end with the report. Establish a verification calendar to confirm that implemented fixes have had the expected effect. After canonical corrections, verify in GSC that duplicate URLs are progressively reducing. After performance improvements, check in CrUX that Core Web Vitals are improving in field data (not just in lab data). After architecture changes, monitor the depth distribution in the next complete crawl.

The difference between an audit that produces results and one that remains in an archived PDF is not in the sophistication of the analysis. It is in the clarity of the action plan and in the disciplined follow-up of its implementation. A mediocre audit well executed always outperforms a brilliant audit that nobody implements.

How do you perform a technical SEO audit?

A technical SEO audit follows 5 phases: complete site crawl using tools like Screaming Frog, indexation analysis in Google Search Console, Core Web Vitals evaluation, architecture and internal linking review, and generation of a prioritised report with errors ranked by business impact.

Sources and references

  1. Google Search Console Help (support.google.com)
  2. Screaming Frog SEO Spider (screamingfrog.co.uk)
  3. Web Vitals (web.dev)

FAQ about how to do a technical SEO audit

How long does a full SEO audit take?

A complete technical SEO audit for a mid-sized site (500-5,000 URLs) requires between 15 and 24 hours of work spread over 1-2 weeks. The initial crawl takes 2-4 hours, indexation and coverage analysis 3-5 hours, performance evaluation 2-3 hours, architecture review 4-6 hours, and writing the prioritised report 4-6 hours. Larger sites may need 40+ hours.

Can I do an SEO audit myself or do I need a professional?

You can perform a basic audit with free tools like Google Search Console, PageSpeed Insights, and the free version of Screaming Frog (limited to 500 URLs). However, for sites with over 5,000 pages, JavaScript rendering issues, multilingual architecture, or recent migrations, a professional will provide diagnostics that automated tools cannot, especially in prioritisation and action planning.

How often should I run an SEO audit?

The recommended frequency depends on the size and pace of the site. Sites with daily publishing or frequent changes need quarterly audits. Static sites with few updates can work with semi-annual audits. After any migration, redesign, or CMS change, an immediate audit is mandatory regardless of the regular schedule.

What are the minimum tools needed for an SEO audit?

The minimum viable stack includes three free tools: Google Search Console for indexation and coverage data, PageSpeed Insights for performance metrics and Core Web Vitals, and Screaming Frog (free version up to 500 URLs) for crawling and technical error detection. With these three tools you can cover the fundamental phases of an audit. For larger sites, you will need the paid version of Screaming Frog or alternatives like Sitebulb.