
Site Architecture for SEO: Structure That Ranks | Ighenatt



Elu Gonzalez

Author

What Site Architecture Is and Why Google Needs It

Imagine your website is an apartment building. Googlebot is the postman who has to deliver mail to each floor. If the building has no lift, the signage is incomplete and some corridors do not connect to others, the postman will waste a lot of time — or simply never reach some doors. Site architecture is exactly that: the building plan that determines how Google moves around your site.

The problem with many sites is that nobody thought about the plan before building. Pages, sections and categories were added without any overarching logic, until the resulting structure became a maze.

Adrien Menard, CEO of Botify, quantified this in 2019 with a figure that still circulates in professional technical SEO debates: more than 50% of pages on enterprise sites are simply not crawled by search engines. The main cause is not speed or content — it is architecture.

Site architecture comprises three concrete things:

  • How pages are organised into hierarchies (homepage → categories → subcategories → detail pages)
  • How they link to each other (which pages point to which others)
  • How those decisions are reflected in the URL structure

These three variables determine two things Google cares about greatly: how many pages it can crawl on each visit (the crawl budget) and how much authority those pages distribute to each other through internal linking.


Flat vs. Deep Structure: When to Use Each

A site’s depth is measured in clicks from the homepage. A page 1 click away is in the main menu. Two clicks away, in a category. Five clicks away, buried in a subcategory of a subcategory of a category.

There are two extreme models and many intermediate points between them.

Flat Architecture

All pages are a few clicks from the homepage. For a Barcelona restaurant with 15 pages (menu, reservations, history, team, contact…), flat architecture is natural: a main menu with 6-7 sections, each pointing directly to the corresponding page.

Concrete advantages:

  • Googlebot reaches all pages in a few visits
  • Homepage authority flows directly to strategic pages
  • Users also navigate without friction

Deep Architecture

The more pages a site has, the harder it becomes to keep them all a few clicks away. An online store with 3,000 products needs categories, subcategories and filters that inevitably add depth. The risk here is that important products end up 5-6 clicks from the homepage and Google crawls them infrequently.

Google Search Central’s official documentation is clear: important pages must be accessible within a few clicks of the homepage. It does not specify exactly how many, but the industry’s practical recommendation is not to exceed 3-4 clicks for any page you want Google to crawl actively.

For a fashion ecommerce with 500 active products, the optimal structure would be:

Homepage (0 clicks)
  → /womens-clothing/ (1 click)
    → /womens-clothing/dresses/ (2 clicks)
      → /womens-clothing/dresses/blue-linen-dress/ (3 clicks)

Four levels for any product. Manageable. If the same store adds size, colour and occasion filters as indexable URLs, it can end up with pages 6-7 clicks away that Google never sees.

The Practical Rule

For sites with fewer than 1,000 pages: flat architecture, maximum 3-4 clicks for any page. For sites with more than 10,000 pages (large ecommerce, content portals, marketplaces): accept the depth but ensure main categories are 1-2 clicks away and products or articles are 3-4. Beyond 4 clicks, each page added has diminishing returns in crawl terms.


Designing SEO-Friendly URLs: Principles and Mistakes

Here comes the counterintuitive point many SEOs do not expect: the folder structure in the URL matters considerably less than most people think.

John Mueller, Search Advocate at Google, explained this precisely in a March 2022 Webmaster Hangout:

“For us, we don’t really care so much about the folder structure, we focus essentially on the internal linking. It’s really like from the homepage or from the main page — how quickly can we get to that specific page?”

This has a direct implication: the URL /seo-audit/ does not rank worse than /services/seo/technical/audit/ for being “shorter” or having “less hierarchy.” What determines a page’s importance to Google is how many clicks separate it from the homepage via internal links, not how many directories the URL has.

What does matter in URLs:

Readability and consistency. URLs should describe the content they contain, using words (not IDs or parameters) and hyphen separators (-), not underscores. /what-is-seo/ is better than /what_is_seo/ or /p?id=347.

No unnecessary parameters. Search filters, user sessions and campaign parameters added directly to the URL create duplicate versions of the same content. Google can end up crawling /products/?colour=blue&size=M&sort=price as an independent URL, multiplying the number of pages it has to visit without any of them adding value.
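One way to see the deduplication logic in practice is a small URL-cleaning sketch. This is a hypothetical illustration (the allow-list and URLs are invented, not from any specific platform): parameters that do not change the page's content are dropped, so all the parameterised variants collapse to a single crawlable URL.

```python
from urllib.parse import urlsplit, parse_qsl, urlencode, urlunsplit

# Hypothetical allow-list: only parameters that actually change the
# page's content survive; filters, sort orders and tracking are dropped.
CONTENT_PARAMS = {"page"}

def clean_url(url):
    """Collapse parameterised duplicates onto one canonical-style URL."""
    scheme, netloc, path, query, _fragment = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(query) if k in CONTENT_PARAMS]
    return urlunsplit((scheme, netloc, path, urlencode(kept), ""))

print(clean_url("https://example.com/products/?colour=blue&size=M&sort=price"))
# https://example.com/products/
```

The same idea applies whether the cleaning happens in the CMS, at the edge, or via canonical tags: decide once which parameters are content-bearing, and treat everything else as noise.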

Canonical tags and deduplication. When the same page is accessible from multiple URLs (with and without www, with session parameters), you need to establish the canonical URL to prevent Google from distributing the crawl budget across duplicate versions of the same content.
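The mechanism itself is a single line in the page's head. A minimal example (the URL is illustrative), placed on every variant of the page, with and without parameters or www:

```html
<!-- Every duplicate variant declares the same preferred URL -->
<link rel="canonical" href="https://example.com/products/blue-linen-dress/">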

Avoid frequent changes. Every time a page’s URL changes, if there is no correct 301 redirect, Google loses the crawl history for that URL. A poorly executed web migration can cause traffic losses that take months to recover from.
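When a URL does have to change, the fix is a server-level 301. A minimal nginx sketch, assuming a hypothetical old and new path (adapt the paths and server block to your own configuration):

```nginx
# Hypothetical migration: the old URL permanently redirects to the new one,
# so Google transfers the crawl history instead of losing it.
location = /old-services/seo-audit/ {
    return 301 /services/seo-audit/;
}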


Architecture Models: Silo, Hub-and-Spoke, Flat

Beyond “flat vs. deep,” SEO practitioners work with three organisational models that have different implications for internal linking.

Silo Model

Content is organised into isolated thematic groups. Internal linking flows preferentially within the same silo: the category page links to subcategory pages, subcategory pages link to detail pages, and linking between separate silos is kept to a minimum.

The logic behind the silo model is topical authority consolidation. If a law firm has a criminal law section and an employment law section, keeping them as separate silos with their own internal linking can help Google understand the firm has differentiated authority in each area.

The practical limitation is rigidity. Pure silos are difficult to maintain when content has natural connections across thematic areas. Forcing that separation can result in an artificial user experience.

Hub-and-Spoke Model

There are “hub” pages that function as thematic centres and “spoke” pages that develop related subtopics and link back to the hub. This is the model behind pillar content or topic cluster strategies.

A hub might be a comprehensive guide on “SEO for ecommerce.” The spokes would be specific articles on page speed, product URL structure, product schema, variant management… All link to the hub and the hub links to all of them.

This model works well for content sites (blogs, resource portals) because it reinforces topical authority without imposing the rigidity of the silo.

Flat Architecture

For most SMEs with sites of 20-100 pages, flat architecture is the most practical and sufficient option. A main menu that links directly to the most important pages. Few intermediate categories. Maximum accessibility from the homepage.

A restaurant does not need a silo for “winter menu vs summer menu.” A dental clinic does not need a hub-and-spoke for implantology. Architectural complexity should be justified by content volume, not by the aspiration to appear more sophisticated.


How to Plan Architecture Before Building

Site architecture is designed before development, not after. Changing it retrospectively is possible but costly: it requires mass redirects, menu and internal link updates, and time for Google to process the changes.

Step 1: Content Inventory

Before drawing any site map, list all the content the site will have. For each page, define:

  • Which primary keyword it will target
  • Which specific audience will need it
  • Whether it has a relationship with other pages in the inventory

This inventory automatically reveals the natural groupings and depth the site will need.

Step 2: Hierarchy Map

With the inventory in hand, group pages into categories and define how many levels you need. The rule is: use the minimum number of levels needed for the structure to be comprehensible. Every level added is one more click Googlebot has to make and one more step users have to navigate.

Tools like Whimsical, Miro or even a spreadsheet are sufficient for this exercise. You do not need specialist software — you need clarity about the grouping logic.

Step 3: Internal Linking Definition

The page hierarchy determines automatic internal linking (menu, breadcrumbs, category listings). But internal linking also includes manual links within content, and these have the greatest impact on authority distribution.

Define which pages are strategic (the ones you most want to rank) and ensure they receive links from multiple pages on the site, not just from the main menu. A service page that receives 10 internal links from related blog articles has stronger importance signals for Google than one that only appears in the menu.
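Counting inbound internal links is simple enough to sketch directly. A minimal illustration with an invented link graph (page → pages it links to); the URLs are hypothetical:

```python
from collections import Counter

# Hypothetical internal link graph: each page maps to the pages it links to.
links = {
    "/": ["/services/", "/blog/", "/privacy/"],
    "/blog/a/": ["/services/seo-audit/", "/blog/b/"],
    "/blog/b/": ["/services/seo-audit/"],
    "/services/": ["/services/seo-audit/"],
}

# Inbound link count per URL: the page with the most inbound internal
# links is the one the structure signals as most important.
inbound = Counter(target for targets in links.values() for target in targets)
for url, count in inbound.most_common():
    print(url, count)
```

In this toy graph the service page receives three internal links while the privacy policy receives one, which is the distribution you want: strategic pages ahead of supporting pages.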

Step 4: Managing Low-Value Technical URLs

Any site generates URLs that add no value: internal search results pages, ecommerce filters, session parameters, pagination pages with no unique content. These URLs consume crawl budget without contributing value.

The decision about each type of technical URL must be made before launching the site, not once the problem is already in production:

URL Type                               Recommended Decision
Refinement filters (colour, size)      noindex or block in robots.txt
Pagination (?page=2)                   Canonical to the first page or noindex
Internal search results                Block in robots.txt
UTM parameters on indexable URLs       Canonical to the clean URL
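The robots.txt side of those decisions can be sketched in a few lines. A hypothetical example (the paths and parameter names are illustrative, not a template to copy verbatim):

```text
# robots.txt — keep crawl budget away from filters and internal search,
# while categories and products stay fully crawlable.
User-agent: *
Disallow: /search
Disallow: /*?colour=
Disallow: /*?sort=
```

Note that noindex cannot live in robots.txt; it goes in a meta robots tag or X-Robots-Tag header on the pages themselves, which is why the table lists it as a separate option.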

For ecommerce sites with thousands of products and variants, this management is the difference between Google crawling 5% or 80% of the real catalogue.


Architecture for Multilingual Sites

Sites with content in multiple languages add a layer of complexity to architecture: each page potentially has three or four versions (one per locale), and Google needs to know which version to show each user based on their language and location.

The technical implementation of this is hreflang tags, but hreflang tags are just the mechanism. Architecture is the prior decision.

URL Structure Options for Multilingual Sites

Subdirectories per locale (the most common): ighenatt.es/en/, ighenatt.es/ca/. All domain authority concentrates on a single domain. Google handles these well and they are the official documentation’s recommended option for most cases.

Subdomains: en.ighenatt.es, ca.ighenatt.es. Technically works but distributes authority across subdomains and requires more Search Console verification work.

Separate domains: ighenatt.com for English, ighenatt.es for Spanish. Only makes sense when there is a very clear business reason to maintain separate domain presences, as it involves building authority from scratch on each domain.

What Makes Multilingual Architecture Fail

The most common mistake is not the choice of URL structure, but inconsistency in hreflang linking. Each URL in one locale must have hreflang tags pointing to its equivalents in the other locales, and those equivalents must have return tags pointing back to the first. If an English page points to its Catalan version but the Catalan version does not point back, Google may ignore the relationship.
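Because reciprocity failures are easy to introduce and hard to spot by eye, they are worth checking programmatically. A minimal sketch with an invented hreflang map (URL → the alternates declared on that page; all URLs are hypothetical):

```python
# Hypothetical hreflang declarations as scraped from each page:
# the Catalan page is missing its return tag to the English version.
hreflang = {
    "https://example.com/en/services/": {
        "en": "https://example.com/en/services/",
        "ca": "https://example.com/ca/serveis/",
    },
    "https://example.com/ca/serveis/": {
        "ca": "https://example.com/ca/serveis/",
    },
}

def missing_return_tags(hreflang):
    """Return (page, alternate) pairs where the alternate never links back."""
    errors = []
    for page, alternates in hreflang.items():
        for alt in alternates.values():
            if alt != page and page not in hreflang.get(alt, {}).values():
                errors.append((page, alt))
    return errors

print(missing_return_tags(hreflang))
```

Running a check like this over a crawl export surfaces exactly the one-directional relationships that cause Google to ignore the hreflang pairing.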

The second frequent mistake is translating content without adapting URLs. A URL like /services/barcelona-seo/ makes sense in English. Its Spanish equivalent should be /servicios/seo-barcelona/, not /services/barcelona-seo/ with Spanish content inside. Google uses the URL as a locale signal, and an English-language URL inside a Spanish section generates contradictory signals.


Architecture Audit: Tools and Metrics

Once the site is in production, architecture is audited by measuring precisely what Google is crawling and comparing that with what it should be crawling.

The Data Point That Changes Everything: Vehicle Marketplace Case Study

Botify documented the case of a US online vehicle marketplace with 10 million pages on the server. The initial situation was this: Google was only crawling 50,000 of those 10 million URLs — 0.5%. 98% of the crawl budget was being consumed on valueless URLs: refinement filters with infinite parameters, URL permutations duplicating the same car listing with different sort orders, pagination pages Google visited repeatedly without discovering new content.

The interventions were three:

  1. robots.txt update to block valueless refinement URLs
  2. 50% reduction in total URLs known to Google via canonical tags and noindex
  3. Internal linking improvements to help Google reach relevant vehicle listings more easily

Result in three months: crawl rose from 0.5% to 9.5% (+19x), and organic traffic doubled from 40,000 to 80,000 weekly visits. The first improvements were visible within six weeks.

This case illustrates something worth underlining: the problem was not a lack of pages, but an excess of valueless pages preventing Google from reaching the ones that mattered.

Architecture Audit Metrics

Click depth from the homepage. Tools like Screaming Frog or Sitebulb crawl the site simulating Googlebot’s behaviour and show how many clicks each URL is from the homepage. A healthy distribution has most strategic pages at 1-3 clicks.
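What those crawlers compute under the hood is a breadth-first search over the internal link graph. A minimal sketch with an invented graph mirroring the fashion-store example from earlier (URLs are illustrative):

```python
from collections import deque

# Hypothetical internal link graph for a small site.
links = {
    "/": ["/womens-clothing/", "/about/"],
    "/womens-clothing/": ["/womens-clothing/dresses/"],
    "/womens-clothing/dresses/": ["/womens-clothing/dresses/blue-linen-dress/"],
}

def click_depths(links, start="/"):
    """Breadth-first search: shortest click path from the homepage to each URL."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

print(click_depths(links))
```

The product page comes out at depth 3, inside the 3-4 click budget; any URL missing from the result is orphaned, unreachable by internal links at all.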

Crawled URLs vs. URLs submitted in sitemap. Google Search Console shows how many URLs Google has crawled versus how many URLs appear in your sitemap. A large discrepancy indicates pages Google is not reaching.

Crawled low-value pages. A full crawl with Screaming Frog identifies how many of the site’s URLs are parameters, duplicates, filter pages or pagination that Google is crawling unnecessarily. This figure divided by total crawled URLs gives the proportion of “wasted” crawl budget.

Internal link distribution. Tools like Ahrefs, Screaming Frog or Sitebulb show how many internal links each URL receives. Strategic pages should receive more internal links than supporting pages. If the best internally-linked pages are the privacy policy and the legal notice, there is an authority distribution problem.

Gary Illyes and the Smarter Crawl

On LinkedIn in April 2024, Gary Illyes (Google analyst) shared a relevant observation about how crawl behaviour has evolved:

“We crawl roughly the same amount as before, however the scheduling has become smarter and we focus more on URLs that have a higher chance of deserving the crawl.”

The practical implication is direct: if your site has many low-value URLs, Google will prioritise them less, which means fewer visits to strategic URLs. Optimising architecture does not just help Google reach more pages — it also influences how frequently it returns.


A well-designed architecture is invisible, but you feel it in the crawl. If you want to know what percentage of your site is being crawled and which structure makes most sense for your case, we evaluate this in every initial SEO audit at Ighenatt. Tell us about your situation.


Frequently Asked Questions

How often do you publish new content?

We publish new articles weekly, focused on the latest technical SEO trends, real case studies and best practices. Subscribe to our newsletter so you never miss an update.

Are the tips applicable to any type of website?

Our advice adapts to different types of sites: ecommerce, blogs, corporate sites and web applications. We always indicate when a technique is specific to a certain type of site or technical requirement.

Can I implement these techniques myself?

You can implement many of the basic techniques yourself by following our step-by-step guides. For advanced optimisations or full audits, we recommend consulting technical SEO specialists like our team.

Do you offer personalised consultancy services?

Yes, we offer personalised technical SEO consultancy, full audits and end-to-end optimisation. Contact us to discuss your project's specific needs and how we can help.


Tags: #site architecture #technical SEO #URL structure #SEO silo #crawl budget #internal linking

Elu Gonzalez

SEO Expert & Web Optimization