I was three hours into a technical audit for a client — a mid-size ecommerce brand selling outdoor gear — when I noticed something that made me stop scrolling. Their product pages had been indexed multiple times. Not twice. Not three times. Some products were showing up in Google's index with four or five separate URLs, each one slightly different. There was the clean URL, the one with session parameters, the one with tracking parameters from their email campaigns, the one generated by their faceted navigation filters, and sometimes a mobile-specific version that shouldn't have existed at all because they'd moved to responsive design two years prior. The site had roughly 2,000 products. Multiply that by four or five URL variations each, and you start to see the scope of the mess.
Key Takeaways
- What a Canonical Tag Actually Is
- How Canonicals Affect Link Equity
- Cross-Domain Canonicals: A Whole Different Beast
- Self-Referencing Canonicals and Why They Matter
- When Canonicals Conflict with Other Signals
- Canonicals vs. Redirects: When to Use Which
The thing that really caught my attention wasn't the duplicate content itself. That's common. Lots of sites have it. What got me was that this client had been building links for over a year. They'd invested real money and real effort into a link building campaign — guest posts, digital PR, resource page outreach, the whole thing. And when I pulled up their backlink profile in Ahrefs, I could see that those hard-won links were scattered across all these different URL variations. Some bloggers had linked to the parameterized version they'd landed on through an email campaign. Others had linked to the filtered URL they'd found through the site's navigation. The link equity, which should have been consolidated on the primary product pages, was fragmented across URLs that shouldn't have been indexed in the first place.
This is where canonical tags enter the picture, and honestly, this is where things start to get confusing even for people who've been doing SEO for years.
What a Canonical Tag Actually Is
A canonical tag is a snippet of HTML that you place in the <head> section of a webpage. It looks like this:
<link rel="canonical" href="https://example.com/products/hiking-boots" />
What it's telling search engines, in plain terms, is: "Hey, I know you might find this page at multiple URLs, but the real version (the one you should pay attention to, the one you should index, the one that should get credit for any links) is this URL right here." It's a declaration of the preferred version. Google calls the preferred version the "canonical URL" and treats everything else as a "duplicate" or "alternate." For related background, see 301 vs 302 Redirects: Impact on Link Equity.
The concept seems simple enough. You've got duplicate pages, you pick the one you want to be the canonical, you slap the tag on all the duplicates pointing to the preferred version, and Google figures it out. Except it doesn't always work that cleanly, and the reasons why are worth understanding if you care about where your link equity ends up.
First, and this trips people up constantly, the canonical tag is a hint, not a directive. Google has said this directly. Multiple times. In their own documentation they describe it as a "strong signal" that they "take into account," but they also reserve the right to ignore it. This is entirely different from, say, a noindex tag, which Google generally treats as a hard instruction. With canonicals, you're making a suggestion, and Google will evaluate whether your suggestion makes sense. If Google thinks a different URL is the better canonical, they'll pick that one instead, regardless of what your tag says.
I've seen this happen in practice more than a few times. A client sets up canonical tags correctly — or what they think is correctly — and then checks Google Search Console a few weeks later only to find that Google has selected a completely different canonical URL. The "User-declared canonical" and the "Google-selected canonical" don't match. It's one of those moments that makes you question whether you understand anything about how search engines actually work.
How Canonicals Affect Link Equity

This is the part that matters most for anyone doing link building, and it's the part where I see the most confusion. When Google identifies a set of duplicate pages and selects a canonical URL — whether that matches your declared canonical or not — the general understanding is that link signals from all the duplicate URLs get consolidated onto the canonical version. In theory, this is great. It means if someone links to the parameterized version of your page, that link equity should flow to the canonical URL.
John Mueller from Google has confirmed this behavior on multiple occasions. In a 2020 Webmaster Hangout, he said something to the effect of: when we determine that pages are duplicates and we pick a canonical, we forward the signals from the alternate URLs to the canonical. Links are among those signals. So if page A and page B are duplicates, and we pick page A as the canonical, links pointing to page B should benefit page A.
Sounds great. Problem solved, right? Well, not quite. There are several caveats, edge cases, and weird behaviors that make this less straightforward than it sounds.
For starters, consolidation only happens if Google actually recognizes the pages as duplicates. If two pages are similar but not identical — say they have mostly the same product description but one includes reviews and the other doesn't — Google might decide they're not actually duplicates. In that case, your canonical tag might get ignored, and the link equity stays wherever it landed. You're left with two separate pages, each with a fraction of the links, competing against each other. Some people call this keyword cannibalization, and it's a real headache. The canonical tag was supposed to prevent it, but it only works when Google agrees with your assessment of what constitutes a duplicate. This is closely related to what we cover in How to Handle Links During a Site Migration.
There's also the question of timing. When you add a canonical tag to a page, it doesn't take effect instantly. Google needs to recrawl both the duplicate page (to see the canonical tag) and the canonical page (to understand that it's the target). Depending on your site's crawl frequency, this could take days, weeks, or in some cases months. During that interim period, your link equity is still fragmented. I've worked with sites where we added canonical tags and didn't see full consolidation in Search Console for six or seven weeks. If you're running a time-sensitive campaign and building links while your canonicals are still being processed, things can get messy.
Cross-Domain Canonicals: A Whole Different Beast
It gets even more interesting when you're dealing with cross-domain canonicals. Let's say you syndicate your content to another website. Medium, LinkedIn, an industry publication — wherever. The syndication partner publishes your article on their domain, and you ask them to include a canonical tag pointing back to the original on your site. The HTML would look something like:
<link rel="canonical" href="https://yoursite.com/blog/original-article" />
In theory, this tells Google that the version on the syndication partner's site is a copy, and all the signals — including any links the syndicated version might attract — should be attributed to your original. In practice? I've had mixed results with this. Cross-domain canonicals seem to be treated with more skepticism by Google than same-domain ones. I've seen cases where Google indexed both versions anyway, treating them as separate pages rather than duplicates. The syndicated version sometimes even outranked the original, which is exactly the nightmare scenario you were trying to avoid.
When cross-domain canonicals work, they're powerful. Any links that journalists or bloggers point at the syndicated version on the high-authority partner site theoretically get attributed back to you. But the "theoretically" is doing a lot of heavy lifting in that sentence. I wouldn't bet my entire link building strategy on cross-domain canonicals behaving the way the documentation says they should. You need to monitor it and have a backup plan.
Self-Referencing Canonicals and Why They Matter
Here's something that a lot of site owners overlook: every page should have a self-referencing canonical tag. Even pages that aren't duplicates. Your homepage should have:
<link rel="canonical" href="https://example.com/" />
And your blog post should have:
<link rel="canonical" href="https://example.com/blog/my-post" />
Why? Because almost any page can be accessed through multiple URL variations even without obvious duplication issues. Query parameters, trailing slashes, uppercase letters in the path, HTTP versus HTTPS, www versus non-www — all of these can create technically different URLs that load the same content. Without a self-referencing canonical, you're leaving it up to Google to figure out which version is preferred, and Google doesn't always make the choice you'd want.
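To make that concrete, here is a minimal Python sketch showing how several of those variations collapse to a single preferred URL. The list of tracking parameters, the preferred host, and the conventions chosen (HTTPS, non-www, lowercase path, no trailing slash) are all assumptions for illustration. Your site may use different ones, and lowercasing the path is only safe if your server treats paths as case-insensitive.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical set of parameters that never change what the page shows.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid"}


def normalize(url: str, preferred_host: str = "example.com") -> str:
    """Collapse common URL variations into one preferred form."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Treat www and non-www as the same site (assumed preference: non-www).
    if host == "www." + preferred_host:
        host = preferred_host
    # Lowercase the path and drop a trailing slash, except for the root.
    path = parts.path.lower().rstrip("/") or "/"
    # Keep only query parameters that are not known tracking or session noise.
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k.lower() not in TRACKING_PARAMS]
    # Always emit HTTPS and discard any fragment.
    return urlunsplit(("https", host, path, urlencode(kept), ""))


variants = [
    "http://example.com/blog/my-post",
    "https://www.example.com/blog/my-post/",
    "https://example.com/Blog/My-Post?utm_source=newsletter",
]
# All three variants reduce to the same URL.
print({normalize(u) for u in variants})
```

A self-referencing canonical does the same job declaratively: whichever variant a crawler arrives on, the page itself names the one preferred form.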
I audited a SaaS company's blog last year and found that about 40% of their pages had no canonical tag at all. The other 60% had them because their CMS added them automatically, but the configuration was wrong — the canonical URLs used HTTP instead of HTTPS, even though the site had migrated to HTTPS two years earlier. So every page was essentially telling Google, "The real version of this page is at an HTTP URL that doesn't exist anymore." Google was mostly smart enough to figure out the right URL anyway, but "mostly" isn't "always," and a handful of pages were showing the HTTP version in search results.
Those pages had backlinks pointing to the HTTPS version, and there was this weird mismatch where the indexed URL and the URL with the links weren't the same. See also our post on Rel Attributes Explained: nofollow, sponsored, ugc for more on this.
Self-referencing canonicals are a small thing that prevents a surprisingly large category of problems. Just make sure they're correct — right protocol, right domain, right path, right trailing slash convention.
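A check like that is easy to automate. The sketch below is one possible way to flag the four things listed above. The function name, the expected host, and the trailing-slash preference are hypothetical and would need to match your own site's conventions.

```python
from urllib.parse import urlsplit


def canonical_problems(canonical: str, host: str = "example.com",
                       trailing_slash: bool = False) -> list[str]:
    """Return a list of convention violations found in a canonical URL."""
    parts = urlsplit(canonical)
    problems = []
    if parts.scheme != "https":
        problems.append("wrong protocol")
    if parts.netloc.lower() != host:
        problems.append("wrong domain")
    if parts.query:
        problems.append("contains query parameters")
    # The root path always ends in a slash, so only check deeper paths.
    if parts.path not in ("", "/") and parts.path.endswith("/") != trailing_slash:
        problems.append("wrong trailing slash convention")
    return problems


# The HTTP-after-migration mistake described above is caught by the first check.
print(canonical_problems("http://example.com/blog/my-post"))
```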
When Canonicals Conflict with Other Signals
Google doesn't look at canonical tags in isolation. They evaluate them alongside a bunch of other signals to determine the canonical URL, and when those signals conflict, things get unpredictable. Here are some of the signals Google considers:
- The canonical tag itself, obviously.
- Internal links: which URL do your own internal links point to? If your canonical says URL A but all your internal links point to URL B, that's a conflicting signal.
- The sitemap: which URL is listed in your XML sitemap?
- HTTPS versus HTTP preference.
- Redirects.
- The actual content similarity between the pages in question.
I've seen situations where the canonical tag points to one URL, the internal links point to another, and the sitemap lists a third. In those cases, Google basically has to make a judgment call, and they don't always pick the one you intended. Cleaning up these conflicting signals is one of the most impactful things you can do in a technical SEO audit. Make sure your canonical tags, internal links, sitemap entries, and redirect chains all agree on which URL is the preferred version of each page. It sounds obvious, but on large sites with years of accumulated technical debt, these inconsistencies are everywhere.
I worked with a publishing site that had 15,000+ articles, and when we ran a crawl comparing canonical URLs to sitemap URLs, about 2,200 of them didn't match. Some sitemaps listed old URL structures that had been changed during a previous migration, while the canonical tags pointed to the new URLs. Others had the opposite problem. It took us three weeks to reconcile everything, and the effect on crawl efficiency and indexation was noticeable within a couple of months. Pages that had been stuck in Google's "Discovered – currently not indexed" limbo started showing up. Rankings for several key pages improved, which we attributed partly to link signals finally consolidating properly once Google could clearly identify the canonical version of each page.
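The comparison itself doesn't need special tooling. Here is a rough Python sketch of the idea, not the exact process we used: it reads the URLs from a sitemap and compares them with a mapping of crawled page URL to declared canonical. The sample sitemap, the URLs, and the function names are made up for illustration, and in practice the mapping would come from a crawler export.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard XML sitemaps.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text: str) -> set[str]:
    """Collect every <loc> value listed under <url> in a sitemap."""
    root = ET.fromstring(xml_text)
    return {loc.text.strip() for loc in root.findall("sm:url/sm:loc", SITEMAP_NS) if loc.text}


def find_mismatches(sitemap: set[str], canonicals: dict[str, str]) -> dict[str, list[str]]:
    """Compare sitemap entries with declared canonicals (page URL -> canonical)."""
    declared = set(canonicals.values())
    return {
        "in_sitemap_not_canonical": sorted(sitemap - declared),
        "canonical_not_in_sitemap": sorted(declared - sitemap),
    }


# Hypothetical sitemap that still lists one URL from an old structure.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/news/old-structure/story</loc></url>
  <url><loc>https://example.com/articles/other-story</loc></url>
</urlset>"""

# Hypothetical crawl result: page URL -> canonical declared on that page.
crawled = {
    "https://example.com/articles/story": "https://example.com/articles/story",
    "https://example.com/articles/other-story": "https://example.com/articles/other-story",
}

report = find_mismatches(sitemap_urls(SAMPLE_SITEMAP), crawled)
for label, urls in report.items():
    print(label, urls)
```

Either list being non-empty means the sitemap and the canonical tags disagree about at least one page, which is the kind of conflict worth reconciling first.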
Canonicals vs. Redirects: When to Use Which
This is a question I get asked a lot, and the answer is... it depends. Which is the most annoying answer in SEO but also the most honest one.
A 301 redirect sends the user (and the search engine crawler) from one URL to another. The old URL stops serving its own content: if someone types it in, they end up at the new URL. A canonical tag, by contrast, leaves both URLs accessible. Users can still visit both pages. The canonical just tells Google which one should get the credit. You might also find How to Use Hreflang Tags for International SEO useful here.
If you have true duplicates where there's no reason for the duplicate URL to exist or be accessible, a 301 redirect is usually the better choice. It's a stronger signal than a canonical tag. Google treats 301 redirects as directives (at least in terms of URL resolution), and the link equity transfer is well-documented and reliable. Redirects are also better for user experience because visitors won't land on a duplicate page — they'll get sent to the right one automatically.
Canonical tags make more sense when you need both URLs to remain accessible. The most common example is parameterized URLs. If someone is on your ecommerce site and they filter by color, the URL might change to /shoes?color=red. You don't want to redirect that URL because the user is actively using it. But you do want Google to treat /shoes as the canonical version and consolidate any link equity there. Other common cases are print-friendly versions of pages, AMP pages (though AMP is less common now), and mobile-specific URLs on sites that haven't adopted responsive design.
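For the filtered-URL case, the template that renders the page can build the tag from the request path with the query string dropped. A minimal sketch, assuming a hypothetical SITE constant and that no query parameter changes the content enough to deserve its own canonical:

```python
from html import escape
from urllib.parse import urlsplit

SITE = "https://example.com"  # assumed preferred protocol and host


def canonical_tag(request_path: str) -> str:
    """Build a canonical link tag that ignores any filter parameters."""
    path = urlsplit(request_path).path or "/"
    href = SITE + path
    return f'<link rel="canonical" href="{escape(href, quote=True)}" />'


print(canonical_tag("/shoes?color=red"))
```

The filtered page stays fully usable for the visitor, while every filter combination declares /shoes as the preferred version.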
One mistake I see frequently: using canonical tags as a substitute for redirects when redirects are clearly the right choice. Someone migrates their blog from /blog/post-title to /articles/post-title, and instead of setting up 301 redirects from the old URLs to the new ones, they just add canonical tags on the old URLs pointing to the new ones. Both pages are still live, both are crawlable, and now Google has to figure out the relationship instead of being told directly. This usually works eventually, but it's slower, less reliable, and leaves those old URLs hanging around in the index longer than necessary. If users are bookmarking the old URL or other sites are linking to it, those people are landing on a page that you've told Google isn't the real version. That's not ideal for anyone.
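For a migration like that one, the redirect rule is usually a simple prefix rewrite. The sketch below shows the mapping logic only. The function name is hypothetical, and in a real setup the rule would live in your web server or framework configuration rather than in a standalone function.

```python
from typing import Optional, Tuple

OLD_PREFIX = "/blog/"
NEW_PREFIX = "/articles/"


def redirect_for(path: str) -> Optional[Tuple[int, str]]:
    """Return (status, location) for a moved path, or None if it has not moved."""
    if path.startswith(OLD_PREFIX):
        # Keep the slug and swap only the prefix.
        return 301, NEW_PREFIX + path[len(OLD_PREFIX):]
    return None


print(redirect_for("/blog/post-title"))
print(redirect_for("/articles/post-title"))
```

With a rule like this in place, the old URL never serves a page of its own, so there is no second live copy for Google to reconcile.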
Checking and Debugging Canonical Issues
If you're wondering whether your canonical tags are working the way you think they are, Google Search Console is your best friend. The URL Inspection tool will show you both the "User-declared canonical" and the "Google-selected canonical" for any given page. If those two match, great. If they don't, you've got a problem worth investigating.
Screaming Frog is also invaluable for this. You can crawl your entire site and export a list of every page's canonical tag, then look for mismatches, missing tags, or canonicals pointing to non-existent URLs (a surprisingly common issue after site migrations). I usually set up a custom extraction in Screaming Frog to pull the canonical URL from every page's HTML, then compare it against the actual page URL in a spreadsheet. Any discrepancies get flagged for review.
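If you'd rather script the same comparison, Python's standard library is enough for a first pass. This is a simplified sketch, not a replacement for a full crawler: the class and function names are my own, it takes already-downloaded HTML as input, and it doesn't check that the tag sits inside the <head>.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel can hold several space-separated values.
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr.get("href"):
            self.canonicals.append(attr["href"])


def audit(page_url: str, html: str) -> str:
    """Classify a page's canonical setup relative to its own URL."""
    finder = CanonicalFinder()
    finder.feed(html)
    if not finder.canonicals:
        return "missing"
    if len(finder.canonicals) > 1:
        return "multiple"
    return "self-referencing" if finder.canonicals[0] == page_url else "points elsewhere"


page = '<html><head><link rel="canonical" href="https://example.com/products/hiking-boots" /></head></html>'
print(audit("https://example.com/products/hiking-boots?utm_source=email", page))
```

Pages that come back as "missing" or "multiple" are the ones I'd review first, followed by any "points elsewhere" result where the target isn't the URL you expected.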
For link equity specifically, Ahrefs lets you look at the backlink profile for specific URLs. If you suspect that link equity isn't being consolidated properly, you can check each URL variation and see which ones have backlinks. If the canonical URL has links and the duplicates don't, consolidation might already be working. If the links are spread across multiple URL variations, something might be off with your canonical setup — or Google might not have processed the changes yet.
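Once you have a backlink export, a few lines of Python can show how scattered the links are. This sketch assumes you have a plain list of backlink target URLs. The sample data is invented, and grouping by path alone is a simplification that ignores scheme, host, and query string.

```python
from collections import Counter, defaultdict
from urllib.parse import urlsplit


def group_backlinks(targets):
    """Group backlink target URLs by path and count links per exact URL."""
    groups = defaultdict(Counter)
    for url in targets:
        # Use the path without a trailing slash as the grouping key.
        key = urlsplit(url).path.rstrip("/") or "/"
        groups[key][url] += 1
    return groups


# Hypothetical backlink targets, one entry per referring link.
targets = [
    "https://example.com/products/hiking-boots",
    "https://example.com/products/hiking-boots?utm_source=email",
    "https://example.com/products/hiking-boots?utm_source=email",
    "http://example.com/products/hiking-boots/",
    "https://example.com/products/trail-tent",
]

groups = group_backlinks(targets)
# A page is fragmented when its links point at more than one URL variation.
fragmented = {path: counts for path, counts in groups.items() if len(counts) > 1}
for path, counts in fragmented.items():
    print(path, dict(counts))
```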
There's a subtlety here that's easy to miss. Even when Google has correctly identified the canonical URL and is consolidating signals, third-party tools like Ahrefs and Moz don't always reflect that consolidation. They index the web independently from Google, and they report links based on the URLs they find, not based on Google's canonical selections. So you might see link equity appearing fragmented in Ahrefs even though Google has actually consolidated it. This can create a misleading picture if you're relying solely on third-party metrics to evaluate whether your canonical strategy is working. The URL Inspection tool in Search Console is closer to the ground truth. This connects to what we discuss in The Ultimate Guide to Internal Linking for SEO.
The Compounding Cost of Getting Canonicals Wrong
Here's what really bothers me about canonical issues: they're quiet problems. Your site doesn't break. You don't get error messages. Everything looks fine on the surface. But underneath, your link equity is being split across multiple URLs, your crawl budget is being wasted on duplicate pages, and your strongest pages aren't as strong as they should be. It's like having a slow leak in your tire. You can drive on it for a while without noticing, but eventually, something gives.
On that outdoor gear client I mentioned at the beginning — once we implemented proper canonical tags across all their product pages and set up 301 redirects for the truly unnecessary URL variations, it took about two months for everything to settle. But when it did, we saw organic traffic to their product pages increase by around 18%. We didn't build any new links during that period. The links were already there. They were just scattered across the wrong URLs. Consolidating them onto the right pages was enough to move the needle.
That's the part about canonicals that surprises me every time. Not the technical implementation — that's just HTML. It's the outsized impact that proper consolidation can have when you finally get it right. You're not creating new value. You're just making sure the value that already exists ends up where it's supposed to be. And somehow, years into doing this work, the effect still catches me off guard. I keep expecting it to be marginal, and it keeps being more than that. Maybe one day I'll stop being surprised by what a single line of HTML can do to a site's link profile, but honestly, I'm not sure that day is coming anytime soon.