Is there a limit to the number of duplicate pages pointing to a rel='canonical' primary?
-
We have a situation on twiends where a number of our 'dead' user pages have generated links for us over the years. Our options are to 404 them, 301 them to the home page, or just serve back the home page with a canonical tag.
We've been 404'ing them for years, but I understand that we lose all the link juice from doing this. Correct me if I'm wrong.
Our next plan would be to 301 them to the home page. This is probably the best solution, but our concern is that if a user page is only temporarily down (under review, etc.), it could be permanently removed from the index, or at least cached for a very long time.
A final plan is to just serve back the home page on the old URL, with a canonical tag pointing to the home page URL. This is quick, retains most of the link juice, and allows the URL to become active again in the future. The problem is that there could be hundreds of thousands of these.
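For what it's worth, the canonical variant we're considering would just be the home-page HTML with one extra link element in the head. A minimal sketch (the URL and function name here are our own, for illustration only):

```python
# Hypothetical sketch of the third option: serve the home-page markup on
# a dead user URL, but add a canonical link pointing at the home page.
# HOME_URL and the function name are illustrative, not from any library.

HOME_URL = "https://twiends.com/"

def with_canonical(home_html: str) -> str:
    """Inject a rel=canonical link just before </head>."""
    tag = f'<link rel="canonical" href="{HOME_URL}">'
    return home_html.replace("</head>", tag + "\n</head>", 1)
```

If the user page comes back later, we'd simply stop serving this variant and the URL's own content returns unchanged.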
Q1) Is it a problem to have 100,000 URLs pointing to a primary with a rel=canonical tag? (Problem for Google?)
Q2) How long does it take a canonical duplicate page to become unique in the index again if the tag is removed? Will Google recrawl it and add it back into the index? Do we need to use WMT to speed this process up?
Thanks
-
I'll add this article by Rand that I came across too. I'm busy testing the solution presented in it:
https://moz.com/blog/are-404-pages-always-bad-for-seo
In summary: 404 all dead pages with a good custom 404 page so as not to waste crawl bandwidth, then selectively 301 those dead pages that have accrued some good link value.
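The policy from the article boils down to a small routing decision. Here's a rough sketch of what I'm testing (the paths and names are made up for illustration, not our real data):

```python
# Rough sketch of the 404-vs-301 policy: hard-404 dead pages with a
# custom 404 template, but 301 the few dead URLs that earned good links.
# The whitelist below is a hypothetical example.

HIGH_VALUE_REDIRECTS = {
    "/user/popular-dead-account": "/",  # accrued inbound links -> home page
}

def respond(path: str, live_paths: set) -> tuple:
    """Return (status, target): 200 for live pages, 301 for whitelisted
    dead pages with link value, hard 404 (custom page) for the rest."""
    if path in live_paths:
        return (200, path)
    if path in HIGH_VALUE_REDIRECTS:
        return (301, HIGH_VALUE_REDIRECTS[path])
    return (404, "custom_404.html")
```

The whitelist stays small (a few hundred URLs at most), so Google isn't asked to crawl the full hundred-thousand-page graveyard.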
Thanks Donna/Tammy for pointing me in this direction.
-
In this scenario, yes: a customized 404 page with links to a few useful top-level pages would better serve both the user and Google. From a strictly SEO standpoint, 100,000 redirects and/or canonical tags would not benefit your SEO.
-
Thanks Donna, good points.
We return a hard 404, so it's treated correctly by Google. We're just looking at this from an SEO point of view now to see if there's any way to reclaim this lost link juice.
Your point about looking at the value of those incoming links is a good one. I suppose it's not worth making Google crawl 100,000 more pages for the sake of a few links. We've just started seeing these pop up in Moz Analytics as link opportunities, and we can see them as 404s in site explorer too. There are a few hundred of these incoming links pointing to a 404, so we feel this could have an impact.
I suppose we could selectively 301 any higher-value links to the home page. It would be an administrative nightmare, but doable.
How do others tackle this problem? Does everyone just hard-404 a page, even though that loses the link juice from its incoming links?
Thanks
-
Hi David,
When you say "we've been 404'ing them for years", does that mean you've created a custom 404 page that explains the situation to site visitors or does it mean you've been letting them naturally error and return the appropriate 404 (page not found) error to Google? It makes a difference. If the pages truly no longer exist and there is no equivalent replacement, you should be letting them naturally error (return a 404 return code) so as not to mislead Google's robots and site visitors.
Have you looked at the value of those incoming links? They may be low value anyway. There may be more valuable things you could be doing with your time and budget.
To answer your specific questions:
Q1) Is it a problem to have 100,000 URLs pointing to a primary with a rel=canonical tag? (Problem for Google?)
Yes, if those pages (or valuable replacements) don't actually exist. You'd be wasting valuable crawl budget. This looks like it might be especially true in your case given the size of your site. Check out this article. I think you might find it very helpful. It's an explanation of soft 404 errors and what you should do about them.
Q2) How long does it take a canonical duplicate page to become unique in the index again if the tag is removed? Will Google recrawl it and add it back into the index? Do we need to use WMT to speed this process up?
If the canonical tag is changed or removed, Google will find and reindex it next time it crawls your site (assuming you don't run out of crawl budget). You don't need to use WMT unless you're impatient and want to try to speed the process up.
-
Thanks Sandi, I did. It's a great article and it answered many questions for me, but I couldn't really get clarity on my last two questions above.
-
Hey David
Check out this Moz Blog post about rel=canonical, appropriately named "Rel=Confused?"