Is there a limit to the number of duplicate pages pointing to a rel='canonical' primary?
-
We have a situation on twiends where a number of our 'dead' user pages have generated links for us over the years. Our options are to 404 them, 301 them to the home page, or just serve back the home page with a canonical tag.
We've been 404'ing them for years, but I understand that we lose all the link juice by doing this. Correct me if I'm wrong?
Our next plan would be to 301 them to the home page. This is probably the best solution, but our concern is that if a user page is only temporarily down (under review, etc.), it could be permanently removed from the index, or at least cached for a very long time.
A final plan is to just serve back the home page on the old URL, with a canonical tag pointing to the home page URL. This is quick, retains most of the link juice, and allows the URL to become active again in future. The problem is that there could be hundreds of thousands of these.
Q1) Is it a problem to have 100,000 URLs pointing to a primary with a rel=canonical tag? (Problem for Google?)
Q2) How long does it take a canonical duplicate page to become unique in the index again if the tag is removed? Will Google recrawl it and add it back into the index? Do we need to use WMT to speed this process up?
Thanks
-
I'll add this article by Rand that I came across too. I'm busy testing the solution presented in it:
https://moz.com/blog/are-404-pages-always-bad-for-seo
In summary, 404 all dead pages with a good custom 404 page so as to not waste crawl bandwidth. Then selectively 301 those dead pages that have accrued some good link value.
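The triage summarised above could be sketched roughly like this. It is only an illustration under my own assumptions: the `DeadPage` fields, the `min_linking_domains` threshold and the `"/"` redirect target are hypothetical names, not something taken from Rand's article or from our actual setup.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class DeadPage:
    path: str
    linking_domains: int            # hypothetical: count taken from a link export
    temporarily_down: bool = False  # e.g. under review, may become active again


def choose_response(page: DeadPage, min_linking_domains: int = 5) -> Tuple[int, Optional[str]]:
    """Return (status_code, redirect_target) for a dead user page."""
    # A page that may come back is never permanently redirected.
    if page.temporarily_down:
        return 404, None
    # Only pages with enough accrued link value get a 301 to the home page.
    if page.linking_domains >= min_linking_domains:
        return 301, "/"
    # Everything else stays a hard 404 (served with a custom 404 page).
    return 404, None
```

The threshold is arbitrary here; in practice it would depend on how you judge the value of the incoming links.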
Thanks Donna/Tammy for pointing me in this direction..
-
In this scenario, yes: a customized 404 page with links to a few useful top-level pages would better serve both the user and Google. From a strictly SEO standpoint, 100,000 redirects and/or canonical tags would not benefit your SEO.
-
Thanks Donna, good points..
We return a hard 404, so it's treated correctly by Google. We are just looking at this from an SEO point of view now to see if there's any way to reclaim this lost link juice.
Your point about looking at the value of those incoming links is a good one. I suppose it's not worth making Google crawl 100,000 more pages for the sake of a few links. We've just started seeing these pop up in Moz Analytics as link opportunities, and we can see them as 404s in site explorer too. There are a few hundred of these incoming links that point to a 404, so we feel this could have an impact.
I suppose we could selectively 301 any higher value links to the home page.. It will be an administrative nightmare, but doable..
How do others tackle this problem? Does everyone just hard 404 a page, even though that loses the link juice from incoming links to it..?
Thanks
-
Hi David,
When you say "we've been 404'ing them for years", does that mean you've created a custom 404 page that explains the situation to site visitors, or does it mean you've been letting them error naturally and return the appropriate 404 (page not found) status to Google? It makes a difference. If the pages truly no longer exist and there is no equivalent replacement, you should let them error naturally (return a 404 status code) so as not to mislead Google's robots and site visitors.
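The two are not mutually exclusive: a friendly custom page can still be sent with a real 404 status. Here is a minimal sketch of that idea as a plain WSGI app. The `ACTIVE_USERS` set, the paths and the HTML are made up for illustration and say nothing about how your site is actually built.

```python
# Hypothetical set of user pages that still exist.
ACTIVE_USERS = {"/alice"}

# Helpful body for visitors; the status line below is what crawlers act on.
NOT_FOUND_HTML = (
    b"<h1>This profile is no longer available</h1>"
    b"<p><a href='/'>Back to the home page</a></p>"
)

HTML_HEADERS = [("Content-Type", "text/html; charset=utf-8")]


def app(environ, start_response):
    """Serve live pages with 200 and dead pages with a custom page plus 404."""
    path = environ.get("PATH_INFO", "/")
    if path == "/" or path in ACTIVE_USERS:
        start_response("200 OK", HTML_HEADERS)
        return [b"<h1>Page content</h1>"]
    # Custom content, but still a hard 404 status code.
    start_response("404 Not Found", HTML_HEADERS)
    return [NOT_FOUND_HTML]
```

To try it locally you could pass `app` to `wsgiref.simple_server.make_server` from the standard library and request a path that is not in `ACTIVE_USERS`.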
Have you looked at the value of those incoming links? They may be low value anyway. There may be more valuable things you could be doing with your time and budget.
To answer your specific questions:
_Q1) Is it a problem to have 100,000 URLs pointing to a primary with a rel=canonical tag? (Problem for Google?)_
Yes, if those pages (or valuable replacements) don't actually exist. You'd be wasting valuable crawl budget. This looks like it might be especially true in your case given the size of your site. Check out this article. I think you might find it very helpful. It's an explanation of soft 404 errors and what you should do about them.
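If you want to audit your own dead URLs for this, a rough self-check might look like the sketch below. To be clear, this is my own simple heuristic with hypothetical names; how Google actually identifies soft 404s is not something I can state, and this does not try to reproduce it.

```python
def classify_dead_url(status_code: int, body: str, homepage_body: str) -> str:
    """Label the response that a dead URL currently returns."""
    if status_code in (404, 410):
        return "hard-404"
    if status_code in (301, 308):
        return "permanent-redirect"
    # A 200 that just repeats the home page is the pattern to watch for.
    if status_code == 200 and body.strip() == homepage_body.strip():
        return "possible-soft-404"
    return "other"
```

You would feed it the status code and body fetched for each dead URL, plus the body of the home page for comparison.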
_Q2) How long does it take a canonical duplicate page to become unique in the index again if the tag is removed? Will Google recrawl it and add it back into the index? Do we need to use WMT to speed this process up?_
If the canonical tag is changed or removed, Google will find and reindex it next time it crawls your site (assuming you don't run out of crawl budget). You don't need to use WMT unless you're impatient and want to try to speed the process up.
-
Thanks Sandi, I did.. It's a great article and it answered many questions for me, but I couldn't really get clarity on my last two questions above..
-
Hey David
Check out this Moz Blog post about rel=canonical, appropriately named "Rel=Confused?"