Rel canonical and duplicate subdomains
-
Hi,
I'm working with a site that has multiple subdomains of entirely duplicate content. The production site that visitors see is (as a made-up, illustrative example):
Then there are subdomains that different developers use to work on their own changes before those changes are pushed to production:
Google ends up indexing these duplicate subdomains, which is of course not good.
If we add a canonical tag to the head section of the production page (and therefore of all the duplicate subdomains), will that cause a problem? Is it okay for a page to carry a canonical tag that points to itself?
To complete the example...
In this example, where our production page is 123abc456.edu, our canonical tag on all pages (this page and therefore the duplicate subdomains) would be:
Is that going to be okay and fix this without causing some new problem of a canonical tag pointing to the page it's on?
Thanks!
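For what it's worth, a canonical tag that points to the page it sits on is generally considered harmless, so the same template can be served on production and on every developer copy. A minimal Python sketch of that idea follows. It assumes the production host is 123abc456.edu as in the example above; the https scheme, the /admissions/ path, and the helper name canonical_tag are hypothetical, and dropping the query string is a simplification rather than a recommendation.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

# Production host taken from the example in the question.
PRODUCTION_HOST = "123abc456.edu"


def canonical_tag(request_url: str, scheme: str = "https") -> str:
    """Build a canonical <link> tag that always names the production host.

    The scheme default is an assumption; use whatever production serves.
    """
    parts = urlsplit(request_url)
    # Keep only the path; swap in the production host, drop query and fragment.
    canonical = urlunsplit((scheme, PRODUCTION_HOST, parts.path or "/", "", ""))
    return f'<link rel="canonical" href="{escape(canonical, quote=True)}" />'


# The same tag comes out whether the page is served from a developer
# subdomain or from production itself (where it is self-referential).
print(canonical_tag("http://Moe.123abc456.edu/admissions/"))
print(canonical_tag("https://123abc456.edu/admissions/"))
```

Because the host is fixed in one place, the tag does not change when a developer branch is pushed to production.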
-
Hi Bob,
That's an excellent question. I'll have to look into it and confirm. More later. Thanks!
-
Is the subdomain data stored on the server as directories?
So, for example, is the Moe.123abc456.edu data stored in a folder like 123abc456.edu/Moe?
If so, you can simply have one robots.txt on your root domain, blocking those directories:
Disallow: /Moe/
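A rule like this can be checked before it goes live. The sketch below uses Python's standard urllib.robotparser; the User-agent line and the two test URLs are assumptions added for illustration. One caveat: crawlers request robots.txt separately for each host, so this rule covers URLs under 123abc456.edu/Moe/, while pages requested as Moe.123abc456.edu are governed by whatever robots.txt that host serves. The parser here only compares paths, so it cannot show that difference.

```python
from urllib.robotparser import RobotFileParser

# The Disallow line is from the reply above; the User-agent line is an
# assumption needed to make the file a complete rule group.
ROBOTS_TXT = """\
User-agent: *
Disallow: /Moe/
"""


def is_allowed(url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the rules above permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical URLs: one inside the blocked directory, one outside it.
print(is_allowed("http://123abc456.edu/Moe/index.html"))
print(is_allowed("http://123abc456.edu/about.html"))
```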
-
Well, Bob, it looks like you're right! I guess it will for sure see all the pages in
as the ones to remove and not
Also, how do we keep that robots.txt from being pushed to production when the developer working on that branch completes his work and pushes it live?
I must confess, it still feels a little like bomb disposal.
-
This should be exactly what you need: http://support.google.com/webmasters/bin/answer.py?hl=en&answer=1663427
-
Hi Bob,
Thanks for the suggestion/question. I'm thinking about that, but wouldn't adding a robots "do not crawl" directive to pages that are already indexed be a little like closing the barn door after the horses have left? Do you think it would de-index the already crawled subdomain? Thanks!
-
Assuming that you do not need the development environments indexed in Google, why not simply block all crawlers on those subdomains?
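One way to block crawlers on the development subdomains without risking that the block ships to production is to choose the robots.txt body from the requested host instead of committing a static file. The Python sketch below is only an illustration under assumptions: the www variant of the production host, the helper name robots_txt_for, and the idea that the application can see the Host header are all hypothetical. It also does nothing on its own about pages that are already indexed.

```python
# Hosts that should stay crawlable. The bare host is from the example in
# this thread; the www variant is an assumption.
PRODUCTION_HOSTS = {"123abc456.edu", "www.123abc456.edu"}

ALLOW_ALL = "User-agent: *\nDisallow:\n"    # empty Disallow permits everything
BLOCK_ALL = "User-agent: *\nDisallow: /\n"  # blocks every path for all crawlers


def robots_txt_for(host: str) -> str:
    """Pick the robots.txt body from the request's Host header value."""
    hostname = host.split(":")[0].strip().lower()  # drop any port, normalize case
    return ALLOW_ALL if hostname in PRODUCTION_HOSTS else BLOCK_ALL


print(robots_txt_for("Moe.123abc456.edu"))  # developer copy: block everything
print(robots_txt_for("123abc456.edu"))      # production: allow everything
```

Since the decision depends on the host and not on the branch, the same code can be pushed to production unchanged.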