Duplicate site (disaster recovery) being crawled and creating two indexed search results
-
I have a primary domain, toptable.co.uk, and a disaster recovery site for this primary domain named uk-www.gtm.opentable.com. In the event of a disaster, toptable.co.uk would get CNAMEd (DNS alias) to the .gtm site. Naturally the .gtm disaster recover domian is an exact match to the toptable.co.uk domain.
Unfortunately, Google has crawled the uk-www.gtm.opentable site, and it's showing up in search results. In most cases the gtm urls don't get redirected to toptable they actually appear as an entirely separate domain to the user. The strong feeling is that this duplicate content is hurting toptable.co.uk, especially as .gtm.ot is part of the .opentable.com domain which has significant authority. So we need a way of stopping Google from crawling gtm.
There seem to be two potential fixes. Which is best for this case?
- use the robots.txt to block Google from crawling the .gtm site
2) canonicalize the the gtm urls to toptable.co.uk
In general Google seems to recommend a canonical change but in this special case it seems robot.txt change could be best.
Thanks in advance to the SEOmoz community!
-
It's a little tricky. While Andrea is right about Robots.txt - it's not great for removal once pages/domains are indexed, you can block the sub-domain with robots.txt and then request removal in Google Webmaster Tools (you need to create a separate account for the sub-domain itself). That's often the fastest way to remove something from the index, and if it has no search value, I might go that route. Just proceed with caution - it's a delicate procedure.
Doing 1-to-1 canonicalization or adding 301 redirects may be the next strongest signal (NOINDEX is a bit weaker, IMO). However, Google will have to re-crawl the sub-domain to do that, so you'll need to keep the paths open.
-
First, if the pages are already indexed then a robots.txt won't make them go away. A meta tag no index on the pages is the better solution. This allows search engines to "read" you page, see the no index tag and then work to remove the pages from index. A robots.txt doesn't necessarily accomplish the same result.
-
If you can do a 1-to-1 page canonicalization (each page on .co.uk is canonicaled to the equivalent page on the .com) then I would do that.
Otherwise, I would noindex the backup site.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Creating two websites from one and building up traffic to the new domain quickly
A client has an existing successful website that sells niche products - they are well known in their marketplace. They have two sets of key customers, let's call them (a) and (b), that need addressing in different ways to maximise sales. (a) is the more specialist end of the market, where people have complex needs - there are fewer of them but repeat business is likely, and we can talk to them in more technical language. (b) is the layman's end of the market - there is a vast pool of potential customers but they'll be more casual buyers and need to be addressed more in layman's terms. So what they want to do is to take their existing website, and essentially split it into two different websites, one for each market. The one that will use the existing domain, with all the links that have built up over the years pointing to it, will be the site for the more specialist end of the market (a). The domain name suits it better, which is why he wants to use the existing domain with that site and not the other. (b) will be a brand new domain. The client will write new product descriptions across the board so that the two sets of product information are not duplicate. I'd rather he didn't do this at all, because of the risk involved, and the difficulty of building up the traffic to the new site, which is after all the one with the best chance of mass market sales. But given that the client has decided that this is definitely what he wants, does anyone have any thoughts on what the action plan should be?
Intermediate & Advanced SEO | | helga730 -
How to Get Rid of Dates Shown In Google Search Results
When I enter "Site: URL" to check what a search how Google displays search result, a date appears at the very front. This takes away several characters, really valuable real estate. How can I stop Google from displaying these dates? There are certain Wordpress plugins like "WP Date Remover" however the seem to only apply to blog posts. Dates are appearing on results on all my Wordpress pages. Is there an internal setting in Wordpress that will allow me to remove dates for these non blogpost pages?
Intermediate & Advanced SEO | | Kingalan11 -
In Google Search Results ....Is it a site link or what? How to get this?
Hello Experts, When I search in google any keyword like abcd in search results for one website after meta description there are showing few links of website ( image attached ) Can you please let me know what is this & how to achieve such type of links? Thanks! mdJBLYb
Intermediate & Advanced SEO | | wright3350 -
Indexation of internal search results from infinite scroll
Hello, I have an issue where we will have a website set up with dynamic (AJAX) result pages based on the selection of certain filters chosen by the user. The result page will have 12 results shown and if the user scrolls down, the page will lazy load (infinite scroll) additional results. So for example, with these filters: Filter A: Size Filter B: Color Filter 😄 Location We could potentially have a page for "Large, Blue, New York" results dynamically generated. My issue is that I want Google to potentially crawl and index all these variations, so that I can have a page that ranks for "Large Blue New York", another page that ranks for "Small Orange Miami" etc. However, I do not need all the products indexed--- just the page with the first set of dynamic results would be enough since the additional products would just be more of the same. In other words, I am trying to get these pages with filters applied indexed and not necessarily get every possible product indexed. Can anyone comment on the best way to Get Google to index all dynamic variations? The proper way of paginating pages? Thank you
Intermediate & Advanced SEO | | Digi12340 -
Google Indexing Duplicate URLs : Ignoring Robots & Canonical Tags
Hi Moz Community, We have the following robots command that should prevent URLs with tracking parameters being indexed. Disallow: /*? We have noticed google has started indexing pages that are using tracking parameters. Example below. http://www.oakfurnitureland.co.uk/furniture/original-rustic-solid-oak-4-drawer-storage-coffee-table/1149.html http://www.oakfurnitureland.co.uk/furniture/original-rustic-solid-oak-4-drawer-storage-coffee-table/1149.html?ec=affee77a60fe4867 These pages are identified as duplicate content yet have the correct canonical tags: https://www.google.co.uk/search?num=100&site=&source=hp&q=site%3Ahttp%3A%2F%2Fwww.oakfurnitureland.co.uk%2Ffurniture%2Foriginal-rustic-solid-oak-4-drawer-storage-coffee-table%2F1149.html&oq=site%3Ahttp%3A%2F%2Fwww.oakfurnitureland.co.uk%2Ffurniture%2Foriginal-rustic-solid-oak-4-drawer-storage-coffee-table%2F1149.html&gs_l=hp.3..0i10j0l9.4201.5461.0.5879.8.8.0.0.0.0.82.376.7.7.0....0...1c.1.58.hp..3.5.268.0.JTW91YEkjh4 With various affiliate feeds available for our site, we effectively have duplicate versions of every page due to the tracking query that Google seems to be willing to index, ignoring both robots rules & canonical tags. Can anyone shed any light onto the situation?
Intermediate & Advanced SEO | | JBGlobalSEO0 -
Recommended e-commerce site search for Magento?
Does anyone have recommendations for any particular site searches for large e-commerce sites based on Magento? Some (hopeful) requirements: Possibility to segment product pages and blog content on results page Doesn't cause any major SEO or technical issues Understands semantic search Ability to filter results Ability to sort (e.g. by price, popularity, new in stock) It'd be really useful to see examples and know if there are any particular issues to be aware of. Thanks. 🙂
Intermediate & Advanced SEO | | Alex-Harford0 -
Why is Google Displaying this image in the search results?
Hi i'm looking at advice on how to remove or change a particular image Google is displaying in the search results. I have attached a screenshot. From the first look of it, i assumed the image would be related and be on the dealers Google+ Local Page: https://plus.google.com/118099386834104087122/about?hl=en But there are no photos. The image seems to be coming from the website. Is there a way to stop Google from displaying this image or making them display a totally different image. Thanks, Chris XzfsnUy.png
Intermediate & Advanced SEO | | Mattcarter080 -
Is there a development solution for AJAX-based sites and indexing in Bing/Yahoo?
Hi. I have outlined a solution for an AJAX-based site in order to rank preserve indexing and rank in Google using the hashbang. I'm curious if anyone has some insight for doing the same for Bing/Yahoo! (a development question)
Intermediate & Advanced SEO | | OveritMedia0