Can you be penalized by a development server with duplicate content?
-
I developed a site for another company late last year and after a few months of seo done by them they were getting good rankings for hundreds of keywords. When penguin hit they seemed to benefit and had many top 3 rankings.
Then their rankings dropped one day early May. Site is still indexed and they still rank for their domain. After some digging they found the development server had a copy of the site (not 100% duplicate). We neglected to hide the site from the crawlers, although there were no links built and we hadn't done any optimization like meta descriptions etc.
The company was justifiably upset. We contacted Google and let them know the site should not have been indexed, and asked they reconsider any penalties that may have been placed on the original site. We have not heard back from them as yet.
I am wondering if this really was the cause of the penalty though. Here are a few more facts:
Rankings built during late March / April on an aged domain with a site that went live in December.
Between April 14-16 they lost about 250 links, mostly from one domain. They acquired those links about a month before.
They went from 0 to 1130 links between Dec and April, then back to around 870 currently
According to ahrefs.com they went from 5 ranked keywords in March to 200 in April to 800 in May, now down to 500 and dropping (I believe their data lags by at least a couple of weeks).
So the bottom line is this site appeared to have suddenly ranked well for about a month then got hit with a penalty and are not in top 10 pages for most keywords anymore.
I would love to hear any opinions on whether a duplicate site that had no links could be the cause of this penalty? I have read there is no such thing as a duplicate content penalty per se. I am of the (amateur) opinion that it may have had more to do with the quick sudden rise in the rankings triggering something.
Thanks in advance.
-
What kind of links they lost, what was that domain? If it was like 250 links form one domain for one month, Google could think that they were paid and that could get you penalty. Buying links is a risky business these days.
-
I have experience of this. And it wasn't a nice!
I created a test copy of a site (WordPress) that I work on with a friend. It had been ranking pretty well mainly though lots of quality curated content, plus a bit of low level link building. The link building had slowed in late 2010.
Within 12 hours of the test version of the site going 'live' (it was set to no-index in WP options, which I no longer trust) the live site rankings and traffic tanked. The test version was on a sub-domain, and was an exact replica of the live site. With no known links, it was somehow picked up by Google and all 400 or so pages where in the Gindex along with the live site. Three re-consideration requests and 6 months later, we got back to where we were. The offending sub domain was 301'd to the live site within minutes of inding the problem, and during the 6 month bad period all other causes were ruled out.
I now password protect any staging sites that are on the internet, just to be safe!
-
I would not worry at all, there is no duplicate copntent penalty for this sort of thing, al that will happen is one site will rank one will not. The original site with the links will obviously be se as the site to rank, block off the deve site anyhow if you are worried. but this seems like a deeper problem that a bit of duplicate content
-
Yes. It should always be practice to noindex any vhost on the development and staging servers.
Not only will duplicate content harm them, but in one personal case of mine, the staging server was outranking the client for their own keywords! Obviously Google was confused and didn't know which page to show in SERPs. In turn this confuses visitors and leads to some angry customers.
Lastly, having open access to your staging server is a security risk for a number of reasons. It's not so serious that you need to require a login, but you should definitely keep staging sites out of SERPs to prevent others from getting easy access to them.
For comparison, the example I gave where the staging server outranked the client, the client had a great SEO campaign and the staging server had several insignificant links by accident. So the link building contest doesn't always apply in this case.
-
While I have no experience with this specifically with regards to SEO and ranking, I do have a development server. If you don't mind me asking, why is your development server public? Usually they should be behind some kind of password and not accessible by search spiders.
If you are worried that that is the problem, just make the entire site noindex and that should get it out of google eventually. It may take some time however.
Good luck.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Duplicate content across different domains in different countries?
Hi Guys, We have a 4 sites One in NZ, UK, Canada and Australia. All geo-targeting their respective countries in Google Search Console. The sites are identical. We recently added the same content to all 4 sites. Will this cause duplicate content issues or any issues even though they are in different countries and geo-targeting is set? Cheers.
Intermediate & Advanced SEO | | wickstar0 -
Duplicate Page Content Issues Reported in Moz Crawl Report
Hi all, We have a lot of 'Duplicate Page Content' issues being reported on the Moz Crawl Report and I am trying to 'get to the bottom' of why they are deemed as errors... This page; http://www.bolsovercruiseclub.com/about-us/job-opportunities/ has (admittedly) very little content and is duplicated with; http://www.bolsovercruiseclub.com/cruise-deals/cruise-line-deals/explorer-of-the-seas-2015/ This page is basically an image and has just a couple of lines of static content. Also duplicated with; http://www.bolsovercruiseclub.com/cruise-lines/costa-cruises/costa-voyager/ This page relates to a single cruise ship and again has minimal content... Also duplicated with; http://www.bolsovercruiseclub.com/faq/packing/ This is an FAQ page again with only a few lines of content... Also duplicated with; http://www.bolsovercruiseclub.com/cruise-deals/cruise-line-deals/exclusive-canada-&-alaska-cruisetour/ Another page that just features an image and NO content... Also duplicated with; http://www.bolsovercruiseclub.com/cruise-deals/cruise-line-deals/free-upgrades-on-cunard-2014-&-2015/?page_number=6 A cruise deals page that has a little bit of static content and a lot of dynamic content (which I suspect isn't crawled) So my question is, is the duplicate content issued caused by the fact that each page has 'thin' or no content? If that is the case then I assume the simple fix is to increase add \ increase the content? I realise that I may have answered my own question but my brain is 'pickled' at the moment and so I guess I am just seeking assurances! 🙂 Thanks Andy
Intermediate & Advanced SEO | | TomKing0 -
Robots.txt & Duplicate Content
In reviewing my crawl results I have 5666 pages of duplicate content. I believe this is because many of the indexed pages are just different ways to get to the same content. There is one primary culprit. It's a series of URL's related to CatalogSearch - for example; http://www.careerbags.com/catalogsearch/result/index/?q=Mobile I have 10074 of those links indexed according to my MOZ crawl. Of those 5349 are tagged as duplicate content. Another 4725 are not. Here are some additional sample links: http://www.careerbags.com/catalogsearch/result/index/?dir=desc&order=relevance&p=2&q=Amy
Intermediate & Advanced SEO | | Careerbags
http://www.careerbags.com/catalogsearch/result/index/?color=28&q=bellemonde
http://www.careerbags.com/catalogsearch/result/index/?cat=9&color=241&dir=asc&order=relevance&q=baggallini All of these links are just different ways of searching through our product catalog. My question is should we disallow - catalogsearch via the robots file? Are these links doing more harm than good?0 -
How do I best handle Duplicate Content on an IIS site using 301 redirects?
The crawl report for a site indicates the existence of both www and non-www content, which I am aware is duplicate. However, only the www pages are indexed**, which is throwing me off. There are not any 'no-index' tags on the non-www pages and nothing in robots.txt and I can't find a sitemap. I believe a 301 redirect from the non-www pages is what is in order. Is this accurate? I believe the site is built using asp.net on IIS as the pages end in .asp. (not very familiar to me) There are multiple versions of the homepage, including 'index.html' and 'default.asp.' Meta refresh tags are being used to point to 'default.asp'. What has been done: 1. I set the preferred domain to 'www' in Google's Webmaster Tools, as most links already point to www. 2. The Wordpress blog which sits in a /blog subdirectory has been set with rel="canonical" to point to the www version. What I have asked the programmer to do: 1. Add 301 redirects from the non-www pages to the www pages. 2. Set all versions of the homepage to redirect to www.site.org using 301 redirects as opposed to meta refresh tags. Have all bases been covered correctly? One more concern: I notice the canonical tags in the source code of the blog use a trailing slash - will this create a problem of inconsistency? (And why is rel="canonical" the standard for Wordpress SEO plugins while 301 redirects are preferred for SEO?) Thanks a million! **To clarify regarding the indexation of non-www pages: A search for 'site:site.org -inurl:www' returns only 7 pages without www which are all blog pages without content (Code 200, not 404 - maybe deleted or moved - which is perhaps another 301 redirect issue).
Intermediate & Advanced SEO | | kimmiedawn0 -
Why are these pages considered duplicate content?
I have a duplicate content warning in our PRO account (well several really) but I can't figure out WHY these pages are considered duplicate content. They have different H1 headers, different sidebar links, and while a couple are relatively scant as far as content (so I might believe those could be seen as duplicate), the others seem to have a substantial amount of content that is different. It is a little perplexing. Can anyone help me figure this out? Here are some of the pages that are showing as duplicate: http://www.downpour.com/catalogsearch/advanced/byNarrator/narrator/Seth+Green/?bioid=5554 http://www.downpour.com/catalogsearch/advanced/byAuthor/author/Solomon+Northup/?bioid=11758 http://www.downpour.com/catalogsearch/advanced/byNarrator/?mediatype=audio+books&bioid=3665 http://www.downpour.com/catalogsearch/advanced/byAuthor/author/Marcus+Rediker/?bioid=10145 http://www.downpour.com/catalogsearch/advanced/byNarrator/narrator/Robin+Miles/?bioid=2075
Intermediate & Advanced SEO | | DownPour0 -
Duplicate content resulting from js redirect?
I recently created a cname (e.g. m.client-site .com) and added some js (supplied by mobile site vendor to the head which is designed to detect if the user agent is a mobi device or not. This is part of the js: var CurrentUrl = location.href var noredirect = document.location.search; if (noredirect.indexOf("no_redirect=true") < 0){ if ((navigator.userAgent.match(/(iPhone|iPod|BlackBerry|Android.*Mobile|webOS|Window Now... Webmaster Tools is indicating 2 url versions for each page on the site - for example: 1.) /content-page.html 2.) /content-page.html?no_redirect=true and resulting in duplicate page titles and meta descriptions. I am not quite adept enough at either js or htaccess to really grasp what's going on here... so an explanation of why this is occurring and how to deal with it would be appreciated!
Intermediate & Advanced SEO | | SCW0 -
Removing Duplicate Content Issues in an Ecommerce Store
Hi All OK i have an ecommerce store and there is a load of duplicate content which is pretty much the norm with ecommerce store setups e.g. this is my problem http://www.mystoreexample.com/product1.html
Intermediate & Advanced SEO | | ChriSEOcouk
http://www.mystoreexample.com/brandname/product1.html
http://www.mystoreexample.com/appliancetype/product1.html
http://www.mystoreexample.com/brandname/appliancetype/product1.html
http://www.mystoreexample.com/appliancetype/brandname/product1.html so all the above lead to the same product
I also want to keep the breadcrumb path to the product Here's my plan Add a canonical URL to the product page
e.g. http://www.mystoreexample.com/product1.html
This way i have a short product URL Noindex all duplicate pages but do follow the internal links so the pages are spidered What are the other options available and recommended? Does that make sense?
Is this what most people are doing to remove duplicate content pages? thanks 🙂0 -
Should I robots block site directories with primarily duplicate content?
Our site, CareerBliss.com, primarily offers unique content in the form of company reviews and exclusive salary information. As a means of driving revenue, we also have a lot of job listings in ouir /jobs/ directory, as well as educational resources (/career-tools/education/) in our. The bulk of this information are feeds, which exist on other websites (duplicate). Does it make sense to go ahead and robots block these portions of our site? My thinking is in doing so, it will help reallocate our site authority helping the /salary/ and /company-reviews/ pages rank higher, and this is where most of the people are finding our site via search anyways. ie. http://www.careerbliss.com/jobs/cisco-systems-jobs-812156/ http://www.careerbliss.com/jobs/jobs-near-you/?l=irvine%2c+ca&landing=true http://www.careerbliss.com/career-tools/education/education-teaching-category-5/
Intermediate & Advanced SEO | | CareerBliss0