Development site crawled
-
We just found out our password protected development site has been crawled. We are worried about duplicate content - what are the best steps to take to correct this beyond adding to robots.txt?
-
Unfortunately, robots.txt won't prevent your site from being crawled and indexed if there is a link from an external site pointing to yours. What you need to do is use
on all your development pages. I don't know how big your site is, so this may or may not be a lot of work. Do this, then after the next Google crawl, your pages will be dropped from the SERPs.
-
Thanks Stephen & Kyle! We had the site behind a login, so we're not sure how this happened. Any idea?
-
Put the site behind a login
-
Oops! That sounds unfortunate, Marcy. How did that happen?
Once you have added the correct rules to the robots.txt - I'm guessing you're using "Disallow: /" - you can request, if your development site is registered in Google Webmaster Tools, that Google remove the site from its index.
www.google.com/webmasters/tools/url-removal
Hope that helps,
K
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Google crawling 200 page site thousands of times/day. Why?
Hello all, I'm looking at something a bit wonky for one of the websites I manage. It's similar enough to other websites I manage (built on a template) that I'm surprised to see this issue occurring. The xml sitemap submitted shows Google there are 229 pages on the site. Starting in the beginning of December Google really ramped up their intensity in crawling the site. At its high point Google crawled 13,359 pages in a single day. I mentioned I manage other similar sites - this is a very unusual spike. There are no resources like infinite scroll that auto generates content and would cause Google some grief. So follow up questions to my "why?" is "how is this affecting my SEO efforts?" and "what do I do about it?". I've never encountered this before, but I think limiting my crawl budget would be treating the symptom instead of finding the cure. Any advice is appreciated. Thanks! *edited for grammar.
Intermediate & Advanced SEO | | brettmandoes0 -
ScreamingFrog won't crawl my site.
Hey guys, My site is Netspiren.dk and when I use a tool like Screaming Frog or Integrity, it only crawls my homepage and menu's - not product-pages. Examples
Intermediate & Advanced SEO | | FrederikTrovatten22
A menu: http://www.netspiren.dk/pl/Helse-Kosttilskud-Blandingsolie_57699.aspx
A product: http://www.netspiren.dk/pi/All-Omega-3-6-9-180-kapsler_1412956_57699.aspx Is it because the products are being loaded in Javascript?
What's your recommendation? All best,
Fred.0 -
Site redesign..have I done everything?
Hello, We have a site that was recently put through the redesign process-a couple of weeks ago. It was a tired site that was optimized well, but still struggled because it was so outdated. I went ahead and re-optimized, submitted a new sitemap, and did the fetch. Have I missed a step? Could someone offer insight into what they do when a site is redesigned and the steps taken to make sure that Google crawls and "appreciates" 🙂 the new site as soon as possible? Thanks in advance for any and all help!
Intermediate & Advanced SEO | | lfrazer0 -
Site wide links Concept
Hi All, All type of site wide links are bad for Google or it depends upon other factors as well? For example if you talk about GoDaddy or any other service provider company they put their links on the footer of other websites so in this condition, Google will harm their rankings or not? Also elaborate the best practices for site wide links.
Intermediate & Advanced SEO | | RuchiPardal0 -
Temporary Duplicate Sites - Do anything?
Hi Mozzers - We are about to move one of our sites to Joomla. This is one of our main sites and it receives about 40 million visits a month, so the dev team is a little concerned about how the new site will handle the load. Dev's solution, since we control about 2/3 of that traffic through our own internal email and cross promotions, is to launch the new site and not take down the old site. They would leave the old site on its current URL and make the new site something like new.sub.site.com. Traffic we control would continue to the old site, traffic that we detect as new would be re-directed to the new site. Over time (the think about 3-4 months) they would shift the traffic all to the new site, then eventually change the URL of the new site to be the URL of the old site and be done. So this seems to be at the outset a duplicate content (whole site) issue to start with. I think the best course of action is try to preserve all SEO value on the old URL since the new URL will eventually go away and become the old URL. I could consider on the new site no-crawl/no-index tags temporarily while both sites exist, but would that be risky since that site will eventually need to take those tags off and become the only site? Rel=canonical temporarily from the new site to the old site also seems like it might not be the best answer. Any thoughts?
Intermediate & Advanced SEO | | Kenn_Gold0 -
Unable to Crawl my Website
Hi all, I have a website that I am trying to promote, but tried to add it here in SEOMoz and got the following message: We have detected that the root domain evolving-networks.co.uk does not respond to web requests. Using this domain, we will be unable to crawl your site or present accurate SERP information. Does anyone know why this website cannot be crawled? Please help. Thank you in advance!
Intermediate & Advanced SEO | | LSDigital0 -
Has my site been penalized?
Our site was listed on the first page for the phrase Active SEO on Google.co.uk. We suddenly find ourselves on page 4 overnight and we're not sure what's going on. We have not undertaken an Black hat techniques however the site is fairly new. Anyone have any ideas as to what is going on?
Intermediate & Advanced SEO | | MassivePrime0 -
Crawl questions
My first website crawl indicating many issues. I corrected the issues, requested another crawl and received the results. After viewing the excel file I have some questions. 1. There are many pages with missing Titles and Meta Descriptions in the Excel file. An example is http://www.terapvp.com/threads/help-us-decide-on-terapvp-com-logo.25/page-2 That page clearly has a meta description and title. It is a forum thread. My forum software does a solid job of always providing those tags. Why would my crawl report not show this information? This occurs on numerous pages. 2. I believe all my canonical URLs are properly set. My crawl report has 3k+ records, largely due to there being 10 records for many pages. These extra records are various sort orders and style differences for the same page i.e. ?direction=asc. My need for a crawl report is to provide actionable data so I can easily make SEO improvements to my site where necessary. These extra records don't provide any benefit. IF the crawl report determined there was not a clear canonical URL, then I could understand. But that is not the case. An example is http://www.terapvp.com/forums/news/ If you look at the source you will clearly see Where is the benefit to including the 10 other records in the Crawl report which show this same page in various sort orders? Am I missing anything? 3. My robots.txt appropriately blocks many pages that I do not wish to be crawled. What is the benefit to including these many pages in the crawl report? Perhaps I am over analyzing this report. I have read many articles on SEO, but now that I have found SEOmoz, I can see I will need to "unlearn what I have learned". Many things such as setting meta keyword tags are clearly not helpful. I wish to focus my energy and I was looking to the crawl report as my starting point. Either I am missing something, or the report design needs improvement.
Intermediate & Advanced SEO | | RyanKent0