Robots
-
I have just noticed this in my code
name="robots" content="noindex">
And have noticed some of my keywords have dropped, could this be the reason?
-
It was everypage on the site.
I also noticed the pages that are not indexed no longer, they have no PR, is that expected?
-
Was the homepage one of the pages that included the noindex meta tag?
Even if it was, pages will not all be crawled at the same time or in a particular order. The homepage may have already been crawled before the change was made on your site, your homepage may not have even be crawled at all today if it was visited yesterday for example.
Crawling results can vary hugely based on a number of factors.
-
The only thing that does not make sense to me is if the sitemap was processes today, why is the homepage still indexed?
-
Yes because that is what caused them to take notice of the meta noindex and drop your pages from their search results.
Best of luck with it, feel free to send me a PM if your pages haven't reappeared in Google's search engine over the next few days.
-
Oh! I also noticed that in Webmaster tools that the sitemap was processed today, does that mean Googlebot has visited the website today?
-
Thanks Geoff, will do what you recommended.
I noticed in Google webmaster tools this:
Blocked URLs - 193
Downloaded - 13 hours ago
Status - 200 (success)
-
Hi Gary,
If the pages dropped from Google's index that quickly, then chances are, they will be back again almost as quick. If your website has an XML sitemap, you could try pinging this to the search engines to alert them to revisit your site as soon as possible again.
It's bad luck that the meta tag was inserted and this caused immediate negative effects, but it will be recoverable, and likely your pages should re-enter the index at the same positions as they were prior to today.
The key is to just bring Google's bot back to your website as soon as possible to recrawl, publishing a blog post could do this, creating a backlink from a high traffic site (a forum is a good example for this) are some methods of encouraging this.
Hope that helps.
-
Hi Geoff,
The developer had said it got added this morning when we rolled out a discount feature on our website, I think it was the CMS adding it automatically, however now a lot of the keywords that were ranking top 3 are no longer indexed, is it just bad luck? will Google come back?
-
If you are using a content management system, these additional meta tags can often be controlled within your administration panel.
If the meta tag is hard coded into your website header, this will appearing on every page of your website and will subsequently result in you not having any pages indexed in search engines.
As Ben points out, the noindex directive instructs search engine robots not to index that particular page. It would recommended to address this issue as quickly as possible, especially if you have a high traffic website that is getting crawled frequently.
-
Thanks for your quick reply Ben.
It does not seem to be all my pages that have fallen off, just some, the developer said that it only got added this morning by mistake.
I actually typed in the full URL into Google and it does not appear anymore, I was ranked no.2 for that particular keyword, receiving about 150 click per day, not happy!
-
Actually on second thoughts - YES. Yes it probably is the reason your terms are dropping.
-
Could be.
That's a directive that tells search engines no to include that page in their indexes.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Robots.txt wildcards - the devs had a disagreement - which is correct?
Hi – the lead website developer was assuming that this wildcard: Disallow: /shirts/?* would block URLs including a ? within this directory, and all the subdirectories of this directory that included a “?” The second developer suggested that this wildcard would only block URLs featuring a ? that come immediately after /shirts/ - for example: /shirts?minprice=10&maxprice=20 BUT argued that this robots.txt directive would not block URLS featuring a ? in sub directories - e.g. /shirts/blue?mprice=100&maxp=20 So which of the developers is correct? Beyond that, I assumed that the ? should feature a * on each side of it – for example - /? - to work as intended above? Am I correct in assuming that?
Intermediate & Advanced SEO | | McTaggart0 -
Help with Robots.txt On a Shared Root
Hi, I posted a similar question last week asking about subdomains but a couple of complications have arisen. Two different websites I am looking after share the same root domain which means that they will have to share the same robots.txt. Does anybody have suggestions to separate the two on the same file without complications? It's a tricky one. Thank you in advance.
Intermediate & Advanced SEO | | Whittie0 -
Robots.txt & Duplicate Content
In reviewing my crawl results I have 5666 pages of duplicate content. I believe this is because many of the indexed pages are just different ways to get to the same content. There is one primary culprit. It's a series of URL's related to CatalogSearch - for example; http://www.careerbags.com/catalogsearch/result/index/?q=Mobile I have 10074 of those links indexed according to my MOZ crawl. Of those 5349 are tagged as duplicate content. Another 4725 are not. Here are some additional sample links: http://www.careerbags.com/catalogsearch/result/index/?dir=desc&order=relevance&p=2&q=Amy
Intermediate & Advanced SEO | | Careerbags
http://www.careerbags.com/catalogsearch/result/index/?color=28&q=bellemonde
http://www.careerbags.com/catalogsearch/result/index/?cat=9&color=241&dir=asc&order=relevance&q=baggallini All of these links are just different ways of searching through our product catalog. My question is should we disallow - catalogsearch via the robots file? Are these links doing more harm than good?0 -
Why are these results being showed as blocked by robots.txt?
If you perform this search, you'll see all m. results are blocked by robots.txt: http://goo.gl/PRrlI, but when I reviewed the robots.txt file: http://goo.gl/Hly28, I didn't see anything specifying to block crawlers from these pages. Any ideas why these are showing as blocked?
Intermediate & Advanced SEO | | nicole.healthline0 -
If i disallow unfriendly URL via robots.txt, will its friendly counterpart still be indexed?
Our not-so-lovely CMS loves to render pages regardless of the URL structure, just as long as the page name itself is correct. For example, it will render the following as the same page: example.com/123.html example.com/dumb/123.html example.com/really/dumb/duplicative/URL/123.html To help combat this, we are creating mod rewrites with friendly urls, so all of the above would simply render as example.com/123 I understand robots.txt respects the wildcard (*), so I was considering adding this to our robots.txt: Disallow: */123.html If I move forward, will this block all of the potential permutations of the directories preceding 123.html yet not block our friendly example.com/123? Oh, and yes, we do use the canonical tag religiously - we're just mucking with the robots.txt as an added safety net.
Intermediate & Advanced SEO | | mrwestern0 -
Googlebot Can't Access My Sites After I Repair My Robots File
Hello Mozzers, A colleague and I have been collectively managing about 12 brands for the past several months and we have recently received a number of messages in the sites' webmaster tools instructing us that 'Googlebot was not able to access our site due to some errors with our robots.txt file' My colleague and I, in turn, created new robots.txt files with the intention of preventing the spider from crawling our 'cgi-bin' directory as follows: User-agent: * Disallow: /cgi-bin/ After creating the robots and manually re-submitting it in Webmaster Tools (and receiving the green checkbox), I received the same message about Googlebot not being able to access the site, only difference being that this time it was for a different site that I manage. I repeated the process and everything, aesthetically looked correct, however, I continued receiving these messages for each of the other sites I manage on a daily-basis for roughly a 10-day period. Do any of you know why I may be receiving this error? is it not possible for me to block the Googlebot from crawling the 'cgi-bin'? Any and all advice/insight is very much welcome, I hope I'm being descriptive enough!
Intermediate & Advanced SEO | | NiallSmith1 -
Using 2 wildcards in the robots.txt file
I have a URL string which I don't want to be indexed. it includes the characters _Q1 ni the middle of the string. So in the robots.txt can I use 2 wildcards in the string to take out all of the URLs with that in it? So something like /_Q1. Will that pickup and block every URL with those characters in the string? Also, this is not directly of the root, but in a secondary directory, so .com/.../_Q1. So do I have to format the robots.txt as //_Q1* as it will be in the second folder or just using /_Q1 will pickup everything no matter what folder it is on? Thanks.
Intermediate & Advanced SEO | | seo1234560 -
Does using robots.txt to block pages decrease search traffic?
I know you can use robots.txt to tell search engines not to spend their resources crawling certain pages. So, if you have a section of your website that is good content, but is never updated, and you want the search engines to index new content faster, would it work to block the good, un-changed content with robots.txt? Would this content loose any search traffic if it were blocked by robots.txt? Does anyone have any available case studies?
Intermediate & Advanced SEO | | nicole.healthline0