Indexed, though blocked by robots.txt: Need to bother?
-
Hi,
We intentionally blocked some files on our website that had been indexed for years. Now we receive the message "Indexed, though blocked by robots.txt" in GSC. As far as I know, we can ignore it. Is any action required? We thought of blocking them with meta tags, but these are PDF files.
Thanks
-
Hi there!
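What Google is telling you is that URLs blocked in robots.txt are still in its index. Either you have URLs indexed that you probably don't want indexed, or, the other way around, important pages are blocked from crawling but remain indexed for other reasons (for example, because other pages link to them).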
If I might ask, why did you block those files through robots.txt?
The two most common answers are:
1- You wanted to remove them from search results. If this is your case, you've solved only part of the problem. What you should have done is first leave those URLs crawlable and apply noindex rules (keep in mind that these can be set in the HTTP header, since non-HTML files can't carry a meta robots tag), and then, after sufficient time, block them in robots.txt.
2- You wanted to optimize how Googlebot spends its crawling time. If this is your case, you've done it correctly and there is nothing to worry about.
Hope this helps.
Best of luck.
GR
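For the first case, here is a minimal Python sketch of the idea, not a definitive setup. It assumes PDFs are the only files to de-index and that your server lets you attach response headers per path. The names `extra_headers` and `is_crawlable` are hypothetical helpers made up for illustration. `X-Robots-Tag: noindex` is the header Google reads for non-HTML files, and the robots.txt check uses Python's standard `urllib.robotparser`.

```python
from urllib.robotparser import RobotFileParser

# Assumption: only PDF files need the noindex header.
NOINDEX_EXTENSIONS = (".pdf",)


def extra_headers(path):
    """Return extra response headers for a URL path (without query string)."""
    if path.lower().endswith(NOINDEX_EXTENSIONS):
        # Header equivalent of <meta name="robots" content="noindex">.
        return {"X-Robots-Tag": "noindex"}
    return {}


def is_crawlable(robots_txt, url, agent="Googlebot"):
    """Check whether the given robots.txt text lets the agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    blocked = "User-agent: *\nDisallow: /files/"
    url = "https://example.com/files/brochure.pdf"  # hypothetical URL
    # While this is False, the crawler never requests the file,
    # so it never sees the noindex header.
    print(is_crawlable(blocked, url))
    print(extra_headers("/files/brochure.pdf"))
```

The point of checking both is the ordering: the header only has an effect while the URL is still crawlable, so the robots.txt block should come after the files have dropped out of the index.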