Have you ever seen a page get indexed even though its website is blocked by robots.txt?
-
Hi all,
We use the robots.txt file and meta robots tags to block bots from crawling a website or individual pages. Mostly, robots.txt is used for the whole website, with the expectation that none of its pages get indexed. But there is a catch: Google can still index any page from the website even when the site is blocked by robots.txt, because the crawler may find a link to the page somewhere else on the internet, as stated here in the last paragraph. I wonder whether this really happens and whether anyone has had webpages indexed this way.
And if we use meta tags at page level, do we still need to block the pages in the robots.txt file? Can we use both techniques at the same time?
Thanks
-
Hi vtmoz,
The most reliable way to prevent a page from being indexed is a meta robots tag with the _noindex_ parameter.
Beyond that, robots.txt helps save server resources and prevents Google from crawling any new page that does not have the meta robots tag. And yes, it's very common to have indexed pages even when the robots.txt file blocks the entire website.
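To illustrate why a blanket block does not stop indexing, here is a minimal Python sketch using the standard library's `urllib.robotparser`. The robots.txt content, the `example.com` URL and the helper name `is_crawl_allowed` are assumptions made up for this example. It only models the crawl rule locally; Google's own matching may differ in details such as wildcard handling.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks the whole site for every crawler.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical page URL, used only for illustration.
allowed = is_crawl_allowed(
    ROBOTS_TXT, "Googlebot", "https://www.example.com/some-page"
)
print(allowed)
```

When the result is false, the crawler never downloads the page, so it never reads any meta robots tag on it. A link found elsewhere can still lead to the bare URL being indexed, which is the situation described in the question.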
If what you are looking for is to remove pages from the index, follow these steps:
- Allow the whole website (or at least those specific pages/sections) to be crawlable in robots.txt
- Add the robots meta tag with the "noindex,follow" parameters
- Wait several weeks; 6 to 8 weeks is a fairly good time frame. Or just follow up on those pages
- Once you get the result (all your desired pages are de-indexed), block those pages again with robots.txt
- DO NOT remove the meta robots tag.
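The order of the steps above matters because the noindex tag only works while the page can be crawled. A rough Python sketch of that check, using only the standard library, could look like this. The function name `noindex_is_effective`, the sample robots.txt rules, the sample HTML and the `example.com` URL are all assumptions for illustration, not part of any Google or Moz tool.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class MetaRobotsParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            for part in attr_map.get("content", "").split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def noindex_is_effective(robots_txt, html, url, user_agent="Googlebot"):
    """True only if the crawler may fetch the URL and the page has noindex."""
    robots = RobotFileParser()
    robots.parse(robots_txt.splitlines())
    if not robots.can_fetch(user_agent, url):
        return False  # blocked from crawling, so the meta tag is never read
    meta = MetaRobotsParser()
    meta.feed(html)
    return "noindex" in meta.directives


# Hypothetical inputs for the first two steps: crawling open, tag added.
OPEN_ROBOTS = "User-agent: *\nDisallow:\n"
BLOCKED_ROBOTS = "User-agent: *\nDisallow: /\n"
PAGE = '<html><head><meta name="robots" content="noindex,follow"></head></html>'
URL = "https://www.example.com/old-page"

print(noindex_is_effective(OPEN_ROBOTS, PAGE, URL))
print(noindex_is_effective(BLOCKED_ROBOTS, PAGE, URL))
```

With crawling open the tag can be read and acted on. With the site blocked the same tag is invisible to the crawler, which is why re-blocking in robots.txt comes only after the pages have dropped out of the index.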
Hope it helps.
Best of luck.
GR.