Should I use noindex or robots to remove pages from the Google index?

Tylerj

I have a Magento site and just realized we have about 800 review pages indexed. The /review directory is disallowed in robots.txt but the pages are still indexed.

From my understanding, a robots.txt disallow means Google will not crawl the pages, BUT the pages can still be indexed if they are linked from somewhere else.

I can add the noindex tag to the review pages, but they won't be crawled.

https://www.seroundtable.com/google-do-not-use-noindex-in-robots-txt-20873.html

Should I remove the robots.txt disallow and add the noindex? Or just add the noindex to what I already have?

SwanseaMedicine

Thanks, Logan!

LoganRay

Rhys,

Your web dev team is confused. You cannot de-index pages simply by disallowing them in your robots.txt file. Google will still index anything it finds through a link (that doesn't have a noindex tag). This is why you often see search results whose description reads "A description for this result is not available because of this site's robots.txt".

Here's a quote from Google regarding the subject: "You should not use robots.txt as a means to hide your web pages from Google Search results." - https://support.google.com/webmasters/answer/6062608?hl=en

SwanseaMedicine

Hi all,

Sorry to jump in here, but I've been told the opposite by our web dev team. We're removing indexed 404s at the moment, and our web dev team said we simply need to add the pages to robots.txt and they'll be de-indexed. Is this incorrect? I thought I'd need to add a noindex tag but was argued down...

Cheers,

Rhys

dohertyjf

Hi there. Good question and one that comes up a lot.

You need to do the following:

1. Put the noindex on those pages
2. Remove the block in robots.txt
3. Monitor these pages falling out of the index
4. Once they are all out, then put the block back in place

You want them both to a) drop out of the index and b) then not be crawled, so the above will take care of that for you.
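If it helps to verify the first step above, here is a minimal Python sketch that checks whether a page actually carries a noindex directive, either in a robots meta tag or in an X-Robots-Tag response header. The names `RobotsMetaParser` and `has_noindex` are made up for illustration, and the sample HTML is hypothetical. It only handles the simple comma-separated form of the directive, so treat it as a rough check rather than a full implementation of how Google reads these tags.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> / <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. a bare attribute), so default to "".
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() in ("robots", "googlebot"):
            for part in attr_map.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def has_noindex(html, headers=None):
    """Return True if a meta tag or an X-Robots-Tag header contains noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    if "noindex" in parser.directives:
        return True
    for name, value in (headers or {}).items():
        if name.lower() == "x-robots-tag":
            if "noindex" in [p.strip().lower() for p in value.split(",")]:
                return True
    return False


# Hypothetical review page with the tag in place.
page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(has_noindex(page))                                        # True
print(has_noindex("<html></html>"))                             # False
print(has_noindex("<html></html>", {"X-Robots-Tag": "noindex"}))  # True
```

You would run something like this against the fetched HTML and response headers of a few review pages before moving on to the next step.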

Hope that helps!

John

Tylerj

Thanks.

That is what I figured; I just wanted to double check.

LoganRay

Hi Tyler,

Yes, remove the robots.txt disallow for that section and add a noindex tag. Noindex is the only sure-fire way to de-index URLs, but the crawlers need to be allowed to crawl those pages to see the tag.
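To see why the disallow gets in the way, here is a small sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the example.com URL, and the `can_crawl` helper are all assumptions for illustration, not your actual files. The standard-library parser also only approximates Googlebot's matching rules, so read the result as indicative.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt, url, user_agent="Googlebot"):
    """Return True if robots_txt allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt with the review section still blocked.
blocked = """User-agent: *
Disallow: /review
"""

# Hypothetical robots.txt after the review disallow has been removed.
unblocked = """User-agent: *
Disallow: /checkout
"""

# Hypothetical review URL.
url = "https://www.example.com/review/product/list/id/123/"

print(can_crawl(blocked, url))    # False: the crawler never fetches the page, so it never sees the noindex
print(can_crawl(unblocked, url))  # True: the page can be fetched and the noindex can be read
```

While the first result is False, any noindex tag on that URL is effectively invisible to the crawler, which is why the disallow has to come out first.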
