Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Should I use noindex or robots to remove pages from the Google index?
-
I have a Magento site and just realized we have about 800 review pages indexed. The /review directory is disallowed in robots.txt but the pages are still indexed.
From my understanding robots means it will not crawl the pages BUT if the pages are still indexed if they are linked from somewhere else.
I can add the noindex tag to the review pages but they wont be crawled.
https://www.seroundtable.com/google-do-not-use-noindex-in-robots-txt-20873.html
Should I remove the robots.txt and add the noindex? Or just add the noindex to what I already have?
-
Thanks, Logan!
-
Rhys,
Your web dev team is confused. You cannot de-index by simply disallowing them in your robots.txt file. Google will still index anything they find (that doesn't have a noindex tag) from a link, this is the reason you often see search results that say "A description for this result is not available because of this site's robots.txt" as the description.
Here's a quote from Google regarding the subject: "You should not use robots.txt as a means to hide your web pages from Google Search results." - https://support.google.com/webmasters/answer/6062608?hl=en
-
Hi all,
Sorry to jump in here but I've been told the opposite by our web dev team. We're removing indexed 404s at the moment, and our web dev team said we simply need to add robots.txt to the pages and they'll be de-indexed. If this incorrect? I thought I'd need to add a noindex tag but was argued down...
Cheers,
Rhys
-
Hi there. Good question and one that comes up a lot.
You need to do the following:
- Put the noindex on those pages
- Remove the block in robots.txt
- Monitor these pages falling out of the index
- Once they are all out, then put the block back in place
You both want them to a) drop out and b) then not be crawled, so the above will take care of that for you.
Hope that helps!
John
-
Thanks.
That is what I figured just wanted to double check.
-
Hi Tyler,
Yes, remove the robots.txt disallow for that section and add a noindex tag. Noindex is the only sure-fire way to de-index URLs, but the crawlers need to be allowed to crawl those pages to see the tag.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Should I use the on classified listing pages that have expired?
We have went back and forth on this and wanted to get some outside input. I work for an online listing website that has classified ads on it. These ads are generated by companies on our site advertising weekend events around the country. We have about 10,000 companies that use our service to generate their online ads. This means that we have thousands of pages being created each week. The ads have lots of content: pictures, sale descriptions, and company information. After the ads have expired, and the sale is no longer happening, we are currently placing the in the heads of each page. The content is not relative anymore since the ad has ended. The only value the content offers a searcher is the images (there are millions on expired ads) and the descriptions of the items for sale. We currently are the leader in our industry and control most of the top spots on Google for our keywords. We have been worried about cluttering up the search results with pages of ads that are expired. In our Moz account right now we currently have over 28k crawler warnings alerting us to the being in the page heads of the expired ads. Seeing those warnings have made us nervous and second guessing what we are doing. Does anybody have any thoughts on this? Should we continue with placing the in the heads of the expired ads, or should we be allowing search engines to index the old pages. I have seen websites with discontinued products keeping the products around so that individuals can look up past information. This is the closest thing have seen to our situation. Any help or insight would be greatly appreciated! -Matt
Intermediate & Advanced SEO | | mellison0 -
How to check if the page is indexable for SEs?
Hi, I'm building the extension for Chrome, which should show me the status of the indexability of the page I'm on. So, I need to know all the methods to check if the page has the potential to be crawled and indexed by a Search Engines. I've come up with a few methods: Check the URL in robots.txt file (if it's not disallowed) Check page metas (if there are not noindex meta) Check if page is the same for unregistered users (for those pages only available for registered users of the site) Are there any more methods to check if a particular page is indexable (or not closed for indexation) by Search Engines? Thanks in advance!
Intermediate & Advanced SEO | | boostaman0 -
Will Google View Using Google Translate As Duplicate?
If I have a page in English, which exist on 100 other websites, we have a case where my website has duplicate content. What if I use Google Translate to translate the page from English to Japanese, as the only website doing this translation will my page get credit for producing original content? Or, will Google view my page as duplicate content, because Google can tell it is translated from an original English page, which runs on 100+ different websites, since Google Translate is Google's own software?
Intermediate & Advanced SEO | | khi50 -
Effect of Removing Footer Links In all Pages Except Home Page
Dear MOZ Community: In an effort to improve the user interface of our business website (a New York CIty commercial real estate agency) my designer eliminated a standardized footer containing links to about 20 pages. The new design maintains this footer on the home page, but all other pages (about 600 eliminate the footer). The new design does a very good job eliminating non essential items. Most of the changes remove or reduce the size of unnecessary design elements. The footer removal is the only change really effect the link structure. The new design is not launched yet. Hoping to receive some good advice from the MOZ community before proceeding My concern is that removing these links could have an adverse or unpredictable effect on ranking. Last Summer we launched a completely redesigned version of the site and our ranking collapsed for 3 months. However unlike the previous upgrade this modifications does not URL names, tags, text or any major element. Only major change is the footer removal. Some of the footer pages provide good (not critical) info for visitors. Note the footer will still appear on the home page but will be removed on the interior pages. Are we risking any detrimental ranking effect by removing this footer? Can we compensate by adding text links to these pages if the links from the footer are removed? Seems irregular to have a home page footer but no footer on the other pages. Are we inviting any downgrade, penalty, adverse SEO effect by implementing this? I very much like the new design but do not want to risk a fall in rank and traffic. Thanks for your input!!!
Intermediate & Advanced SEO | | Kingalan1
Alan0 -
301 redirection pointing to noindexed pages
I have rather an unusual situation where a recently launched affiliate site does not have any unique content as its all syndicated content. For that reason we are currently using the noindex,nofollow meta tags to keep the pages out of the search engines index until we create unique content for the pages. The problem is that due to a very tight timeframe with rebranding, we are looking at 301 redirecting (on a page to page basis) another high authority legacy domain to this new site before we have had a chance to add unique content to it and remove the noindex,nofollow tags. I would assume that any link authority normally passed through the 301 would be lost in this scenario but Im uncertain of what the broader impact might be. Has anyone dealt with a similar scenario? I know this scenario is not ideal and I would rather wait until the unique content is up and noindex tags are removed before launching the 301 redirect of the legacy domain but there are a number of competing priorities at play outside of SEO.
Intermediate & Advanced SEO | | LosNomads0 -
Meta NoIndex tag and Robots Disallow
Hi all, I hope you can spend some time to answer my first of a few questions 🙂 We are running a Magento site - layered/faceted navigation nightmare has created thousands of duplicate URLS! Anyway, during my process to tackle the issue, I disallowed in Robots.txt anything in the querystring that was not a p (allowed this for pagination). After checking some pages in Google, I did a site:www.mydomain.com/specificpage.html and a few duplicates came up along with the original with
Intermediate & Advanced SEO | | bjs2010
"There is no information about this page because it is blocked by robots.txt" So I had added in Meta Noindex, follow on all these duplicates also but I guess it wasnt being read because of Robots.txt. So coming to my question. Did robots.txt block access to these pages? If so, were these already in the index and after disallowing it with robots, Googlebot could not read Meta No index? Does Meta Noindex Follow on pages actually help Googlebot decide to remove these pages from index? I thought Robots would stop and prevent indexation? But I've read this:
"Noindex is a funny thing, it actually doesn’t mean “You can’t index this”, it means “You can’t show this in search results”. Robots.txt disallow means “You can’t index this” but it doesn’t mean “You can’t show it in the search results”. I'm a bit confused about how to use these in both preventing duplicate content in the first place and then helping to address dupe content once it's already in the index. Thanks! B0 -
Should I prevent Google from indexing blog tag and category pages?
I am working on a website that has a regularly updated Wordpress blog and am unsure whether or not the category and tag pages should be indexable. The blog posts are often outranked by the tag and category pages and they are ultimately leaving me with a duplicate content issue. With this in mind, I assumed that the best thing to do would be to remove the tag and category pages from the index, but after speaking to someone else about the issue, I am no longer sure. I have tried researching online, but there isn't anything that provided any further information. Please can anyone with any experience of dealing with issues like this or with any knowledge of the topic help me to resolve this annoying issue. Any input will be greatly appreciated. Thanks Paul
Intermediate & Advanced SEO | | PaulRogers0 -
All page files in root? Or to use directories?
We have thousands of pages on our website; news articles, forum topics, download pages... etc - and at present they all reside in the root of the domain /. For example: /aosta-valley-i6816.html
Intermediate & Advanced SEO | | Peter264
/flight-sim-concorde-d1101.html
/what-is-best-addon-t3360.html We are considering moving over to a new URL system where we use directories. For example, the above URLs would be the following: /images/aosta-valley-i6816.html
/downloads/flight-sim-concorde-d1101.html
/forums/what-is-best-addon-t3360.html Would we have any benefit in using directories for SEO purposes? Would our current system perhaps mean too many files in the root / flagging as spammy? Would it be even better to use the following system which removes file endings completely and suggests each page is a directory: /images/aosta-valley/6816/
/downloads/flight-sim-concorde/1101/
/forums/what-is-best-addon/3360/ If so, what would be better: /images/aosta-valley/6816/ or /images/6816/aosta-valley/ Just looking for some clarity to our problem! Thank you for your help guys!0