Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
SEO Best Practices regarding Robots.txt disallow
-
I cannot find hard-and-fast guidance on the following issue:
It looks like the robots.txt file on my server has been set up to disallow "account" and "search" pages within my site (Disallow: /Account/ and Disallow: /?search=), so I am receiving warnings from Google Search Console that URLs are being blocked by robots.txt. Do you recommend unblocking these URLs?
I'm getting a warning that over 18,000 URLs are blocked by robots.txt ("Sitemap contains urls which are blocked by robots.txt"). It seems that I wouldn't want that many URLs blocked?
Thank you!!
-
Hmm, it depends.
It's really hard for me to answer without knowing your site, but I would say that you're heading in the right direction. You want to give Google more ways to reach your quality content.
Now, do you have any other pages that bring bots there via normal user navigation, or is it all search-driven?
While Google can crawl pages that it discovers via internal or external links, it can't reproduce searches by typing into your nav bar, so I doubt those pages are extremely valuable unless you link to them somehow. In that case you may want to keep Google crawling them.
Whether you want them indexed is a different matter: being searches, they are probably aggregating information already present on the site. For indexation purposes you may want to keep them out of the index while still allowing the bot to crawl through them.
Again, beware of crawl budget: you don't want Google wandering around millions of search results instead of your money pages, unless you're able to let it crawl only a subset of them.
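A common way to keep search result pages out of the index while still letting bots crawl them is a noindex directive, sent either as a robots meta tag or as an X-Robots-Tag response header. Note that the page must not be disallowed in robots.txt, otherwise the crawler never fetches it and never sees the directive. Here is a minimal Python sketch, assuming search pages are identified by a `search` query parameter as in the question; the function name and the example.com URLs are hypothetical, and how you attach the header depends on your server or framework.

```python
from urllib.parse import urlsplit, parse_qs


def robots_header_for(url):
    """Return an X-Robots-Tag header tuple for search URLs, else None.

    Assumption: any URL carrying a `search` query parameter is an
    internal search results page (hypothetical rule for illustration).
    """
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    if "search" in query:
        # "noindex" keeps the page out of the index; "follow" still lets
        # the crawler follow the links it finds on the page.
        return ("X-Robots-Tag", "noindex, follow")
    return None


if __name__ == "__main__":
    for url in (
        "https://www.example.com/?search=widgets",
        "https://www.example.com/products/widget",
    ):
        print(url, "->", robots_header_for(url))
```

The same effect can be had with a `<meta name="robots" content="noindex, follow">` tag in the page head if setting headers is not convenient.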
I hope this made sense
-
Thank you for your response! I'm going to do a bit more research, but I think I will keep "account" disallowed and unblock "search". The search feature on my site pulls up quality content, so it seems like I would want that to be crawled. Does this sound logical to you?
-
That could be completely normal. Google sends a warning because you're giving conflicting directions: you are preventing it from crawling pages (via robots.txt) that you asked it to index (via the sitemap).
Google does not know how important those pages may be for you, so you are the one who needs to assess what to do next.
Are those pages important to you? Do you want them to be in the index? If so, change your robots.txt rule; if not, remove them from the sitemap.
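To see exactly which sitemap entries conflict with robots.txt, you can run both files through a short script. This is a minimal Python sketch using the standard library's `urllib.robotparser`; the two Disallow rules come from the question, while the example.com URLs and the inline sitemap are made up for illustration. The parser does simple prefix matching, which may differ from Googlebot's behaviour in edge cases (wildcards, for instance), so treat the output as a first pass rather than a definitive answer.

```python
import xml.etree.ElementTree as ET
from urllib.robotparser import RobotFileParser

# Rules quoted in the question; the User-agent line is an assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /Account/
Disallow: /?search=
"""

# Hypothetical sitemap content (example.com is a placeholder domain).
SITEMAP_XML = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/products/widget</loc></url>
  <url><loc>https://www.example.com/Account/login</loc></url>
  <url><loc>https://www.example.com/?search=widgets</loc></url>
</urlset>
"""

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def blocked_sitemap_urls(robots_txt, sitemap_xml, user_agent="Googlebot"):
    """Return the sitemap URLs that robots.txt disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    root = ET.fromstring(sitemap_xml)
    urls = [loc.text.strip() for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)]

    # Keep only the URLs the parser says this user agent may not fetch.
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


if __name__ == "__main__":
    for url in blocked_sitemap_urls(ROBOTS_TXT, SITEMAP_XML):
        print("In sitemap but blocked by robots.txt:", url)
```

Each URL it reports needs one of the two fixes above: loosen the robots.txt rule, or drop the URL from the sitemap.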
About the previous answer: robots.txt is not used to block hackers; quite the opposite. Hackers can easily read robots.txt to find the pages you'd like to block and then visit them, as they may be key pages (e.g. wp-admin). But let's not focus on that, since hackers have so many ways to find core pages that it's beside the point. Robots.txt is normally used to avoid duplication issues and to keep Google from crawling low-value pages and wasting crawl budget.
-
Typically, you only want robots.txt to block access points that would allow hackers into your site like an admin page (e.g. www.examplesite.com/admin/). You definitely don't want it blocking your whole site. A developer or webmaster would be better at speaking to the specifics, but that's the quick, high-level answer.