Robots.txt - Googlebot - Allow... what's it for?
-
Hello - I just came across this in robots.txt for the first time, and was wondering why it is used? Why would you have to proactively tell Googlebot to crawl JS/CSS and why would you want it to? Any help would be much appreciated - thanks, Luke
User-Agent: Googlebot
Allow: /.js
Allow: /.css
-
Thanks Tom - that's very useful - appreciated - and thanks also Clever PhD re: the robots.txt tester info - Luke
-
Just as a follow-up to Tom's great post. If you were wanting to test a robots.txt setup, especially if you were using a wildcard or using an allow combined with a disallow, Google Search Console under the Crawl section has a robots.txt Tester. You will see your most recent robots.txt file there that Google has a copy of. You can then modify that version and then enter a URL at the bottom to see if everything is set correctly or not. It is pretty handy, especially if you have a big robots.txt file. Note that this tool does not change how Google crawls your site or your robots.txt file, it is just for testing. Once you find the configuration that works, you would still need to update the robots.txt on your server.
-
Hi Luke
As you have correctly assumed, that particular robots command would be pointless.
The Googlebot does follow allow commands (while other ones do not), but it should only be used if it is an exception to a disallow rule.
So, for example, if you had a rule that blocked pages within a sub-directory, with:
Disallow: /example/*
You could create an allow rule that indexes a specific page within that directory to be indexed, like:
Allow: /example/page.html
Couple of things to point out here. "At a group-member level, in particular for allow and disallow directives, the most specific rule based on the length of the [path] entry will trump the less specific (shorter) rule." (Google Source). In this example, because the more specific rule is the allow rule, that will prevail. It is also best practice to put your "allow" rules at the top of the robots.txt file.
But in your example, if they have allow rules for JS and CSS files without having disavow rules for those directories/paths etc - it's a waste of space. Google will attempt to crawl anything it can by default - unless you disavow access.
TL;DR - You don't need to proactively tell Google to crawl CSS and JS - it will by default.
Hope this helps.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Need Magento SEO expert for 301 clean up - any reco's?
My site is a total mess from a clean “crawling” perspective. We are still getting traffic and doing business, but I am afraid from an SEO perspective we are driving with the parking brake on. There a lot of 301’s and some of them are causing 404 errors. Below is an overview of my 5 year old magento site which was moved from a 5 year old xcart site (so there is a lot of old junk (url’s) in there). It needs cleaning up and I need a plan and seo / 301 help. Overview: Recently moved from http to https - not sure best practices were followed, but we had lots of crawl issues before this move. Analytics Top 100 Landing Pages = 82.7% of entrances Webmaster Tools 594 Pages Indexed 65 Not found errors - most involve 301’s - examples below Sitemap: 773 Submitted, 395 Indexed URL Parameters - 41 - I can’t tell if they are doing anything (helping or hurting) Moz Crawl Total Pages 3,454 324 Redirect Issues (258 Temp and 66 Chain) Magento 11,773 Redirects 5390 System 6383 Custom On July 15, 2017 I deleted 40 redirects from htaccess that a developer had put there that were causing problems. Blog We have a wordpress blog installed on Magento site. Years ago it was moved from a subdomain to a subdirectory.
Intermediate & Advanced SEO | | SammyT0 -
Pagination and View All Pages Question. We currently don't have a canonical tag pointing to View all as I don't believe it's a good user experience so how best we deal with this.
Hello All, I have an eCommerce site and have implemented the use rel="prev" and rel="next" for Page Pagination. However, we also have a View All which shows all the products but we currently don't have a canonical tag pointing to this as I don't believe showing the user a page with shed loads of products on it is actually a good user experience so we havent done anything with this page. I have a sample url from one of our categories which may help - http://goo.gl/9LPDOZ This is obviously causing me duplication issues as well . Also , the main category pages has historically been the pages which ranks better as opposed to Page 2, Page 3 etc etc. I am wondering what I should do about the View All Page and has anyone else had this same issue and how did they deal with it. Do we just get rid of the View All even though Google says it prefers you to have it ? I also want to concentrate my link juice on the main category pages as opposed being diluted between all my paginated pages ? - Does anyone have any tips on how to best do this and have you seen any ranking improvement from this ? Any ideas greatly appreciated. thanks Peter
Intermediate & Advanced SEO | | PeteC120 -
Ecommerce URL's
I'm a bit divided about the URL structure for ecommerce sites. I'm using Magento and I have Canonical URLs plugin installed. My question is about the URL structure and length. 1st Way: If I set up Product to have categories in the URL it will appear like this mysite.com/category/subcategory/product/ - and while the product can be in multiple places , the Canonical URL can be either short or long. The advantage of having this URL is that it shows all the categories in the breadcrumbs ( and a whole lot more links over the site ) . The disadvantage is the URL Length 2nd Way: Setting up the product to have no category in the URL URL will be mysite.com/product/ Advantage: short URL. disadvantage - doesn't show the categories in the breadcrumbs if you link direct. Thoughts?
Intermediate & Advanced SEO | | s_EOgi_Bear1 -
Massive URL blockage by robots.txt
Hello people, In May there has been a dramatic increase in blocked URLs by robots.txt, even though we don't have so many URLs or crawl errors. You can view the attachment to see how it went up. The thing is the company hasn't touched the text file since 2012. What might be causing the problem? Can this result any penalties? Can indexation be lowered because of this? ?di=1113766463681
Intermediate & Advanced SEO | | moneywise_test0 -
Is it possible to lose rank because my site's IP changed?
I manage a site on the 3dCart e-commerce platform. I recently updated the SSL certificate. Today, when I tried to log-in via FTP, I couldn't connect. The reason I couldn't connect was because my IP had changed. Last week the site experienced almost across the board rankings drops on lmost every important keyword. Not gigantic drops, a lot just lost 2-4 postiions, but that's a lot when you were #2 and you drop to #4 or # 6. Initially I thought it was because I was attempting to markup my product pages using structured data following guidelines from schema.org. I am not a coder so it was a real struggle, especially trying to navigate 3dCart's listing templates. I thought the rankings drops were Google slapping me for bad code, but now I wonder....could I really have dropped down because of that IP address change? Does anyone have a take on this? Thanks!
Intermediate & Advanced SEO | | danatanseo0 -
A Noob's SEO Plan of attack... can you critique it for me?
I've been digging my teeth into SEO for a solid 1.5 weeks or so now and I've learned a tremendous amount. However, I realize I have only scratched the surface still. One of the hardest things I've struggled with is the sheer amount of information and feeling overwhelmed. I finally think I've found a decent path. Please critique and offer input, it would be much appreciated. Step One: Site Architecture I run an online proofreading & editing service. That being said, there are lots of different segment we would eventually like to rank for other than the catch-all phrases like 'proofreading service'. For example, 'essay editing', 'resume editing', 'book editing', or even 'law school personal statement editing'. I feel that my first step is to understand how my site is built to handle this plan now, and into the future. Right now we simply have the homepage and one segment: kibin.com/essay-editing. Eventually, we will have a services page that serves almost like a site-map, showing all of our different services and linking to them. Step Two: Page Anatomy I know it is important to have a well defined anatomy to these services pages. For example, we've done a decent job with 'above the fold' content, but now understand the importance of putting the same type of care in below the fold. The plan here is to have a section for recent blog posts that pertain to that subject in a section titled "Essay Editing and Essay Writing Tips & Advice", or something to that effect. Also including some social sharing options, other resources, and an 'about us' section to assist with keyword optimization is in the plan. Step Three: Page Optimization Once we're done with Step Two, I feel that we'll finally be ready to truly optimize each of our pages. We've down some of this already, but probably less than 50%. You can see evidence of this on our essay editing page and proofreading rates page. So, the goal here is to find the most relevant keywords for each page and optimize for those to the point we have A grades on our on-page optimization reports. Step Four: Content/Passive Link Building The bones for our content strategy is in place. We have sharing links on blog posts already in place and a slight social media presence already. I admit, the blog needs some tightening up, and we can do a lot more on our social channels. However, I feel we need to start by creating content that our audience is interested in and interacting with them on a consistent basis. I do not feel like I should be chasing link building strategies or guest blog posts at this time. PLEASE correct me if I'm off base here, but only after reading step five: Step Five: Active Link Building My bias is to get some solid months of creating content and building a good social media presence where people are obviously interacting with our posts and sharing our content. My reasoning is that it will make it much easier for me to reach out to bloggers for guest posts as we'll be much more reputable after spending time doing step 4. Is this poor thinking? Should I try to get some guest blog posts in during step 4 instead? Step Six: Test, Measure, Refine I'll admit, I have yet to really dive into learning about the different ways to measure our SEO efforts. Besides being set up with our first campaign as an SEOPro Member and having 100 or so keywords and phrases we're tracking... I'm really not sure what else to do at this point. However, I feel we'll be able to measure the popularity of each blog post by number of comments, shares, new links, etc. once I reach step 6. Is there something vital I'm missing or have forgotten here? I'm sorry for the long winded post, but I'm trying to get my thoughts straight before we start cranking on this plan. Thank you so much!
Intermediate & Advanced SEO | | TBiz2 -
Meeting Google's needs 100% with dynamic pages
We have bought into a really powerful search, very exciting We can define really detailed product based 'landing pages' by creating a search that pulles on required attributeseghttp://www.OURDOMAIN.com//search/index.php?sortprice=asc&followSearch=9673&q=red+coats+short-length Pop that in a link Short Red Coats on a previous page and wonderful, that gives a page of short red coats in price ascending order, one happy consumer, straight to a page that meets their needs Question 1 however unhappy Google right? Question 2 can we meet Google's needs 100% with a redirect permanent in an .htaccess file E.G redirect permanent /short-red-coats/ http://www.OURDOMAIN.com//search/index.php?sortprice=asc&followSearch=9673&q=red+coats+short-length
Intermediate & Advanced SEO | | GeezerG
Many thanks
CB0 -
Best solution to get mass URl's out the SE's index
Hi, I've got an issue where our web developers have made a mistake on our website by messing up some URL's . Because our site works dynamically IE the URL's generated on a page are relevant to the current URL it ment the problem URL linked out to more problem URL's - effectively replicating an entire website directory under problem URL's - this has caused tens of thousands of URL's in SE's indexes which shouldn't be there. So say for example the problem URL's are like www.mysite.com/incorrect-directory/folder1/page1/ It seems I can correct this by doing the following: 1/. Use Robots.txt to disallow access to /incorrect-directory/* 2/. 301 the urls like this:
Intermediate & Advanced SEO | | James77
www.mysite.com/incorrect-directory/folder1/page1/
301 to:
www.mysite.com/correct-directory/folder1/page1/ 3/. 301 URL's to the root correct directory like this:
www.mysite.com/incorrect-directory/folder1/page1/
www.mysite.com/incorrect-directory/folder1/page2/
www.mysite.com/incorrect-directory/folder2/ 301 to:
www.mysite.com/correct-directory/ Which method do you think is the best solution? - I doubt there is any link juice benifit from 301'ing URL's as there shouldn't be any external links pointing to the wrong URL's.0