Dilemma about "images" folder in robots.txt
-
Hi, hope you're doing well.

I'm sure you are aware that Google has updated its webmaster technical guidelines to say that sites should allow access to their CSS and JavaScript files where possible. Google used to render web pages as text only; now it says it can read CSS and JavaScript. By its own account, blocking those files can result in suboptimal rankings:

"Disallowing crawling of Javascript or CSS files in your site’s robots.txt directly harms how well our algorithms render and index your content and can result in suboptimal rankings."
http://googlewebmastercentral.blogspot.com/2014/10/updating-our-technical-webmaster.html

We have allowed access to our CSS files, and Googlebot now sees our web pages more like a normal user would (tested in GWT).

Anyhow, this is my dilemma, and I'm sure a lot of other users face the same situation. Like any other e-commerce site, we have a lot of images. Our CSS files used to be inside our images folder, so I have allowed access to that. Here's the robots.txt: http://www.modbargains.com/robots.txt

Right now we are blocking the images folder because it is huge and some of the images are very high resolution. We worry that Googlebot might spend almost all of its time crawling the "images" folder and not have enough time left for our other important pages, not to mention the heavy load on both Google's servers and ours. We do have good, high-quality, original pictures, and we feel we are losing potential rankings by blocking them.

I was thinking of allowing ONLY the Google image bot to access the folder, but I still feel that Google might spend a lot of time on it.

**I was wondering whether Google decides something like "let me spend 10 minutes on the Google image bot and 20 minutes on the Google mobile bot", or whether it has a separate time allocation for each of its bot types. I want to unblock the images folder, for now only for the Google image bot, but I fear it might drastically hamper indexing of our important pages because, as I mentioned, we have tons and tons of images and Google would already be spending plenty of time just crawling that folder.**

Any advice, recommendations, suggestions, technical guidance, or plan of action? I'm pretty sure I have answered my own question, but I need confirmation from an expert that allowing only the Google image bot to access my images folder is the right approach.

Sincerely,
Shaleen Shah
-
Yup, my images send me traffic from Google Images on most of my sites, and attractive images attract hotlinks as well. At the moment, people are hosting their images on a different domain (a CDN) and are still being credited for them, but I haven't tried that myself, i.e. I don't know whether they've set some "ownership" somewhere, somehow.
-
I recommend allowing Google to crawl those images. Google optimizes its crawl rate, and once it has done a complete crawl it will understand how often to crawl certain areas of your site. My main concern would be that you are losing potential rankings and indexing from those images - if they are unique and high quality, you definitely want Google to crawl the images, understand the file names, and index them appropriately.
I wouldn't be concerned about Google bot eating up your server resources. If it does become a problem, then you can go back and adjust the bot access through the robots.txt, as you've done already. However, I would let them in first and only react if it becomes a problem.
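If you do need to adjust access later, the image-bot-only rule from the question could look roughly like the sketch below. It uses Python's standard `urllib.robotparser` to sanity-check a draft robots.txt before deploying it. The `/images/` path and the `example.com` URLs are placeholders, not the real modbargains.com setup, and Python's parser only approximates how Google itself matches rules, so treat the result as a rough check rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical draft rules: the image crawler may fetch /images/,
# every other crawler is kept out of that folder.
ROBOTS_TXT = """\
User-agent: Googlebot-Image
Allow: /images/

User-agent: *
Disallow: /images/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Placeholder URLs used only for this check.
image_url = "http://www.example.com/images/product.jpg"
page_url = "http://www.example.com/products/page.html"

for agent in ("Googlebot-Image", "Googlebot"):
    for url in (image_url, page_url):
        print(agent, url, parser.can_fetch(agent, url))
```

With these draft rules the image crawler is allowed into `/images/` while the general crawler is not, and ordinary pages stay open to both.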
I have tens of thousands of product images accessed by Googlebot, and it has not been a concern for my e-commerce company or its server resources. I'm not saying that it can't be a potential problem, but the benefit outweighs the risk of it being one - I choose a reactive stance in this situation.
Closely monitor your Google Webmaster Tools account, watch the crawl rate and statistics, and if it becomes an issue then decide on which image folders should or shouldn't be indexed.
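Alongside the Webmaster Tools reports, a rough way to see how much of Googlebot's activity lands on the images folder is to count its requests in your own server access log. This is only a sketch under assumptions: it assumes a combined-format access log, the sample lines are made up for illustration, `count_googlebot_hits` is a hypothetical helper, and matching on the user-agent string does not verify that a request really came from Google.

```python
import re
from collections import Counter

# Pulls the request path and the user-agent out of a combined-format log line.
LOG_PATTERN = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def count_googlebot_hits(lines, prefix="/images/"):
    """Count Googlebot requests inside and outside a path prefix."""
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.search(line)
        if not match or "Googlebot" not in match.group("agent"):
            continue  # not a parsable line, or not a Googlebot user-agent
        in_prefix = match.group("path").startswith(prefix)
        counts["images" if in_prefix else "other"] += 1
    return counts


# Made-up sample lines for illustration only.
SAMPLE_LOG = [
    '192.0.2.1 - - [10/Nov/2014:10:00:00 +0000] "GET /images/a.jpg HTTP/1.1" '
    '200 5120 "-" "Googlebot-Image/1.0"',
    '192.0.2.1 - - [10/Nov/2014:10:00:05 +0000] "GET /products/1 HTTP/1.1" '
    '200 2048 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; '
    '+http://www.google.com/bot.html)"',
    '203.0.113.9 - - [10/Nov/2014:10:00:09 +0000] "GET /images/b.jpg HTTP/1.1" '
    '200 4096 "-" "Mozilla/5.0"',
]

counts = count_googlebot_hits(SAMPLE_LOG)
print(counts["images"], counts["other"])
```

If the "images" count starts to dwarf the "other" count after you unblock the folder, that is a signal to revisit the robots.txt rules.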