Crawling image folders / crawl allowance
-
We recently removed /img and /imgp from our robots.txt file, allowing Googlebot to crawl our image folders. I'm not sure why we had these blocked in the first place, but we opened them up in response to an email from Google Product Search about not being able to crawl our images, which can hurt (and has hurt) our traffic from Google Shopping.
My question is: will allowing Google to crawl our image files eat up our 'crawl allowance'? We wouldn't want Google to not crawl/index certain pages, and ding our organic traffic, because more of our allotted crawl bandwidth is getting chewed up crawling image files.
Beyond the low-detail crawl stats graphs in Webmaster Tools, what's the best way to check how frequently and how deeply our site is being crawled?
Thanks all!
-
I recently blocked my images by accident as well and had 100% of my products disallowed from Google Shopping within 48 hours. It sounds like blocking is not an option: they need to crawl your images folder to make sure you have valid images in your product listings.
-
If your rankings are improving, then good move!
-
Hey Richard,
We were previously blocking Googlebot from crawling our images at all (by disallowing /img/ and /imgp/ in our robots.txt file). We removed this block after receiving this email from Google:
Thank you for participating in Google Product Search. It has come to our attention that a robots.txt file is preventing us from crawling some or all of the images on your site. In order for us to access and display the images you provide in your product listings, we'd like you to modify your robots.txt file to allow user-agent 'googlebot' to crawl your site.
Failure for Google to access your images may affect the visibility of your items on Google Product Search and Product Ad results.
While I totally agree that image traffic will not convert like standard traffic, it is free and, who knows, we may just pick up a few sales from it. Of course, if opening up the folders eats up a share of our crawl allowance that is out of proportion to the value we get back (image traffic plus avoiding any penalties from Google Product Search), we'd be better off leaving the block on.
By way of an update, it looks like our rankings have started to improve in Google Product Search. We first experienced a drop in rankings and traffic from Product Search on 4/16 and removed the block from robots.txt on 4/22.
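For anyone who wants to double-check a robots.txt change like this one, here is a minimal sketch using Python's standard `urllib.robotparser`. The two rule sets and the `www.example.com` URLs are assumptions made up for illustration (they are not our actual file), and the standard-library parser only approximates how Googlebot itself interprets the rules.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical "before" file: image folders blocked for every crawler.
OLD_RULES = """\
User-agent: *
Disallow: /img/
Disallow: /imgp/
"""

# Hypothetical "after" file: an empty Disallow means nothing is blocked.
NEW_RULES = """\
User-agent: *
Disallow:
"""

if __name__ == "__main__":
    image_url = "https://www.example.com/img/widget.jpg"  # made-up URL
    for label, rules in (("old", OLD_RULES), ("new", NEW_RULES)):
        print(label, can_crawl(rules, "Googlebot", image_url))
```

Running the same check against a normal page URL is a quick way to confirm that the change only affected the image folders.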
-
Why do you need Google to reach inside your img folder? Images display on the page and are indexed then. Sure, if you are selling images, then I can see the need for this, but to just crawl the img folder??
If the folder is not huge, I do not see it penalizing you. I would make sure all images are named using keywords, as crawling pic001.jpg, pic002.jpg, product01.jpg, logo.gif will not do you any good anyway.
Also, I find that the traffic coming from Google image searches is poor. No one searching to purchase a coffee cup looks in Google Images to do so. Conversely, if someone is searching for images of coffee cups to use in whatever, having them click over to your site is a waste of time. They are just going to grab the image and go, leaving your metrics a mess.
I hope that helps.
-
It may affect crawl allowance, but that depends on the size of your site, its PageRank, trust, etc.
One of the best ways to determine crawl depth and whether you have any issues is to create separate sitemaps for your most important content or areas of your site. You could also create an image sitemap.
Then you can monitor these over time, and they will give you a good picture of which content is being crawled and indexed well and which content/images are not. This may also help you find out whether the site structure is too deep or whether you need to link more to deeper content in order to improve crawling and indexation.
Hope this helps.
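To make the image sitemap idea concrete, here is a rough sketch in Python that builds one with the standard library. The function name `build_image_sitemap` and the example URLs are made up for illustration; the two namespace URIs are the ones used by the sitemaps.org protocol and Google's image sitemap extension.

```python
import xml.etree.ElementTree as ET
from typing import Mapping, Sequence

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"

# Register prefixes so the output uses a default namespace and "image:".
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("image", IMAGE_NS)


def build_image_sitemap(pages: Mapping[str, Sequence[str]]) -> str:
    """Build sitemap XML from a mapping of page URL -> image URLs on that page."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url, image_urls in pages.items():
        url_el = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url_el, f"{{{SITEMAP_NS}}}loc").text = page_url
        for image_url in image_urls:
            image_el = ET.SubElement(url_el, f"{{{IMAGE_NS}}}image")
            ET.SubElement(image_el, f"{{{IMAGE_NS}}}loc").text = image_url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical product page with two images.
    print(build_image_sitemap({
        "https://www.example.com/products/widget.html": [
            "https://www.example.com/img/widget-front.jpg",
            "https://www.example.com/img/widget-back.jpg",
        ],
    }))
```

Splitting the output into one file per site section (for example, one per product category) is what lets you compare submitted versus indexed counts section by section.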
-
Personally, I wouldn't try to figure out the impact by looking at crawl stats. I'd be more focused on end results. Have we had an increase in organic traffic, or conversions from Google shopping since we opened it up, or has either of these gone down?
That's what matters, and is the only real indicator as to whether it was a wise move or not.
-
You could check your server stats to see who is accessing your site; this should tell you which bots are visiting which pages, and when. I don't know what control panel you are using for your site, but if you are using cPanel, I am sure there are tutorials online to help you find this information.
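If you can download the raw access log, a short script can show how Googlebot's requests split between image folders and regular pages. This is only a sketch: it assumes the common Apache/Nginx "combined" log format, the function name `googlebot_hits_by_section` is made up, and it trusts the user-agent string, which can be spoofed (a reverse DNS lookup is the reliable way to verify Googlebot).

```python
import re
from collections import Counter
from typing import Iterable

# Matches the request, status, size, referrer and user-agent fields of a
# "combined" format log line. Lines in other formats are simply skipped.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_hits_by_section(lines: Iterable[str]) -> Counter:
    """Count Googlebot requests per top-level path segment (e.g. /img)."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match or "Googlebot" not in match.group("agent"):
            continue
        path = match.group("path").split("?", 1)[0]  # drop the query string
        first_segment = path.lstrip("/").split("/", 1)[0]
        hits["/" + first_segment] += 1
    return hits


if __name__ == "__main__":
    # "access.log" is a placeholder path; point it at your own log file.
    with open("access.log", encoding="utf-8", errors="replace") as log:
        for section, count in googlebot_hits_by_section(log).most_common(10):
            print(f"{count:8d}  {section}")
```

Comparing the counts for /img and /imgp against the rest of the site, before and after the robots.txt change, gives a direct view of how much crawl activity the image folders are taking.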