Crawling image folders / crawl allowance
-
We recently removed /img and /imgp from our robots.txt file, allowing Googlebot to crawl our image folders. I'm not sure why we had these blocked in the first place, but we opened them up in response to an email from Google Product Search about not being able to crawl our images, which can hurt (and has hurt) our traffic from Google Shopping.
My question is: will allowing Google to crawl our image files eat up our 'crawl allowance'? We wouldn't want Google to skip crawling or indexing certain pages, and ding our organic traffic, because more of our allotted crawl bandwidth is getting chewed up by image files.
Outside of the non-detailed crawl stats graphs in Webmaster Tools, what's the best way to check how frequently and how deeply our site is getting crawled?
Thanks all!
-
I accidentally blocked my images folder recently as well and had 100% of my products disallowed from Google Shopping within 48 hours. It sounds like blocking is not an option. They need to crawl your images folder to make sure you have valid images in your product listings.
-
If your rankings are improving, then good move!
-
Hey Richard,
We were previously blocking Googlebot from crawling our images at all (by disallowing /img/ and /imgp/ in our robots.txt file). We removed this block after receiving this email from Google:
Thank you for participating in Google Product Search. It has come to our attention that a robots.txt file is preventing us from crawling some or all of the images on your site. In order for us to access and display the images you provide in your product listings, we'd like you to modify your robots.txt file to allow user-agent 'googlebot' to crawl your site.
Failure for Google to access your images may affect the visibility of your items on Google Product Search and Product Ad results.
While I totally agree that image traffic will not convert like standard traffic, it is free and who knows, we may just pick up a few sales from it. Of course if this comes at the cost of eating up a disproportionate amount of our crawl allowance relative to the value (or avoiding any penalties from Google Product Search) we'd be better off leaving the block on.
By way of an update, it looks like our rankings have started to improve in Google Product Search. We first experienced a drop in rankings and traffic from Product Search on 4/16 and removed the block from robots.txt on 4/22.
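For anyone wanting to double-check a change like this, here is a minimal sketch using Python's standard urllib.robotparser. The two rule sets are assumptions reconstructed from the description above, not our actual robots.txt, and the file paths are made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


# Hypothetical "before" rules: both image folders blocked for every crawler.
OLD_RULES = """User-agent: *
Disallow: /img/
Disallow: /imgp/
"""

# Hypothetical "after" rules: an empty Disallow blocks nothing.
NEW_RULES = """User-agent: *
Disallow:
"""

if __name__ == "__main__":
    for label, rules in (("old", OLD_RULES), ("new", NEW_RULES)):
        allowed = can_crawl(rules, "Googlebot", "/img/cup.jpg")
        print(f"{label} rules allow /img/cup.jpg: {allowed}")
```

Running the same check against the live file after each edit is a cheap way to confirm the block is really gone before waiting on Product Search to recrawl.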
-
Why do you need Google to reach inside your img folder? Images display on the page and are indexed then. Sure, if you are selling images, then I can see the need for this, but to just crawl the img folder??
If it is not huge, I do not see it penalizing you. I would make sure all images are named using keywords, as crawling pic001.jpg, pic002.jpg, product01.jpg, logo.gif will not do you any good anyway.
Also, I find that the traffic coming in from Google image searches is poor quality. No one searches to purchase a coffee cup and looks in Google Images to do so. Conversely, if someone is searching for images of coffee cups to use in whatever, having them click over to your site is a waste of time. They are just going to grab the image and go, leaving your metrics a mess.
I hope that helps.
-
It may affect your crawl allowance, but that depends on the size of your site, its PageRank, trust, etc.
One of the best ways to determine crawl depth and whether you have any issues is to create separate sitemaps for your most important content or areas of your site. You could also create an image sitemap.
Then you can monitor these over time, and they will give you a good picture of which content is being crawled and indexed well and which content/images are not. This may also help you to find out if the site structure is too deep or whether you need to link more to deeper content in order to improve crawling and indexation.
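As a rough illustration of the image sitemap idea, the sketch below builds one with Python's standard library. The example.com URLs and the build_image_sitemap helper are hypothetical; the two namespaces are the standard sitemap namespace and Google's image sitemap extension.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"


def build_image_sitemap(pages: dict[str, list[str]]) -> str:
    """Build sitemap XML mapping each page URL to the images shown on it."""
    # Register prefixes so the output uses xmlns and xmlns:image.
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("image", IMAGE_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url, image_urls in pages.items():
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
        for image_url in image_urls:
            image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
            ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = image_url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical product page with two images.
    print(build_image_sitemap({
        "https://example.com/products/coffee-cup": [
            "https://example.com/img/coffee-cup-front.jpg",
            "https://example.com/img/coffee-cup-side.jpg",
        ],
    }))
```

Keeping one sitemap per site section (products, categories, images) makes it easier to compare submitted versus indexed counts for each area separately.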
Hope this helps.
-
Personally, I wouldn't try to figure out the impact by looking at crawl stats. I'd be more focused on end results. Have we had an increase in organic traffic, or conversions from Google shopping since we opened it up, or has either of these gone down?
That's what matters, and is the only real indicator as to whether it was a wise move or not.
-
You could check your server stats to see who is accessing your site. This should tell you which bots are visiting which pages, and when. I don't know what control panel you are using for your site, but if you are using cPanel, I am sure there are tutorials online to help you find this information.
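If you can download the raw access logs, a short script can do the counting for you. This is only a sketch: it assumes the common Apache/Nginx "combined" log format, the sample lines are invented, and googlebot_hits is a hypothetical helper. User-agent strings can be spoofed, so treat the numbers as approximate.

```python
import re
from collections import Counter

# Assumes the "combined" log format; adjust the pattern if your server differs.
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] '
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_hits(lines):
    """Count Googlebot requests per (day, top-level folder)."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match or "Googlebot" not in match.group("agent"):
            continue
        # Reduce /img/cup.jpg?x=1 to its top-level folder, /img.
        path = match.group("path").split("?", 1)[0]
        section = "/" + path.strip("/").split("/", 1)[0]
        hits[(match.group("day"), section)] += 1
    return hits


# Invented sample lines, for illustration only.
SAMPLE = [
    '66.249.66.1 - - [22/Apr/2012:10:00:00 +0000] "GET /img/cup.jpg HTTP/1.1" '
    '200 5120 "-" "Googlebot-Image/1.0"',
    '66.249.66.1 - - [22/Apr/2012:10:00:05 +0000] "GET /products/cup.html HTTP/1.1" '
    '200 9000 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '10.0.0.7 - - [22/Apr/2012:10:00:09 +0000] "GET /img/cup.jpg HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (Windows NT 6.1)"',
]

if __name__ == "__main__":
    for (day, section), count in sorted(googlebot_hits(SAMPLE).items()):
        print(day, section, count)
```

Comparing the daily counts for /img and /imgp against the counts for your product and category folders shows whether image crawling is actually displacing page crawling.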