Googlebots and cache
-
Our site checks whether visitors are resident in the same country or live abroad.
If it recognises that the visitor comes from abroad, the content is made more appropriate for them. Basically, instead of encouraging the visitor to come and visit a showroom, it tells them that we export worldwide. It does this by checking the visitor's IP address.
So far so good! But I noticed that the cached pages in Google's results are all export pages. I've also used Google Webmaster Tools (Search Console) to render pages as Google, and those render as export pages too.
Does anybody have a solution to this?
Is it a problem?
Can Google see the proper (local, as in UK) version of the site?
-
Google won't see the local version (I assume your site is UK based). Googlebot is visiting with an IP from California and will see the "international" version of your site. Google does indicate that it has bots visiting sites from other IP addresses (locale-aware crawling), but to be honest, when I check the server logs of our sites (based in FR and ES) I only find visits from US IPs.
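If you want to run the same check on your own logs, here is a minimal sketch. It assumes the common "combined" access log format, where the user agent is the last quoted field; the function name and the sample lines are made up for illustration. Note that the user-agent string can be spoofed, so this only shows what claims to be Googlebot.

```python
import re
from collections import Counter

# Assumes the "combined" log format: client IP first, user agent as the
# last quoted field on the line. Adjust the pattern for other formats.
LOG_LINE = re.compile(r'^(?P<ip>\S+) .*"(?P<agent>[^"]*)"$')


def googlebot_ips(lines):
    """Count requests per client IP whose user agent mentions Googlebot."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and "Googlebot" in match.group("agent"):
            counts[match.group("ip")] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample lines, for illustration only.
    sample = [
        '66.249.66.1 - - [10/Oct/2016:13:55:36 +0000] "GET / HTTP/1.1" 200 2326 '
        '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
        '203.0.113.7 - - [10/Oct/2016:13:56:01 +0000] "GET / HTTP/1.1" 200 2326 '
        '"-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
    ]
    for ip, hits in googlebot_ips(sample).most_common():
        print(ip, hits)
```

You can then look up the country of each listed IP with whatever GeoIP tool you already use.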
If the international version differs only slightly from the local version, it shouldn't be a major problem. If the differences are major, it's probably better to find another solution. This could be creating a separate version of your site (which could be overkill), or offering international visitors a choice on their first visit (local version or international version). You store the choice in a cookie and personalise the content on the pages based on the cookie value. This way, Google would see the "local" version of the site, because a visitor without the cookie gets the local content by default.
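To make the cookie idea concrete, here is a rough sketch using only Python's standard library. The cookie name `site_version`, the two version labels and both function names are assumptions for illustration, not anything your platform provides. The key point is the default: a request without a valid choice cookie gets the local version, and as far as I know Googlebot does not send cookies, so it would fall into that case.

```python
from http.cookies import SimpleCookie

COOKIE_NAME = "site_version"  # hypothetical cookie name
VALID_VERSIONS = {"local", "international"}


def select_version(cookie_header):
    """Return which version to render for a request.

    Requests without a valid choice cookie (first-time visitors, and
    crawlers that send no cookies) fall back to the local version.
    """
    cookies = SimpleCookie()
    cookies.load(cookie_header or "")
    morsel = cookies.get(COOKIE_NAME)
    if morsel is not None and morsel.value in VALID_VERSIONS:
        return morsel.value
    return "local"


def choice_cookie(version):
    """Build the Set-Cookie header value that stores the visitor's choice."""
    if version not in VALID_VERSIONS:
        raise ValueError(f"unknown version: {version!r}")
    cookie = SimpleCookie()
    cookie[COOKIE_NAME] = version
    cookie[COOKIE_NAME]["path"] = "/"
    cookie[COOKIE_NAME]["max-age"] = 60 * 60 * 24 * 365  # keep for one year
    return cookie[COOKIE_NAME].OutputString()


if __name__ == "__main__":
    print(select_version(""))                            # no cookie sent
    print(select_version("site_version=international"))  # returning visitor
    print(choice_cookie("international"))
```

The IP check can stay, but only to decide whether to show the "local or international?" prompt, not to decide which content is served.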
Hope this helps,
Dirk