Googlebot soon to be executing javascript - Should I change my robots.txt?
-
This question came to mind as I was pursuing an unrelated issue and reviewing a site's robots.txt file.
Currently this is a line item in the file:
Disallow: https://*
According to a recent post in the Google Webmasters Central Blog ([http://googlewebmastercentral.blogspot.com/2014/05/understanding-web-pages-better.html](http://googlewebmastercentral.blogspot.com/2014/05/understanding-web-pages-better.html "Understanding Web Pages Better")), Googlebot is getting much closer to being able to properly render javascript. Pardon some ignorance on my part because I am not a developer, but wouldn't this require Googlebot be able to execute javascript?
If so, I am concerned that disallowing Googlebot from the https:// versions of our pages could interfere with crawling and indexation because as soon as an end-user clicks the "checkout" button on our view cart page, everything on the site flips to https://. If this were disallowed, then would Googlebot stop crawling at that point and simply leave because all pages were now https://?
Or am I just waaayyyy over thinking it?...wouldn't be the first time! Thanks all!
-
Excellent answer. Thanks so much Doug. I really appreciate it! Adding a "nofollow" attribute to the Checkout button is a good suggestion and should be fairly easy to implement. I realize that internal nofollows are not normally recommended, but in this instance, may not be a bad idea.
-
Hi Dana,
When you click on the checkout button, what's the mechanism for taking people to the https:// site? Is it just that the checkout link uses https:// in its link? Is there some javascript wizardry you're particularly concerned about?
Even though Googlebot follows this one link to the https version of the cart, it will still have all the other (non-https) links on the previous page queued up to follow, so I don't think this will stop the crawl at that point. It would be a nightmare if Googlebot stopped crawling the entire site every time it went down a rabbit hole!
That's not to say that you wouldn't want to consider no-following your checkout button. I'm sure neither you nor Google wants the innards of the cart pages to be indexed? There are probably other pages you'd rather Googlebot spent its time finding, right?
My take on the Google blog about understanding Javascript is that the aim is to try and do a better job discovering content that might be hidden by Javascript/Ajax. It's a problem for Google when the raw HTML that they're crawling doesn't accurately reflect the content that is displayed in front of a real visitor.
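On the `Disallow: https://*` line itself: as far as I understand the robots.txt format, Disallow values are matched against the path part of a URL (they are expected to start with `/`), and each robots.txt file only applies to the protocol and host it is served from. So that line may not be doing what it was intended to do. Here is a minimal sketch using Python's standard-library parser to illustrate the matching. It is not Googlebot's own parser, and `is_allowed`, the `/checkout` path, and the `www.example.com` URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text.

    Hypothetical helper: it wraps Python's stdlib parser, which only
    approximates how a real crawler such as Googlebot reads the file.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The rule from the question: a scheme where a path is expected.
SCHEME_RULE = "User-agent: *\nDisallow: https://*\n"

# A path-based rule (assumed path) that targets the checkout pages directly.
PATH_RULE = "User-agent: *\nDisallow: /checkout\n"

# Hypothetical URLs, for illustration only.
CHECKOUT_URL = "https://www.example.com/checkout/step-1"
PRODUCT_URL = "https://www.example.com/products/widget"

if __name__ == "__main__":
    for label, rules in (("scheme rule", SCHEME_RULE), ("path rule", PATH_RULE)):
        for url in (CHECKOUT_URL, PRODUCT_URL):
            verdict = "allowed" if is_allowed(rules, url) else "blocked"
            print(f"{label}: {url} -> {verdict}")
```

With this parser, the scheme-style rule never matches because it compares the rule against paths like `/checkout/step-1`, while the path rule blocks only the checkout URLs. If the goal is to keep the https cart pages out of the crawl, a path rule in the robots.txt served from the https host is probably closer to what you want, but do test it in Google's own robots.txt testing tool rather than relying on this sketch.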