Googlebot soon to be executing JavaScript - Should I change my robots.txt?
-
This question came to mind as I was pursuing an unrelated issue and reviewing a site's robots.txt file.
Currently this is a line item in the file:
Disallow: https://*
According to a recent post on the Google Webmaster Central Blog, [http://googlewebmastercentral.blogspot.com/2014/05/understanding-web-pages-better.html](http://googlewebmastercentral.blogspot.com/2014/05/understanding-web-pages-better.html "Understanding Web Pages Better"), Googlebot is getting much closer to being able to properly render JavaScript. Pardon some ignorance on my part, because I am not a developer, but wouldn't this require that Googlebot be able to execute JavaScript?
If so, I am concerned that disallowing Googlebot from the https:// versions of our pages could interfere with crawling and indexation, because as soon as an end user clicks the "checkout" button on our view cart page, everything on the site flips to https://. If this were disallowed, would Googlebot stop crawling at that point and simply leave because all pages were now https://?
Or am I just way overthinking it? Wouldn't be the first time! Thanks all!
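To make the robots.txt concern concrete, here is a minimal Python sketch. It is not a description of how Googlebot itself works. It relies on two points: a robots.txt file applies only to the scheme and host it is served from, and rule values are matched against URL paths. The `example.com` host, the `/checkout` path, and the helpers `robots_url_for` and `is_allowed` are illustrative assumptions, and Python's `urllib.robotparser` is only a stand-in for a real crawler's parser.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


def robots_url_for(page_url: str) -> str:
    """Return the robots.txt URL that governs page_url (same scheme and host)."""
    parts = urlsplit(page_url)
    return f"{parts.scheme}://{parts.netloc}/robots.txt"


def is_allowed(robots_lines: list[str], page_url: str, agent: str = "Googlebot") -> bool:
    """Check page_url against in-memory robots.txt lines using the stdlib parser."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(agent, page_url)


# The rule from the question: its value is a URL pattern, not a path.
scheme_rule = ["User-agent: *", "Disallow: https://*"]
# A conventional path-based rule (hypothetical path, for comparison).
path_rule = ["User-agent: *", "Disallow: /checkout"]

http_page = "http://www.example.com/checkout"
https_page = "https://www.example.com/checkout"

# Each scheme has its own robots.txt location.
print(robots_url_for(http_page))
print(robots_url_for(https_page))

# How the stdlib parser evaluates each rule for the https page.
print(is_allowed(scheme_rule, https_page))
print(is_allowed(path_rule, https_page))
```

In this sketch the stdlib parser compares rules against the path portion of the URL only, so the `https://*` rule never matches `/checkout`, while the path-based rule does. Google's documentation likewise describes rule values as paths beginning with `/`, so the usual way to keep crawlers off the https pages would be a separate robots.txt served at the https address, but it is worth testing the live file with Google's own tools rather than relying on this stand-in.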
-
Excellent answer. Thanks so much, Doug. I really appreciate it! Adding a "nofollow" attribute to the Checkout button is a good suggestion and should be fairly easy to implement. I realize that internal nofollows are not normally recommended, but in this instance it may not be a bad idea.
-
Hi Dana,
When you click on the checkout button, what's the mechanism for taking people to the https:// site? Is it just that the checkout link uses https:// in its URL? Is there some JavaScript wizardry you're particularly concerned about?
Even if Googlebot follows this one link to the https version of the cart, it will still have all the other (non-https) links on the previous page queued up to follow, so I don't think this will stop the crawl at that point. It would be a nightmare if Googlebot stopped crawling the entire site every time it went down a rabbit hole!
That's not to say that you wouldn't want to consider nofollowing your checkout button. I'm sure neither you nor Google wants the innards of the cart pages to be indexed? There are probably other pages you'd rather Googlebot spent its time finding, right?
My take on the Google blog post about understanding JavaScript is that the aim is to do a better job of discovering content that might be hidden by JavaScript/Ajax. It's a problem for Google when the raw HTML that they're crawling doesn't accurately reflect the content that is displayed to a real visitor.
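The queueing idea above can be sketched as a toy breadth-first crawler. This is only an illustration of the general technique, not Googlebot's actual implementation; the `LINKS` graph, the `example.com` URLs, and the `is_blocked` policy (treat every https:// URL as off limits) are all assumptions made up for the example.

```python
from collections import deque

# Hypothetical link graph: page URL -> links found on that page.
LINKS = {
    "http://www.example.com/": [
        "http://www.example.com/products",
        "http://www.example.com/cart",
    ],
    "http://www.example.com/products": ["http://www.example.com/widget"],
    "http://www.example.com/cart": ["https://www.example.com/checkout"],
    "http://www.example.com/widget": [],
    "https://www.example.com/checkout": [],
}


def is_blocked(url: str) -> bool:
    """Assumed policy for this sketch: every https:// URL is off limits."""
    return url.startswith("https://")


def crawl(start: str) -> list[str]:
    """Breadth-first crawl that skips blocked URLs but keeps draining the queue."""
    queue = deque([start])
    seen = {start}
    fetched = []
    while queue:
        url = queue.popleft()
        if is_blocked(url):
            continue  # skip this URL only; everything else stays queued
        fetched.append(url)
        for link in LINKS.get(url, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return fetched


print(crawl("http://www.example.com/"))
```

The blocked checkout URL is simply dropped when it comes off the queue; the products and widget pages are still fetched because they were queued independently of it.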
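To illustrate that gap between raw HTML and what a visitor sees, here is a small Python sketch. The page in `RAW_HTML`, the product text, and the `VisibleText` helper are hypothetical; the point is only that a crawler reading the raw markup without running the script never sees the content the script would insert.

```python
from html.parser import HTMLParser

# Hypothetical page: the product text only exists after the script runs.
RAW_HTML = """
<html><body>
  <h1>Our Store</h1>
  <div id="products"></div>
  <script>
    document.getElementById("products").textContent = "Blue Widget - $10";
  </script>
</body></html>
"""


class VisibleText(HTMLParser):
    """Collect text nodes, ignoring anything inside <script> or <style>."""

    def __init__(self) -> None:
        super().__init__()
        self.chunks: list[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def visible_text(html: str) -> list[str]:
    """Return the text a non-rendering crawler would find in the raw markup."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return parser.chunks


print(visible_text(RAW_HTML))
```

Only the heading text is found in the raw markup; the product line appears nowhere as page content until a browser, or a crawler that executes JavaScript, runs the script.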