What's going on with google index - javascript and google bot
-
Hi all,
Weird issue with one of my websites.
The website URL: http://www.athletictrainers.myindustrytracker.com/
Let's take 2 different article pages from this website:
1st: http://www.athletictrainers.myindustrytracker.com/en/article/71232/
As you can see the page is indexed correctly on google:
http://webcache.googleusercontent.com/search?q=cache:dfbzhHkl5K4J:www.athletictrainers.myindustrytracker.com/en/article/71232/10-minute-core-and-cardio&hl=en&strip=1 (that's the "text only" version, indexed on May 19th)
2nd: http://www.athletictrainers.myindustrytracker.com/en/article/69811
As you can see the page isn't indexed correctly on google:
http://webcache.googleusercontent.com/search?q=cache:KeU6-oViFkgJ:www.athletictrainers.myindustrytracker.com/en/article/69811&hl=en&strip=1 (that's the "text only" version, indexed on May 21st)
Both pages use the same code. As for the dates, some pages that were indexed before the 19th are also problematic. So it's not that Google can't read the content; it seems to read it only when it wants to.
Can you think of what the problem might be? I know that Google can read JS and crawl our pages correctly, but that happens with only some of the pages and not all of them (as you can see above).
-
Hello Or,
I just checked the most recent cache and it looks like Google does NOT see the content on the first URL (ending in /71232/) but does see it on the second one (ending in 69811).
This is the opposite of the situation you described above.
Yes, Google "can" execute Javascript, but just because they can doesn't mean they will every time. Also, perhaps not all of their bots can or do execute Javascript every time. For instance, the bot they use for pure discovery may not, while the one they use to render previews may.
Or they may have given the Javascript only a limited amount of time to execute, so anything that loads too slowly is missed.
I also notice that the page that is currently not fully indexed has an embedded YouTube video. While this would not typically cause any problems with getting other content indexed, in your case it may be worth looking into. For example, it could contribute to the load time issue mentioned above.
When it comes to executing scripts, submitting forms, etc... Google is very much at the stage of just randomly "trying stuff out" to "see what happens". It's like a hyperactive baby in a spaceship just pushing buttons like crazy, which is why we run into issues with "spider traps" and with unintentionally getting dynamic pages indexed from form submissions, internal searches and other oddities in site architecture. It is also one of the reasons why markup like Schema.org and JSON-LD are important: They allow us to label the buttons so the bot "understands" what it is pressing (or not).
I apologize that there is no definitive answer for your problem at the moment, but given that the behavior has switched completely I'm not sure how to go about investigating. This is why it is still very much a best practice to ensure all of your content is indexable by not rendering it with Javascript. If you can't see the textual content in the source code (as is the case here) then you are at risk of it not being seen by Google.
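To make the "can you see the text in the source code" check repeatable, here is a rough Python sketch. It is only an illustration under my own assumptions: the class and function names are made up, and it works on an HTML string you have already downloaded without executing any Javascript (for example with `urllib.request`). It strips out `<script>` and `<style>` contents and then looks for a phrase from the article in what is left.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collects text nodes, ignoring anything inside <script> or <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only text that sits outside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def text_in_raw_html(raw_html, phrase):
    """Return True if `phrase` appears in the non-script text of `raw_html`."""
    parser = VisibleTextExtractor()
    parser.feed(raw_html)
    parser.close()
    # Collapse runs of whitespace so line breaks in the markup don't matter.
    visible = " ".join(" ".join(parser.chunks).split())
    return phrase.lower() in visible.lower()
```

If this returns False for a sentence you can read in the browser, the sentence is being injected by Javascript, and whether Google sees it depends on whether the page gets rendered on that crawl.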
-
Hi Patrick,
We already tested all the pages with the Fetch as Google tool. Sorry that I didn't mention it before, but everything there is OK. I do see the "Partial" status, but the issues are with one of the social plugins and have no connection to the content.
So, all the tools show that it should be OK, but Google isn't indexing the pages correctly.
I already checked:
1. Frontend code.
2. No-index issues
3. Canonical issues
4. Robots.txt issues
5. Fetch as Google issues
I know that Google can read JS, and I don't understand why it can read only some of the pages and not all of them (there isn't any difference between them).
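For anyone who wants to repeat checks 2 to 4 from the list above on many pages at once, they could be scripted roughly like this in Python. This is a sketch under assumptions, not the tooling used in this thread: `IndexingSignals` and `check_page` are hypothetical names, and the HTML and robots.txt are passed in as strings you have already fetched. The robots.txt part uses the standard library's `urllib.robotparser`, which does not necessarily match Googlebot's own parser in every edge case.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class IndexingSignals(HTMLParser):
    """Records the meta robots directive and the canonical URL, if present."""

    def __init__(self):
        super().__init__()
        self.meta_robots = None
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            self.meta_robots = (attr.get("content") or "").lower()
        elif tag == "link" and "canonical" in (attr.get("rel") or "").lower().split():
            self.canonical = attr.get("href")


def check_page(raw_html, robots_txt, url, user_agent="Googlebot"):
    """Summarise noindex, canonical and robots.txt status for one URL."""
    signals = IndexingSignals()
    signals.feed(raw_html)
    signals.close()

    robots = RobotFileParser()
    robots.parse(robots_txt.splitlines())

    return {
        "noindex": "noindex" in (signals.meta_robots or ""),
        "canonical": signals.canonical,
        "allowed_by_robots": robots.can_fetch(user_agent, url),
    }
```

A page that comes back with `noindex` False, a canonical pointing at itself and `allowed_by_robots` True has passed these three checks, which leaves Javascript rendering as the main remaining suspect.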
-
Hi there
I would take a look at the Fetch as Google tool in your Search Console and see what issues arise there. I would do this for both desktop and mobile, so that you can see how these pages are being rendered by Google.
If you get a "Partial" status, Google will return the issues that they have run into, and you can prioritize your issues and decide how you want to handle them from there.
You can read more about Javascript and Google here as well as here.
Hope this all helps! Good luck!