URLs with Hashtags - Does Google Index Them?
-
Hi there,
I have a potential issue with a site where all pages are dynamically populated using JavaScript. An example of a URL on their site would be www.example.com/#!/category/product.
I have read lots of conflicting information on the web - some say Google will ignore everything after the hashtag; others say that Google will now index everything after the hashtag.
Does anybody have any conclusive information about this? Any links to Google or Matt Cutts as confirmation would be brilliant.
P.S. I am aware of the potential issue of duplicate content, but I can assure you that has been dealt with. I am only concerned about whether Google will index full URLs that contain hashtags.
Thanks all!
Mark
-
Hi All,
It looks like Google has set up a nice dev site and FAQ page to go over the options here, especially when using AJAX and hash tags to link to hidden content: https://developers.google.com/webmasters/ajax-crawling/docs/faq#whereinresults
It looks as if Google will be able to index the content of the entire page (hidden and initially shown) and not create a separate URL if you use a ! after the # (that is, #!). I'd read up on that FAQ page, and play with site commands on the Google dev site.
-
Thankfully Webmaster World were able to provide some decent information, for those of you who have arrived here looking for a similar answer.
There is something called the "hash-bang" which makes JavaScript pages crawlable. Hashbang refers to hash (#) bang (!) - so an example would be example.com/#!/page-1.
Here's a great place to read more, understand and learn to implement:
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=174992
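For anyone who wants to see what the hash-bang does in practice: under Google's AJAX crawling scheme, the crawler rewrites a #! URL into a ?_escaped_fragment_= URL and requests that from the server instead. Here is a rough Python sketch of that mapping. The function name to_escaped_fragment is made up for illustration, and the percent-encoding is stricter than the scheme requires, so treat it as an approximation rather than Google's exact behaviour.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def to_escaped_fragment(url: str) -> str:
    """Approximate the URL a crawler requests for a #! (hash-bang) URL.

    Hypothetical helper for illustration only, not an official API.
    """
    parts = urlsplit(url)
    if not parts.fragment.startswith("!"):
        # A plain #anchor is not part of the scheme, so leave the URL alone.
        return url
    # Everything after "#!" becomes the value of _escaped_fragment_.
    # Assumption: encoding everything except "/" and "=" is close enough here.
    token = quote(parts.fragment[1:], safe="/=")
    extra = "_escaped_fragment_=" + token
    query = parts.query + "&" + extra if parts.query else extra
    # Rebuild the URL with the new query string and no fragment.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


print(to_escaped_fragment("http://example.com/#!/page-1"))
# http://example.com/?_escaped_fragment_=/page-1
```

The server then has to answer that ?_escaped_fragment_= request with an HTML snapshot of the content, which is what actually gets indexed.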
Cheers all!
-
Here's an example of a # URL which has not been indexed.
http://dulas.org.uk/hydro-info.cfm#specification_installation
Unlike the site I am working on, this site 'hides' content from the user until they click on a particular tab. All of the original code is in the source for http://dulas.org.uk/hydro-info.cfm but it is only shown to the user when they click on a tab and trigger the relevant piece of JavaScript.
The site I am working on is different - it loads content using JavaScript, but it essentially loads as a new page. The content is not present in the source until you click on something, at which point new content loads and the old content disappears.
Perhaps Google will be able to see that these # pages function much like a normal page, loading completely new content and getting rid of old content, and may therefore index them if I submit them in a sitemap. However, I'd like to hear from somebody who can tell me they have done this and had success!
Thanks,
Mark
-
Hi Lee,
Thanks for your response. My concern is that # URLs tend to send users to a particular location on a page, rather than a new page itself. Therefore, some things I have read suggest that Google has adapted to ignore anything after a # in order to avoid indexing an enormous amount of duplicate content. Strange that there is so much conflicting info out there!
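To illustrate the point about # URLs: the part after the # is handled by the browser and is not sent to the server in the HTTP request, so several # URLs can resolve to the same fetched page. Below is a small Python sketch using the standard library; the function name requested_url and the example URLs are just for illustration.

```python
from urllib.parse import urldefrag


def requested_url(url: str) -> str:
    """Return the part of the URL that is actually sent to the server."""
    base, _fragment = urldefrag(url)
    return base


urls = [
    "http://www.example.com/page#section-2",
    "http://www.example.com/page#section-3",
    "http://www.example.com/#!/category/product",
]
for url in urls:
    # The first two URLs map to the same requested page.
    print(url, "->", requested_url(url))
```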
Cheers,
Mark
-
Hi Mark, although I don't have any conclusive evidence, I would say that Google does index hashtag URLs.
Think of it this way: when you link within a page using an anchor (#), Google sees the '#' and 'non-#' URLs as unique URLs, so logically this does suggest that they index the full URL.
Hope that's helped, Lee.