Can I, in Google's good graces, check for Googlebot to turn on/off tracking parameters in URLs?
-
Basically, we use a number of parameters in our URLs for event tracking. Google could be crawling an infinite number of these URLs. I'm already using the canonical tag to point at the non-tracking versions of those URLs....that doesn't stop the crawling tho.
I want to know if I can do conditional 301s or just detect the user agent as a way to know when to NOT append those parameters.
Just trying to follow their guidelines about allowing bots to crawl w/out things like sessionID...but they don't tell you HOW to do this.
Thanks!
-
No problem Ashley!
It sounds like that would fall under cloaking, albeit pretty benign as far as cloaking goes. There's some more info here. The Matt Cutts video on that page has a lot of good information. Apparently any cloaking is against Google's guidelines. I would suspect you could get away with it, but I'd be worried everyday about a Google penalty getting handed down.
-
The syntax is correct. Assuming the site: and inurl: operators work in Bing, as they do in Google, then Bing is not indexing URLs with the parameters.
That article you've referred to only tells how to sniff out Google...one of a couple. What it doesn't tell me, unfortunately, is if there are any consequences of doing so and taking some kind of action...like shutting off the event tracking parameters in this case.
Just to be clear...thanks a bunch for helping out!
-
My sense from what you told me is that canonicals should be working in your case. What you're trying to use them for is what they're intended to do. You're sure the syntax is correct, and they're in the of the page or being set in the HTTP header?
Google does set it up so you can sniff out Googlebot and return different content (see here), but that would be unusual to do given the circumstances. I doubt you'd get penalized for cloaking for redirecting parameterized URLs to canonical ones for only Googlebot, but I'd still be nervous about doing it.
Just curious, is Bing respecting the canonicals?
-
Yeah, we can't noindex anything because there literally is NO way to crawl the site without picking up tracking parameters.
So we're saying that there is literally no good/approved way to say "oh look, it's google. let's make sure we don't put any of these params on the URL."? Is that the consensus?
-
If these duplicate pages have URLs that are appearing in search results, then the canonicals aren't working or Google just hasn't tried to reindex those pages yet. If the pages are duplicates, and you've set the canonical correctly, and entered them in Google Webmaster Tools, over time those pages should drop out of the index as Google reindexes them. You could try submitting a few of these URLs with parameters to Google to reindex manually in Google Webmaster Tools, and see if afterward they disappear from the results pages. If they do, then it's just a matter of waiting for Googlebot to find them all.
If that doesn't work, you could try something tricky like adding meta noindex tags to the pages with URL parameters, wait until they fall out of the index, and then add canonical tags back on, and see if those pages come back into the SERPs. If they do, then Google is ignoring your canonical tags. I hate to temporarily noindex any pages like this... but if they're all appearing separately in the SERPs anyhow, then they're not pooling their link juice properly anyway.
-
Thank you for your response. Even if I tell them that the parameters don't alter content, which I have, that doesn't stop how many pages google has to crawl. That's my main concern...that googlebot is spending too much time on these alternate URLs.
Plus there are millions of these param-laden URLs in the index, regardless of the canonical tag. There is currently no way for google to crawl the site without parameters that change constantly throughout each visit. This can't be optimal.
-
You're doing the right thing by adding canonicals to those pages. You can also go into Google Webmaster Tools and let them know that those URL parameters don't change the content of the pages. This really is the bread and butter of canonical tags. This is the problem they're supposed to solve.
I wouldn't sniff out Googlebot just to 301 those URLs with parameters to the canonical versions. The canonicals should be sufficient. If you do want to sniff out Googlebot, Google's directions are here. You don't do it by user agent, you do a reverse DNS lookup. Again, I would not do this in your case.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
I'm noticing that URL that were once indexed by Google are suddenly getting dropped without any error messages in Webmasters Tools, has anyone seen issues like this before?
I'm noticing that URLs that were once indexed by Google are suddenly getting dropped without any error messages in Webmasters Tools, has anyone seen issues like this before? Here's an example:
Intermediate & Advanced SEO | | nystromandy
http://www.thefader.com/2017/01/11/the-carter-documentary-lil-wayne-black-lives-matter0 -
Should I disallow all URL query strings/parameters in Robots.txt?
Webmaster Tools correctly identifies the query strings/parameters used in my URLs, but still reports duplicate title tags and meta descriptions for the original URL and the versions with parameters. For example, Webmaster Tools would report duplicates for the following URLs, despite it correctly identifying the "cat_id" and "kw" parameters: /Mulligan-Practitioner-CD-ROM
Intermediate & Advanced SEO | | jmorehouse
/Mulligan-Practitioner-CD-ROM?cat_id=87
/Mulligan-Practitioner-CD-ROM?kw=CROM Additionally, theses pages have self-referential canonical tags, so I would think I'd be covered, but I recently read that another Mozzer saw a great improvement after disallowing all query/parameter URLs, despite Webmaster Tools not reporting any errors. As I see it, I have two options: Manually tell Google that these parameters have no effect on page content via the URL Parameters section in Webmaster Tools (in case Google is unable to automatically detect this, and I am being penalized as a result). Add "Disallow: *?" to hide all query/parameter URLs from Google. My concern here is that most backlinks include the parameters, and in some cases these parameter URLs outrank the original. Any thoughts?0 -
Why is /home used in this company's home URL?
Just working with a company that has chosen a home URL with /home latched on - very strange indeed - has anybody else comes across this kind of homepage URL "decision" in the past? I can't see why on earth anybody would do this! Perhaps simply a logic-defying decision?
Intermediate & Advanced SEO | | McTaggart0 -
Problem: Magento prioritises product URL's without categories?
HI there, we are moving a website from Shoptrader to Magento, which has 45.000 indexations.
Intermediate & Advanced SEO | | onlinetrend
yes shoptrader made a bit of a mess. Trying to clean it up now. there is a 301 redirect list of all old URL's pointing to the new one product can exist in multiple categories want to solve this with canonical url’s for instance: shoptrader.nl/categorieA/product has 301 redirect towards magento.nl/nl/categorieA/product shoptrader.nl/categorieA/product-5531 has 301 redirect towards magento.nl/nl/categorieA/product shoptrader.nl/categorieA/product¤cy=GBP has 301 redirect towards magento.nl/nl/categorieA/product shoptrader.nl/categorieB/product has 301 redirect towards magento.nl/nl/categorieB/product, has canonical tag towards magento.nl/nl/categorieA/product shoptrader.nl/categorieB/product?language=nl has 301 redirect towards magento.nl/nl/categorieB/product, has canonical tag towards magento.nl/nl/categorieA/product Her comes the problem:
New developer insists on using /productname as canonical instead of /category/category/productname, since Magento says so. The idea is now to redirect to /category/category/productname and there will be a canonical URL on these pages pointing to /productname, loosing some link juice twice. So in the end indexation will take place on /productname … if Google picks it up the 301 + canonical. Would be more adviseable to direct straight to /productname (http://moz.com/community/q/is-link-juice-passed-through-a-301-and-a-canonical-tag), but I prefer to point to one URL with categories attached. Which has more advantages(?): clear menustructure able to use subfolders in mobile searchresults missing breadcrumb What would you say?0 -
Should we use URL parameters or plain URL's=
Hi, Me and the development team are having a heated discussion about one of the more important thing in life, i.e. URL structures on our site. Let's say we are creating a AirBNB clone, and we want to be found when people search for apartments new york. As we have both have houses and apartments in all cities in the U.S it would make sense for our url to at least include these, so clone.com/Appartments/New-York but the user are also able to filter on price and size. This isn't really relevant for google, and we all agree on clone.com/Apartments/New-York should be canonical for all apartment/New York searches. But how should the url look like for people having a price for max 300$ and 100 sqft? clone.com/Apartments/New-York?price=30&size=100 or (We are using Node.js so no problem) clone.com/Apartments/New-York/Price/30/Size/100 The developers hate url parameters with a vengeance, and think the last version is the preferable one and most user readable, and says that as long we use canonical on everything to clone.com/Apartments/New-York it won't matter for god old google. I think the url parameters are the way to go for two reasons. One is that google might by themselves figure out that the price parameter doesn't matter (https://support.google.com/webmasters/answer/1235687?hl=en) and also it is possible in webmaster tools to actually tell google that you shouldn't worry about a parameter. We have agreed to disagree on this point, and let the wisdom of Moz decide what we ought to do. What do you all think?
Intermediate & Advanced SEO | | Peekabo0 -
Does Google index url with hashtags?
We are setting up some Jquery tabs in a page that will produce the same url with hashtags. For example: index.php#aboutus, index.php#ourguarantee, etc. We don't want that content to be crawled as we'd like to prevent duplicate content. Does Google normally crawl such urls or does it just ignore them? Thanks in advance.
Intermediate & Advanced SEO | | seoppc20120 -
Multiple URL's exist for the same page, canonicaliazation issue?
All of the following URL's take me to the same page on my site: 1. www.mysite.com/category1/subcategory.aspx 2. www.mysite.com/subcategory.aspx 3. www.mysite.com/category1/category1/category1/subcategory.aspx All of those pages are canonicalized to #1, so is that okay? I was told the following my a company trying to make our sitemap: "the site's platform dynamically creates URLs that resolve as 200 and should be 404. This is a huge spider trap for any search engine and will make them wary of crawling the site." What would I need to do to fix this? Thanks!
Intermediate & Advanced SEO | | pbhatt0 -
Why is my site's 'Rich Snippets' information not being displayed in SERPs?
We added hRecipe microformats data to our site in April and then migrated to the Schema.org Recipe format in July, but our content is still not being displayed as Rich Snippets in search engine results. Our pages validate okay in the Google Rich Snippets Testing Tool. Any idea why they are not being displayed in SERP's? Thanks.
Intermediate & Advanced SEO | | Techboy0