Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
Overly-Dynamic URL
-
Hi,
We have over 5,000 pages flagged with the Overly-Dynamic URL error.
Our ecommerce site uses Ajax and has several different filters, such as Size, Color, and Brand, so we therefore have many different URLs like:
http://www.dellamoda.com/Designer-Pumps.html?sort=price&sort_direction=1&use_selected_filter=Y
http://www.dellamoda.com/Designer-Accessories.html?sort=title&use_selected_filter=Y&view=all
http://www.dellamoda.com/designer-handbags.html?use_selected_filter=Y&option=manufacturer%3A&page3
Could we use the robots.txt file to stop these from showing as duplicate content? And do we need to put the whole URL in there?
Like:
Disallow: /*?sort=price&sort_direction=1&use_selected_filter=Y
If not, how far into the URL should be disallowed?
So far we have added the following to our robots.txt:
Disallow: /?sort=title
Disallow: /?use_selected_filter=Y
Disallow: /?sort=price
Disallow: /?clearall=Y
Just not sure if they are correct.
Any help would be greatly appreciated.
Thank you, Kami
-
Hi Kami,
It's unfortunate, but a number of modern e-commerce platforms still suffer from poor canonicalisation and multiple URLs for the same content.
If possible, rather than blocking off access to those queries via robots.txt or meta, I'd start by trying to specify a canonical URL when a query is created.
E.g.
Query: http://www.dellamoda.com/Designer-Accessories.html?sort=title&use_selected_filter=Y&view=all
Canonical: http://www.dellamoda.com/Designer-Accessories.html
Failing that, I'd try to implement a "noindex, follow" robots meta tag, or send the same directive via the X-Robots-Tag HTTP header if you're any good with Apache.
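To make the canonical mapping and the noindex fallback above a bit more concrete, here's a minimal Python sketch. The helper names (`canonical_url`, `head_tag`) are made up for illustration, and it assumes every query parameter on these pages is a sort or filter option that can safely be dropped. If any parameter changes the actual content, it would need to be kept.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Return the URL without its query string or fragment.

    Assumption: all query parameters are sort/filter options.
    """
    parts = urlsplit(url)
    return urlunsplit(parts._replace(query="", fragment=""))


def head_tag(url: str, use_noindex: bool = False) -> str:
    """Build the <head> tag for a page (hypothetical helper).

    By default emit a canonical link. If use_noindex is set and the
    URL carries filter parameters, emit a noindex,follow meta tag instead.
    """
    canonical = canonical_url(url)
    if use_noindex and canonical != url:
        return '<meta name="robots" content="noindex, follow">'
    return f'<link rel="canonical" href="{escape(canonical, quote=True)}">'


filtered = (
    "http://www.dellamoda.com/Designer-Accessories.html"
    "?sort=title&use_selected_filter=Y&view=all"
)
print(head_tag(filtered))
print(head_tag(filtered, use_noindex=True))
```

In practice this logic would live in whatever template your platform uses to render the page head, so treat it as a sketch of the rule rather than something to drop in as-is.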
If that's still a no-go, then try Google Webmaster Tools (GWT). Google is getting much better at handling dynamic URLs within e-commerce platforms, and you can specify which query parameters Google should ignore directly within GWT.
There's a great post on Moz that deals with e-commerce and canonicalisation - http://www.seomoz.org/blog/qa-from-ecommerce-seo-fix-and-avoid-common-issues-webinar - I'd suggest starting there!
As a last resort, I'd look to block the URLs within robots.txt, but this prevents crawlers from flowing freely through your site and can result in poor indexation.
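If you do end up going the robots.txt route, it's worth checking that the rules actually match your URLs. Disallow rules are matched from the start of the path, so a rule like `/?sort=title` only covers the homepage with that query string, not `/Designer-Accessories.html?sort=title...`. Below is a rough Python sketch of the wildcard matching Google documents (`*` for any run of characters, `$` for end of URL). The function name `rule_matches` is hypothetical and the matcher is a simplification; it ignores Allow rules, rule precedence, and percent-encoding normalisation.

```python
import re
from urllib.parse import urlsplit


def rule_matches(pattern: str, url: str) -> bool:
    """Check one Disallow pattern against a URL's path plus query.

    Simplified: supports '*' wildcards and a trailing '$' anchor only.
    """
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query

    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]

    # Escape literal pieces and join them with '.*' for each '*'.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"

    # re.match anchors at the start, mirroring prefix matching.
    return re.match(regex, target) is not None


url = (
    "http://www.dellamoda.com/Designer-Accessories.html"
    "?sort=title&use_selected_filter=Y&view=all"
)
for rule in ("/?sort=title", "/*?sort=", "/*?use_selected_filter=Y", "/*use_selected_filter=Y"):
    print(rule, rule_matches(rule, url))
```

Note that `/*?use_selected_filter=Y` only matches when that parameter comes first in the query string, so a looser pattern such as `/*use_selected_filter=Y` may be closer to what you intend.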