Why is rel="canonical" pointing at a URL with parameters bad?
-
Context
Our website has a large number of crawl issues stemming from duplicate page content (source: Moz).
According to an SEO firm which recently audited our website, some amount of these crawl issues are due to URL parameter usage. They have recommended that we "make sure every page has a Rel Canonical tag that points to the non-parameter version of that URL…parameters should never appear in Canonical tags."
Here's an example URL where we have parameters in our canonical tag...
http://www.chasing-fireflies.com/costumes-dress-up/womens-costumes/
rel="canonical" href="http://www.chasing-fireflies.com/costumes-dress-up/womens-costumes/?pageSize=0&pageSizeBottom=0" />
Our website runs on IBM WebSphere v 7.
Questions
- Why it is important that the rel canonical tag points to a non-parameter URL?
- What is the extent of the negative impact from having rel canonicals pointing to URLs including parameters?
- Any advice for correcting this?
Thanks for any help!
-
Thanks for the response, Eric.
My research suggested the same plan of attack: 1) fixing the canonical tags and 2) Google Search Console URL Parameters. It's helpful to get your confirmation.
My best guess is that the parameters you've cited above are not needed for every URL. I agree that this looks like something WebSphere Commerce probably controls. I'm a few organizational layers removed from whoever set this up for us. I'll try to track down where we can control that.
-
Thanks Peter!
-
Peter has a great answer with some good resources referenced, and i'll try to add on a little bit:
1. Why it is important that the rel canonical tag points to a non-parameter URL?
It's important to use clean URLs so search engines can understand the site structure (like Peter mentioned), which will help reduce the potential for index bloat and ranking issues. The more pages out there containing the same content (ie duplicate content), the harder it will be for search engines to determine which is the best page to show in search results. While there is no "duplicate content penalty" there could be a self inflicted wound by providing too many similar options. The canonical tag is supposed to be a level of control for you to tell Google which page is the most appropriate version. In this case it should be the clean URL since that will be where you want people to start. Users can customize from there using faceted navigation or custom options.
2. What is the extent of the negative impact from having rel canonicals pointing to URLs including parameters?
Basically duplicate content and indexing issues. Both of those things you really want to avoid when running an eComm shop since that will make your pages compete with each other for ranking. That could cost ranking, visits, and revenue if implemented wrong.
3. Any advice for correcting this?
Fix the canonical tags on the site would be your first step. Next you would want to exclude those parameters in the parameter handling section of Google Search Console. That will help by telling Google to ignore URLs with the elements you add in that section. It's another step to getting clean URLs showing up in search results.
I tried getting to http://www.chasing-fireflies.com/costumes-dress-up/mens-costumes/ and realize the parameters are showing up by default like: http://www.chasing-fireflies.com/costumes-dress-up/mens-costumes/#w=*&af=cat2:costumedressup_menscostumes%20cat1:costumedressup%20pagetype:products
Are the parameters needed for every URL? Seems like this is a websphere commerce setup kind of thing.
-
Clean (w/o parameters) canonical URL helps Google to understand better your url structure and avoid several mistakes:
https://googlewebmastercentral.blogspot.bg/2013/04/5-common-mistakes-with-relcanonical.html <- mistake N:1
http://www.hmtweb.com/marketing-blog/dangerous-rel-canonical-problems/ <- mistake N:4So - your company that giving this advise is CORRECT! You should provide naked URLs everywhere when it's possible.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
What to do with parameter urls?
We have a ton of ugly parameter urls that are coming up in google, in semrush, etc. What do we do with them? I know they can cause issues. EX https://www.hibbshomes.com/wp-content/themes/highstand/assets/js/cubeportfolio/js/jquery.cubeportfolio.min.js?ver=6.3
Intermediate & Advanced SEO | | stldanni0 -
Canonical page 1 and rel=next/prev
Hi! I'm checking a site that has something like a News section, where they publish some posts, quite similar to a blog.
Intermediate & Advanced SEO | | teconsite
They have a canonical url pointing to the page=1. I was thinking of implementing the rel=next/ prev and the view all page and set the view all page as the canonical. But, as this is not a category page of an ecommerce site, and it would has more than 100 posts inside in less than a year, It made me think that maybe the best solution would be the following Implementing rel=next/prev
Keep page 1 as the canonical version. I don't want to make the users wait for a such a big page to load (a view all with more than 100 elements would be too much, I think) What do you think about this solution? Thank you!0 -
Dilemma about "images" folder in robots.txt
Hi, Hope you're doing well. I am sure, you guys must be aware that Google has updated their webmaster technical guidelines saying that users should allow access to their css files and java-scripts file if it's possible. Used to be that Google would render the web pages only text based. Now it claims that it can read the css and java-scripts. According to their own terms, not allowing access to the css files can result in sub-optimal rankings. "Disallowing crawling of Javascript or CSS files in your site’s robots.txt directly harms how well our algorithms render and index your content and can result in suboptimal rankings."http://googlewebmastercentral.blogspot.com/2014/10/updating-our-technical-webmaster.htmlWe have allowed access to our CSS files. and Google bot, is seeing our webapges more like a normal user would do. (tested it in GWT)Anyhow, this is my dilemma. I am sure lot of other users might be facing the same situation. Like any other e commerce companies/websites.. we have lot of images. Used to be that our css files were inside our images folder, so I have allowed access to that. Here's the robots.txt --> http://www.modbargains.com/robots.txtRight now we are blocking images folder, as it is very huge, very heavy, and some of the images are very high res. The reason we are blocking that is because we feel that Google bot might spend almost all of its time trying to crawl that "images" folder only, that it might not have enough time to crawl other important pages. Not to mention, a very heavy server load on Google's and ours. we do have good high quality original pictures. We feel that we are losing potential rankings since we are blocking images. I was thinking to allow ONLY google-image bot, access to it. But I still feel that google might spend lot of time doing that. **I was wondering if Google makes a decision saying, hey let me spend 10 minutes for google image bot, and let me spend 20 minutes for google-mobile bot etc.. or something like that.. , or does it have separate "time spending" allocations for all of it's bot types. I want to unblock the images folder, for now only the google image bot, but at the same time, I fear that it might drastically hamper indexing of our important pages, as I mentioned before, because of having tons & tons of images, and Google spending enough time already just to crawl that folder.**Any advice? recommendations? suggestions? technical guidance? Plan of action? Pretty sure I answered my own question, but I need a confirmation from an Expert, if I am right, saying that allow only Google image access to my images folder. Sincerely,Shaleen Shah
Intermediate & Advanced SEO | | Modbargains1 -
Can too many "noindex" pages compared to "index" pages be a problem?
Hello, I have a question for you: our website virtualsheetmusic.com includes thousands of product pages, and due to Panda penalties in the past, we have no-indexed most of the product pages hoping in a sort of recovery (not yet seen though!). So, currently we have about 4,000 "index" page compared to about 80,000 "noindex" pages. Now, we plan to add additional 100,000 new product pages from a new publisher to offer our customers more music choice, and these new pages will still be marked as "noindex, follow". At the end of the integration process, we will end up having something like 180,000 "noindex, follow" pages compared to about 4,000 "index, follow" pages. Here is my question: can this huge discrepancy between 180,000 "noindex" pages and 4,000 "index" pages be a problem? Can this kind of scenario have or cause any negative effect on our current natural SEs profile? or is this something that doesn't actually matter? Any thoughts on this issue are very welcome. Thank you! Fabrizio
Intermediate & Advanced SEO | | fablau0 -
How use Rel="canonical" for our Website
How is the best way to use Rel="canonical" for our website www.ofertasdeemail.com.br, for we can say goodbye for duplicated pages? I appreciate for every help. I also hope to contribute to the SEOmoz community. Sincerely,
Intermediate & Advanced SEO | | ZZNINTERNETMEDIAGROUP
Amador Goncalves0 -
Can you Canonical to a URL in a different folder under the same domain?
I want to know if it's possible to add a canonical tag to a URL that points to a URL under a different folder. Content is just about the same. Here's an example (fake urls and product, but structure and parameters are similar to my client's website): domain.com/toy-ducks-results.aspx?color=Purple&model=Elvis domain.com/toy-ducks-details.aspx?color=Purple&model=Elvis&style=Sparkly Let's say that my purple Elvis ducks are really popular. Is there any harm in putting a rel=canonical on the Sparkly Elvis ducks page to the purple Elvis ducks page? Even though they are two different folders? /toy-ducks-results and /toy-ducks-details So, in effect, the preferred folder is /toy-ducks-results Thanks in advance for any help.
Intermediate & Advanced SEO | | EEE30 -
Can I use a "no index, follow" command in a robot.txt file for a certain parameter on a domain?
I have a site that produces thousands of pages via file uploads. These pages are then linked to by users for others to download what they have uploaded. Naturally, the client has blocked the parameter which precedes these pages in an attempt to keep them from being indexed. What they did not consider, was they these pages are attracting hundreds of thousands of links that are not passing any authority to the main domain because they're being blocked in robots.txt Can I allow google to follow, but NOT index these pages via a robots.txt file --- or would this have to be done on a page by page basis?
Intermediate & Advanced SEO | | PapaRelevance0 -
How to keep the link juice in E-commerce to an "out of stock" products URL?
I am running an e-commerce business where I sell fashion jewelry. We usually have 500 products to offer and some of them we have only one in stock. What happens is that many of our back links are pointed directly to a specific product, and when a product is sold out and no longer is in stock the URL becomes inactive, and we lose the link juice. What is the best practice or tool to 301-redirect many URLs at the same time without going and changing one URL at a time? Do you have any other suggestions on how to manage an out of stock product but still maintain the link juice from the back link? Thanks!
Intermediate & Advanced SEO | | ikomorin0