Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Why is rel="canonical" pointing at a URL with parameters bad?
-
Context
Our website has a large number of crawl issues stemming from duplicate page content (source: Moz).
According to an SEO firm which recently audited our website, some amount of these crawl issues are due to URL parameter usage. They have recommended that we "make sure every page has a Rel Canonical tag that points to the non-parameter version of that URL…parameters should never appear in Canonical tags."
Here's an example URL where we have parameters in our canonical tag...
http://www.chasing-fireflies.com/costumes-dress-up/womens-costumes/
rel="canonical" href="http://www.chasing-fireflies.com/costumes-dress-up/womens-costumes/?pageSize=0&pageSizeBottom=0" />
Our website runs on IBM WebSphere v 7.
Questions
- Why it is important that the rel canonical tag points to a non-parameter URL?
- What is the extent of the negative impact from having rel canonicals pointing to URLs including parameters?
- Any advice for correcting this?
Thanks for any help!
-
Thanks for the response, Eric.
My research suggested the same plan of attack: 1) fixing the canonical tags and 2) Google Search Console URL Parameters. It's helpful to get your confirmation.
My best guess is that the parameters you've cited above are not needed for every URL. I agree that this looks like something WebSphere Commerce probably controls. I'm a few organizational layers removed from whoever set this up for us. I'll try to track down where we can control that.
-
Thanks Peter!
-
Peter has a great answer with some good resources referenced, and i'll try to add on a little bit:
1. Why it is important that the rel canonical tag points to a non-parameter URL?
It's important to use clean URLs so search engines can understand the site structure (like Peter mentioned), which will help reduce the potential for index bloat and ranking issues. The more pages out there containing the same content (ie duplicate content), the harder it will be for search engines to determine which is the best page to show in search results. While there is no "duplicate content penalty" there could be a self inflicted wound by providing too many similar options. The canonical tag is supposed to be a level of control for you to tell Google which page is the most appropriate version. In this case it should be the clean URL since that will be where you want people to start. Users can customize from there using faceted navigation or custom options.
2. What is the extent of the negative impact from having rel canonicals pointing to URLs including parameters?
Basically duplicate content and indexing issues. Both of those things you really want to avoid when running an eComm shop since that will make your pages compete with each other for ranking. That could cost ranking, visits, and revenue if implemented wrong.
3. Any advice for correcting this?
Fix the canonical tags on the site would be your first step. Next you would want to exclude those parameters in the parameter handling section of Google Search Console. That will help by telling Google to ignore URLs with the elements you add in that section. It's another step to getting clean URLs showing up in search results.
I tried getting to http://www.chasing-fireflies.com/costumes-dress-up/mens-costumes/ and realize the parameters are showing up by default like: http://www.chasing-fireflies.com/costumes-dress-up/mens-costumes/#w=*&af=cat2:costumedressup_menscostumes%20cat1:costumedressup%20pagetype:products
Are the parameters needed for every URL? Seems like this is a websphere commerce setup kind of thing.
-
Clean (w/o parameters) canonical URL helps Google to understand better your url structure and avoid several mistakes:
https://googlewebmastercentral.blogspot.bg/2013/04/5-common-mistakes-with-relcanonical.html <- mistake N:1
http://www.hmtweb.com/marketing-blog/dangerous-rel-canonical-problems/ <- mistake N:4So - your company that giving this advise is CORRECT! You should provide naked URLs everywhere when it's possible.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Can Google read content that is hidden under a "Read More" area?
For example, when a person first lands on a given page, they see a collapsed paragraph but if they want to gather more information they press the "read more" and it expands to reveal the full paragraph. Does Google crawl the full paragraph or just the shortened version? In the same vein, what if you have a text box that contains three different tabs. For example, you're selling a product that has a text box with overview, instructions & ingredients tabs all housed under the same URL. Does Google crawl all three tabs? Thanks for your insight!
Intermediate & Advanced SEO | | jlo76130 -
Partial Match or RegEx in Search Console's URL Parameters Tool?
So I currently have approximately 1000 of these URLs indexed, when I only want roughly 100 of them. Let's say the URL is www.example.com/page.php?par1=ABC123=&par2=DEF456=&par3=GHI789= All the indexed URLs follow that same kinda format, but I only want to index the URLs that have a par1 of ABC (but that could be ABC123 or ABC456 or whatever). Using URL Parameters tool in Search Console, I can ask Googlebot to only crawl URLs with a specific value. But is there any way to get a partial match, using regex maybe? Am I wasting my time with Search Console, and should I just disallow any page.php without par1=ABC in robots.txt?
Intermediate & Advanced SEO | | Ria_0 -
Dilemma about "images" folder in robots.txt
Hi, Hope you're doing well. I am sure, you guys must be aware that Google has updated their webmaster technical guidelines saying that users should allow access to their css files and java-scripts file if it's possible. Used to be that Google would render the web pages only text based. Now it claims that it can read the css and java-scripts. According to their own terms, not allowing access to the css files can result in sub-optimal rankings. "Disallowing crawling of Javascript or CSS files in your site’s robots.txt directly harms how well our algorithms render and index your content and can result in suboptimal rankings."http://googlewebmastercentral.blogspot.com/2014/10/updating-our-technical-webmaster.htmlWe have allowed access to our CSS files. and Google bot, is seeing our webapges more like a normal user would do. (tested it in GWT)Anyhow, this is my dilemma. I am sure lot of other users might be facing the same situation. Like any other e commerce companies/websites.. we have lot of images. Used to be that our css files were inside our images folder, so I have allowed access to that. Here's the robots.txt --> http://www.modbargains.com/robots.txtRight now we are blocking images folder, as it is very huge, very heavy, and some of the images are very high res. The reason we are blocking that is because we feel that Google bot might spend almost all of its time trying to crawl that "images" folder only, that it might not have enough time to crawl other important pages. Not to mention, a very heavy server load on Google's and ours. we do have good high quality original pictures. We feel that we are losing potential rankings since we are blocking images. I was thinking to allow ONLY google-image bot, access to it. But I still feel that google might spend lot of time doing that. **I was wondering if Google makes a decision saying, hey let me spend 10 minutes for google image bot, and let me spend 20 minutes for google-mobile bot etc.. or something like that.. , or does it have separate "time spending" allocations for all of it's bot types. I want to unblock the images folder, for now only the google image bot, but at the same time, I fear that it might drastically hamper indexing of our important pages, as I mentioned before, because of having tons & tons of images, and Google spending enough time already just to crawl that folder.**Any advice? recommendations? suggestions? technical guidance? Plan of action? Pretty sure I answered my own question, but I need a confirmation from an Expert, if I am right, saying that allow only Google image access to my images folder. Sincerely,Shaleen Shah
Intermediate & Advanced SEO | | Modbargains1 -
Using a lot of "Read More" Hidden text
My site has a LOT of "read more" and when a user click they will see a lot of text. "read more" is dark blue bold and clear to the user. It is the perfect for the user experience, since right below I have pictures and videos which is what most users want. Question: I expect few users will click "Read more" (however, some users will appreciate chance to read and learn more) and I wonder if search engines may think I am hiding text and this is a risky approach or simply discount the text as having zero value from an SEO perspective? Or, equally important: If the text was NOT hidden with a "Read more" would the text actually carry more SEO value than if it is hidden under a "read more" even though users will NOT read the text anyway? If yes, reason may be: when the text is not hidden, search engines cannot see that users are not reading it and the text carry more weight from an SEO perspective than pages where text is hidden under a "Read more" where users rarely click "read more".
Intermediate & Advanced SEO | | khi50 -
Multiple 301 redirects for a HTTPS URL. Good or bad?
I'm working on an ecommerce website that has a few snags and issues with it's coding. They're using https, and when you access the website through domain.com, theres a 301 redirect to http://www.domain.com and then this, in turn, redirected to https://www.domain.com. Would this have a deterimental effect or is that considered the best way to do it. Have the website redirect to http and then all http access is redirected to the https URL? Thanks
Intermediate & Advanced SEO | | jasondexter0 -
Rel=canonical tag on original page?
Afternoon All,
Intermediate & Advanced SEO | | Jellyfish-Agency
We are using Concrete5 as our CMS system, we are due to change but for the moment we have to play with what we have got. Part of the C5 system allows us to attribute our main page into other categories, via a page alaiser add-on. But what it also does is create several url paths and duplicate pages depending on how many times we take the original page and reference it in other categories. We have tried C5 canonical/SEO add-on's but they all seem to fall short. We have tried to address this issue in the most efficient way possible by using the rel=canonical tag. The only issue is the limitations of our cms system. We add the canonical tag to the original page header and this will automatically place this tag on all the duplicate pages and in turn fix the problem of duplicate content. The only problem is the canonical tag is on the original page as well, but it is referencing itself, effectively creating a tagging circle. Does anyone foresee a problem with the canonical tag being on the original page but in turn referencing itself? What we have done is try to simplify our duplicate content issues. We have over 2500 duplicate page issues because of this aliasing add-on and want to automate the canonical tag addition, rather than go to each individual page and manually add this tag, so the original reference page can remain the original. We have implemented this tag on one page at the moment with 9 duplicate pages/url's and are monitoring, but was curious if people had experienced this before or had any thoughts?0 -
Any penalty for having rel=canonical tags on every page?
For some reason every webpage of our website (www.nathosp.com) has a rel=canonical tag. I'm not sure why the previous SEO manager did this, but we don't have any duplicate content that would require a canonical tag. Should I remove these tags? And if so, what's the advantage - or disadvantage of leaving them in place? Thank you in advance for your help. -Josh Fulfer
Intermediate & Advanced SEO | | mhans1