Crawl Diagnostics bringing 20k+ errors as duplicate content due to session ids
-
Signed up to the trial version of Seomoz today just to check it out as I have decided I'm going to do my own SEO rather than outsource it (been let down a few times!). So far I like the look of things and have a feeling I am going to learn a lot and get results.
However I have just stumbled on something. After Seomoz dones it's crawl diagnostics run on the site (www.deviltronics.com) it is showing 20,000+ plus errors. From what I can see almost 99% of this is being picked up as erros for duplicate content due to session id's, so i am not sure what to do!
I have done a "site:www.deviltronics.com" on google and this certainly doesn't pick up the session id's/duplicate content. So could this just be an issue with the Seomoz bot. If so how can I get Seomoz to ignore these on the crawl?
Can I get my developer to add some code somewhere.
Help will be much appreciated. Asif
-
Hello Tom and Asif,
First of all Tom thanks for the excellent blog post re google docs.
We are also using the Jshop platform for one of our sites. And am not sure whether it is working correctly in terms of SEO. I just ran an seomoz crawl of the site and found that every single link in the list has a rel canonical in it, even the ones with session id's.
Here is an example:
www.strictlybeautiful.com/section.php/184/1/davines_shampoo/d112a41df89190c3a211ec14fdd705e9
www.strictlybeautiful.com/section.php/184/1/davines_shampoo
As Asif has pointed out the Jshop people say they have programmed it so that google cannot pick up the session ids, firstly is that even possible? And if I assume thats not an issue then what about the fact that every single page on the site has a rel canonical link on it?
Any help would be much appreciated.
<colgroup><col width="1074"></colgroup>
| |
| | -
Asif, here's the page with the information on the SEOmoz bot.
-
Thanks for the reply Tom. Spoke to our developer he has told me that the website platform (Jshop) does not show session ID's to the search engines so we are ok on that side. However as it doesn't recognise the Seomoz bot it shows it the session ID's. Do you know where I can find info on the Seomoz bot so we can see what it identifies itself as so it can be added to the list of recognised spiders?
Thanks
-
Hi Asif!
Firstly - I'd suggest that as soon as possible you address the core problem - the use of session ids in the URL. There are not many upsides to the approach and there are many downsides.That it doesn't show up with the site: command doesn't mean it isn't having a negative impact.
In the meantime, you should add a rel=canonical tag to all the offending pages pointing to the URL without the session id. Secondly, you could use robots.txt to block the SEOmoz bot from crawling pages with session ids, but it may affect the bots ability to crawl the site if all the links it is presented with are with session ids - which takes us back around to fixing the core problem.
Hope this helps a little!
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Unsolved Using Weglot on wordpress (errors)
Good day to you all, Does anyone have experience of the errors being pulled up by Moz about the utility of the weglot plugin on Wordpress? Moz is pulling up URLs such as: https://www.ibizacc.com/es/chapparal-2/?wg-choose-original=false These are classified under "redirect issues" and 99% of the pages are with the ?wg-choose parameter in the URL. Is this having an actual negative impact on my search or is it something more Moz related being highlighted. Any advice be appreciated and a resolution .. Im thinking I could exclude this parameter.
Moz Pro | | alwaysbeseen0 -
Duplicate Content in WordPress Taxonomies & Noindex, Follow
Hello Moz Community, We are seeing duplicate content issues in our Moz report for our WordPress site’s Tag pages. After a bit of research, it appears one of the best solutions is to set Tag pages to “no index, follow” within Yoast. That makes sense, but we have a few questions: In doing this, how are we affecting our opportunity to show up in search results? Are there any other repercussions to making this change? What would it take to make the content on these pages be seen as unique?
Moz Pro | | CoreyHicks1 -
404 errors in SEOMoz crawl tool
I currently have several 404 errors in the latest crawls from SEOMoz. Here is an example of the error. http://dealerplatform.com/blog/2011/10/23/videos-for-auto-dealers/www.dealerplatform.com/ In all cases the error is a result of www.dealerplatform.com being added to the real url. Anyone seen this before? The site is a wordpress mutlisite. I don't see where this incorrect link is showing up anywhere on the website. Any advice would be helpful. Thanks
Moz Pro | | Chris_Gregory0 -
How long is a full crawl?
It's been now over 3 days that the dashboard for one of our campaigns shows "Next Crawl in Progress!". I am not complaining about the length... but I have to agree that SEOMoz is quite addictive, and it's quite frustrating to see that everyday 🙂 Thanks
Moz Pro | | jgenesto0 -
Duplicate Content
I have tried searching for an exact example of the issues I am seeing, but didn't come up with anything. I decided to post my own question so I can get a direct answer on what I am experiencing. I recently took over a website and its' existing SEO practices with it. Upon placing the site on SEOmoz, I received many (LOTS) of duplicate content warnings. Pretty much, this is how the website is setup: domain.com/keyword-is-here/ but it is also coming up as domain.com/keyword-is-here/index.htm - Should I setup a redirect so domain.com/keyword-is-here/index.htm points to domain.com/keyword-is-here.htm or should I just leave it alone since it's pointing to the same exact? Any information on this questions is greatly appreciated in advance.
Moz Pro | | EQ-Richie0 -
Campaign 4XX error gives duplicate page URL
I ran the report for my site and had many more 4xx errors than I've had in the past month. I updated my .htaccess to include 301 statements based on Google Webmaster Tools Crawl Errors. Google has been reporting a positive downward trend in my errors, but my SEOmoz campaign has shown a dramatic increase in the 4xx pages. Here is an example of an 4xx URL page: http://www.maximphotostudio.net/engagements/266/inniswood_park_engagements/http:%2F%2Fwww.maximphotostudio.net%2Fengagements%2F266%2Finniswood_park_engagements%2F This is strange because URL: http://www.maximphotostudio.net/engagements/266/inniswood_park_engagements/ is valid and works great, but then there is a duplicate entry with %2F representing forward slashes and 2 http statements in each link. What is the reason for this?
Moz Pro | | maximphotostudio1 -
Why are these pages considered duplicate page content?
A recent crawl diagnostic for a client's website had several new duplicate page content errors. The problem is, I'm not sure where the error comes from since the content in the webpage is different from one another. Here's the pages that SEOMOZ reported to have duplicate page content errors: http://www.imaginet.com.ph/wireless-internet-service-providers-term http://www.imaginet.com.ph/antivirus-term http://www.imaginet.com.ph/berkeley-internet-name-domain http://www.imaginet.com.ph/customer-premises-equipment-term The only thing similar that I see is the headline which says "Glossary Terms Used in this Site" - I hope that the one sentence is the reason for the error. Any input is appreciated as I want to find out the best solution for my client's website errors. Thanks!
Moz Pro | | TheNorthernOffice790 -
Crawl Issues
My website - qtmoving.com - has 26 articles and when the SEOmoz did a crawl it only found 13 articles. Can someone please give me some insight as to why not all pages are being crawled.
Moz Pro | | CohesiveMarketing0