Crawl Diagnostics Questions
-
SEO Moz is reporting that I have 50+ pages with a duplicate content issue based on this URL: http://www. f r e d aldous.co.uk/art-shop/art-supplies/art-canvas.html?manufacturer=178
But I have included this tag in the source: <link rel="canonical" href="http://www.f r e daldous.co.uk/art-shop/art-supplies/art-canvas.html"/>
(I have purposefully added white space to the URLs in this message as I'm not sure about the rules for posting links here)
I thought this "canonical" tag prevented the duplicate content from being indexed?
Is the reporting by SEOMoz wrong or overcautious?
-
Hi Niall,
This isn't a problem with how the canonical tag is applied. It's a case where two or more pages are so similar in code that they set off the SEOmoz duplicate content flags.
First of all, those pages look different to us humans. But the SEOmoz web app uses a similarity threshold of 95% of the HTML code. This takes everything on the page, both hidden and visible, into account.
In this case, it's counting all of the navigation and sidebar as well, which is significant. What's left of the unique content, the part that matters, makes up less than 5% of the code.
Here's a tool you can use to check the similarity: http://www.duplicatecontent.net/
I ran the pages through a couple of tools, which showed 98% HTML similarity and 99% text similarity.
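If you'd like a rough feel for how this kind of check behaves, here is a minimal Python sketch. It is only an illustration: I don't know how the SEOmoz crawler actually computes similarity, so `difflib.SequenceMatcher` from the standard library is a stand-in, and the helper names `html_similarity` and `looks_duplicate`, the sample pages, and the way the 95% cutoff is applied are all assumptions.

```python
from difflib import SequenceMatcher

# Assumed cutoff, mirroring the 95% threshold mentioned above.
DUPLICATE_THRESHOLD = 0.95


def html_similarity(html_a: str, html_b: str) -> float:
    """Return a 0..1 similarity ratio between two raw HTML strings."""
    # autojunk=False stops difflib from ignoring very frequent characters.
    return SequenceMatcher(None, html_a, html_b, autojunk=False).ratio()


def looks_duplicate(html_a: str, html_b: str) -> bool:
    """Flag two pages as near-duplicates when they meet the assumed cutoff."""
    return html_similarity(html_a, html_b) >= DUPLICATE_THRESHOLD


# Hypothetical pages: a large shared navigation block plus a small unique part.
shared_nav = "<ul>" + "".join(
    f"<li><a href='/cat-{i}.html'>Category {i}</a></li>" for i in range(50)
) + "</ul>"
page_a = f"<html><body>{shared_nav}<h1>Art canvas</h1></body></html>"
page_b = f"<html><body>{shared_nav}<h1>Design-led gifts</h1></body></html>"

print(f"{html_similarity(page_a, page_b):.2%}")
print(looks_duplicate(page_a, page_b))
```

The two sample pages differ only in their heading, so the shared navigation dominates the comparison. The same effect applies when a real page carries a very large menu and only a small block of unique content.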
For perspective, take a look at Google's cached versions of one of these pages. This is how googlebot sees the page: http://webcache.googleusercontent.com/search?q=cache:mdybPKIjOxUJ:www.fredaldous.co.uk/craft-shop/general-crafts.html+http://www.fredaldous.co.uk/craft-shop/general-crafts.html&hl=en&gl=us&strip=1
That, as we say, is a lot of links!
Since Panda, when I see a site with this many navigation links, I usually advise the owner to restructure the site architecture into more of a pyramid shape, which reduces the overall navigation on each page.
Hope this helps! Best of luck with your SEO.
-
It claims that this is one of the duplicate URLs:
http://www.f r e daldous.co.uk/photo-gift/design-led-gifts.html?manufacturer=436
Now I am confused, as this page is nowhere near duplicate content of the URL I posted first.
Can anyone explain this?
-
Hello Niall,
It seems that you have inserted the rel="canonical" href= in the correct spot. I think the software is flagging potential duplicates, which is a useful precaution. I really don't want to make a premature determination without knowing which 50 pages are showing up as duplicates. A deeper look will allow me to give you a more accurate response.
Related Questions
-
Some questions about URL structure and multi country website
Hi,
I have a question for SEO experts and web developers.
I want to set up a job website for 5 countries. For each country I will provide daily job listings on the basis of:
1. jobs by category, for example: accounting jobs, IT jobs, sales jobs
2. jobs by city, for example: jobs in Boston, jobs in Chicago
3. jobs by company, for example: jobs in Facebook, jobs in Emirates
Example case: a company named "Emirates", located in "Boston", has a vacancy for an "accounting job" with a full-time position. In this case the job will be present in the following categories:
1. accounting jobs in Boston
2. jobs in Boston
3. jobs in Emirates
On opening any of the above options, there will be a filter box on the left side showing:
position, i.e. full time
salary, i.e. 1000-1500
location, i.e. Boston, Chicago
Q.1 When a user searches Google for "accounting jobs in boston", "jobs in boston" or "jobs in emirates", the same job will display. Which URL structure is recommended for each search term?
Q.2 How can we do on-page SEO for these terms, given that the job listings will change daily as new jobs are added, so the content keeps changing?
Q.3 Should I create the website on separate domains for each country, or on the same domain with a different folder for each country, i.e. .co.uk or .com/uk for the UK and .ae or .com/uae for the UAE?
Note: I will also attach a blog to it, and each blog will focus on country-specific knowledge, for example, how to find jobs in New York for the USA and how to find jobs in Dubai for the UAE.
Thanks in advance
Technical SEO | | Shahjahaaan0 -
Rel=canonical on landing page question
Currently we have two versions of a category page on our site (listed below).
Version A: www.example.com/category
• lives only in the SERPs but does not live in our site navigation
• has links
• user experience is not the best
Version B: www.example.com/category?view=all
• lives in our site navigation
• has a rel=canonical to version A
• very few links and doesn't appear in the SERPs
• user experience is better than version A
Because the user experience of version B is better than version A, I want to take out the rel=canonical in version B that points to version A and instead put a rel=canonical to version B in version A. If I do this, will version B eventually show up in the SERPs and replace version A? If so, how long do you think this would take? Will this essentially pass PageRank from version A to version B?
Technical SEO | | znotes0 -
Questionable SEO
Chess Telecom appears first when you search for 'business phone lines' in the UK, so I used a campaign to check them out. It seems they've got tons of unrelated links and are using comment spamming to increase their ranking, along with fake Twitter accounts and other things. Search for 'jewel jubic chess' and you'll see what I mean. I assumed this wasn't a good idea and have been trying to get my link on relevant websites only. Any comments or suggestions? Should I simply trust that Google will punish them eventually? Or should I be fighting fire with fire? Thanks, Dan
Technical SEO | | DanFromUK0 -
Mod rewrite question
Sorry in advance if this isn't the best place to ask this question.
Google Webmaster Tools has recently identified a ton of "Not Found" pages, which are actual pages with some digits appended at the end. For example, suppose an actual page on my blog is:
(A) http://www.example.com/blog/2012/09/my-post-title/
This page works just fine. However, GWT has identified the following page as a "not found" page:
(B) http://www.example.com/blog/2012/09/my-post-title/9157586677/1846732913010
This appears to be happening to hundreds of posts on my site. In each case, the "9157586677" portion of the URL is identical, but the remaining 13 digits change from page to page.
I haven't been able to determine exactly what is causing this to happen. It's probably a social plug-in for WordPress, or perhaps Disqus, but I'm not sure which one. I'll go through a process of elimination to narrow it down over the coming week.
As a quick fix, I'd like to create a mod_rewrite rule so that requests for (B) get 301 redirected to (A). Since there are hundreds of posts, I need to do this in a way that works regardless of what's in the "/2012/09/my-post-title/" part of the URL. Unfortunately, mod_rewrite is outside of my area of expertise. Can somebody please suggest how I can handle this? Thanks in advance.
PS - As for tracking down the cause, I've looked at the source of the pages in the "Linked From" area of GWT and the Not Found link is nowhere to be found. That is why I assume the bad link is being generated by some JavaScript that is part of one of my plug-ins.
Update: It seems like Disqus is the source of these phantom links. There's considerable discussion here. I'll continue searching for a long-term solution. Meanwhile, I'd still appreciate help with the mod_rewrite question above. Thanks again.
Technical SEO | | ahirai0 -
Drupal Question
So on our site we have a plugin for our fan gallery. The issue is that I am getting a lot of duplication errors and warnings that the URL is too long, and all of the errors are coming from the fan gallery, which has over 8,000 errors. It seems to be pulling a long-form query URL that has over 100 characters. You can't physically see it on the site, but the crawlers can. Anyway, I'm trying to figure out a fix for this. One method would be to just stop those pages from being crawled, but I would hate to do that, as the fan gallery would be a great source of links and content for us. So I'm wondering if anyone else has had an issue with these types of plugins before, where the user can upload a photo or do a video embed and it then submits to the site. If you have a better method, please let me know. I usually work on e-commerce platforms, so my experience with Drupal is limited.
Technical SEO | | KateGMaker0 -
Robots.txt question
What is this robots.txt telling the search engines?
User-agent: *
Disallow: /stats/
Technical SEO | | DenverKelly0 -
SEO-MOZ bar question on root vs subdomain / canonicalization issues
When I look at the SEO-MOZ bar for our site and click next to subdomain (# links from # domains), it shows my main incoming links etc. But when I click on root domain, it shows mydomain/default.asp and 4 incoming links, as well as a message that says this URL redirects to another URL. Does this imply canonicalization issues, or is there a 301 redirect to my non-/default.asp URL correcting this issue? Thanks kindly, Howard
Technical SEO | | mrkingsley0 -
WordPress Question: Canonical field in Category Section of Yoast SEO Plug In
I've added the Yoast SEO Plug In to my WordPress blog. When I add a new category, there is a listing called "Edit Category". On this page there is a listing called "Yoast WordPress SEO Settings." In this section, there are two fields for which I need guidance on what is supposed to be included. One: There is a field called "Canonical". What info is supposed to be entered in this field, and how does it need to be formatted? Is it a URL? If so, what URL is supposed to go there? Two: Breadcrumbs title. What is the purpose of this field? (Isn't it OK to just use the category name as the breadcrumb title?)
Technical SEO | | EricVallee340