Craw Diagnostics Questions
-
SEO Moz is reporting that I have 50+ pages with a duplicate content issue based on this URL: http://www. f r e d aldous.co.uk/art-shop/art-supplies/art-canvas.html?manufacturer=178
But I have included this tag in the source: rel="canonical" href="http://www.f r e daldous.co.uk/art-shop/art-supplies/art-canvas.html"/>
(I have purposefully added white space to the URLs in this message as I'm not sure about the rules for posting links here)
I though this "canonical" tag prevented the duplicate content being indexed?
is the reporting by SEOMoz wrong or being over cautious?
-
Hi Niall,
This isn't a case of the canonical tag being properly applied, but a case where two or more pages are so similar in code that they are setting off the SEOmoz duplicate content flags.
First of all, those pages look different to us humans. But the SEOmoz web app uses a similarity threshold of 95% of the html code. This takes everything on the page, both hidden and visible into account.
In this case, it's counting all of the navigation and sidebar as well, which is significant. What's left of the unique content - the part that matters, makes up less than 5% of the code.
Here's a tool you can use to check the similarity: http://www.duplicatecontent.net/
I ran the pages through a couple of tools which showed 98% HTML similarity. And 99% text similarity.
For perspective, take a look at Google's cached versions of one of these pages. This is how googlebot sees the page: http://webcache.googleusercontent.com/search?q=cache:mdybPKIjOxUJ:www.fredaldous.co.uk/craft-shop/general-crafts.html+http://www.fredaldous.co.uk/craft-shop/general-crafts.html&hl=en&gl=us&strip=1
That, as we say, is a lot of links!
Since Panda, when I see a site with this many navigation links, I usually advise them to restructure their site architecture into more of a Pyramid shape, so that you reduce the overall navigation on each page.
Hope this helps! Best of luck with your SEO.
-
It claims that this is one of the duplicate URLS:
http://www.f r e daldous.co.uk/photo-gift/design-led-gifts.html?manufacturer=436
Now I am confused as page is no where near duplicate content of the URL I posted 1st.
Can anyone explain this?
-
Helo Niall,
It seems that you have inserted the rel="canonical" href= in the correct spot. I think the software is giving you the potentials which is always a bonus precaution. I really don't want to make a premature determination without knowing which 50 pages are showing up as duplicate. A deeper look will allow me to give you a more accurate response.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
URL structuring / redirect question
Hi there, I have a URL structuring / redirect question. I have many pages on my site but I set each page up to fall under one of two folders as I serve two unique markets and want each side to be indexed properly. I have SIDE A: www.domain/FOLDER-A.com and SIDE B: www.domain/FOLDER-B. The problem is that I have a page for www.domain.com and www.domain/FOLDER-A/page1.com but I do NOT have a page for www.domain/FOLDER-A. The reason for this is that I've opted to make what would be www.domain/FOLDER-A be www.domain.com and act the primary landing page the site. As a result, there is no page located at www.domain/FOLDER-A. My WordPress template (Divi by Elegant Themes) forced me to create a blank page to be able to build off the FOLDER-A framework. My question is that given I am forced to have this blank page, do I leave it be or create a 302 or 307 redirect to www.domain.com? I fear using a 301 redirect given I may want to utilize this page for content at some point in the future. This isn't the easiest post to follow so please let me know if I need to restate the question. Many thanks in advance!
Technical SEO | | KurtWSEO0 -
Another Duplicate Content - eCommerce Question!
We are manufacturers of about 15 products and our website provides information about the products. We also offer them for sale on the site. Recently we partnered with a large eCommerce site that sells many of these types of products. They lifted descriptions from our site for theirs and are now selling our products. They have higher DA than us. Will this cause a ranking problem for us? Should we write unique descriptions for them? Thanks!
Technical SEO | | Chris6610 -
Specific question about pagination prompted by Adam Audette's Presentation at RKG Summit
This question is prompted by something Adam Audette said in this excellent presentation: http://www.rimmkaufman.com/blog/top-5-seo-conundrums/08062012/ First, I will lay out the issues: 1. All of our paginated pages have the same URL. To view this in action, go here: http://www.ccisolutions.com/StoreFront/category/audio-technica , scroll down to the bottom of the page and click "Next" - look at the URL. The URL is: http://www.ccisolutions.com/StoreFront/IAFDispatcher, and for every page after it, the same URL. 2. All of the paginated pages with non-unique URLs have canonical tags referencing the first page of the paginated series. 3. http://www.ccisolutions.com/StoreFront/IAFDispatcher has been instructed to be neither crawled nor indexed by Google. Now, on to what Adam said in his presentation: At about minute 24 Adam begins talking about pagination. At about 27:48 in the video, he is discussing the first of three ways to properly deal with pagination issues. He says [I am somewhat paraphrasing]: "Pages 2-N should have self-referencing canonical tags - Pages 2-N should all have their own unique URLs, titles and meta descriptions...The key is, with this is you want deeper pages to get crawled and all the products on there to get crawled too. The problem that we see a lot is, say you have ten pages, each one using rel canonical pointing back to page 1, and when that happens, the products or items on those deep pages don't get get crawled...because the rel canonical tag is sort of like a 301 and basically says 'Okay, this page is actually that page.' All the items and products on this deeper page don't get the love." Before I get to my question, I'll just throw out there that we are planning to fix the pagination issue by opting for the "View All" method, which Adam suggests as the second of three options in this video, so that fix is coming. My question is this: It seems based on what Adam said (and our current abysmal state for pagination) that the products on our paginated pages aren't being crawled or indexed. However, our products are all indexed in Google. Is this because we are submitting a sitemap? Even so, are we missing out on internal linking (authority flow) and Google love because Googlebot is finding way more products in our sitemap that what it is seeing on the site? (or missing out in other ways?) We experience a lot of volatility in our rankings where we rank extremely well for a set of products for a long time, and then disappear. Then something else will rank well for a while, and disappear. I am wondering if this issue is a major contributing factor. Oh, and did I mention that our sort feature sorts the products and imposes that new order for all subsequent visitors? it works like this: If I go to that same Audio-Technica page, and sort the 125+ resulting products by price, they will sort by price...but not just for me, for anyone who subsequently visits that page...until someone else re-sorts it some other way. So if we merchandise the order to be XYZ, and a visitor comes and sorts it ZYX and then googlebot crawls, google would potentially see entirely different products on the first page of the series than the default order marketing intended to be presented there....sigh. Additional thoughts, comments, sympathy cards and flowers most welcome. 🙂 Thanks all!
Technical SEO | | danatanseo0 -
Questions about root domain setup
Hi There, I'm a recent addition to SEOmoz and over the past few weeks I've been trying to figure things out. This whole SEO process has been a bit of a brain burner but its slowly becoming a little more clearer. For awhile I noticed that I was unable to get Open Site Explorer to display information about my site. It mentioned that that there was not enough data for the URL. Too recent of a site, no links, etc. Eventually I changed the the URL to include "www." and it pulled up results. I also noticed that a few of my page warnings are because of duplicate page content. One page will be listed as http://enbphotos.com. The other will be listed as http://www.enbphotos.com. I guess I'm not sure what this all means and how to change it. I'm also not really sure what the terminology even is and something regarding root domain seemed appropriate but I'm not sure if it is accurate. Any help/suggestions/links would be appreciated! Thanks, Chris
Technical SEO | | enbphotos0 -
Link building question
ok so we paid the top firm in seo to help us build an seo strategy and i think we have a good one. We are changing our link building tactics and making more Pr related links and creating awesome content on blogs or our own site to generate traffic and links to our site. We have data from our engineer which should be interesting and we are going to sponsor events, do some link baiting with some of our articles, get a pr firm to get us some good articles on major sites and go to events around phily where we will have unique content and a unique perspective such as car shows ect. The problem is even though all the content will be linked to our site how do we link them. We got hit by penguin but in these articles or blogs should we use the anchor text for the word we are using. The company says dont do it right now bc we got hit with penguin and should only use the brand. I have no idea how only using the brand and not the keywords will magically make us rank for certain keywords. Anyone have an opinion. Thank you and we do pretty well with seo but we did get little bit of a hit with penguin that we are eliminating links and making a new way of thinking when it comes to link building. We also just hired a designer so we are going to build 100s of pages on the site to increase seo with unique content and that is also a goal of ours for the year. We have two marketers on staff and 4 programmers so we are able to do anything. Our urls are terrible but the rest of the site is pretty good
Technical SEO | | goldjake17880 -
Windows IIS 7 Redirect Question
I want to redirect the following 4 pages to the home page: http://www.phbalancedpool.com/pool-repair/pool_repair_arizona.html http://www.phbalancedpool.com/About%20Pool%20Cleaning%20Arizona/About_Page_Pool_Cleaning_Arizona.html http://www.phbalancedpool.com/specials/Pool%20Cleaning%20and%20Pool%20Repair%20Specials.html http://www.phbalancedpool.com/service-areas-in-arizona/Chandler_Gilbert_Mesa_Queen%20Creek_San%20Tan%20Valley.html This is what I am currently using for my Web.config file: <configuration></configuration> <match url=".*"></match> <add input="{HTTP_HOST}" pattern="^phbalancedpool.com$"></add> <action type="Redirect" url="http://www.phbalancedpool.com/{R:0}" <="" span="">redirectType="Permanent" /></action> <location path="pool-repair/pool_repair_arizona.html"></location> <location path="About%20Pool%20Cleaning%20Arizona/About_Page_Pool_Cleaning_Arizona.html"></location> <location path="specials/Pool%20Cleaning%20and%20Pool%20Repair%20Specials.html"></location> <location path="service-areas-in-arizona/Chandler_Gilbert_Mesa_Queen%20Creek_San%20Tan%20Valley.html"></location> Only the first one is actually redirecting and I can't figure out why. What do I need to do to fix this?
Technical SEO | | JordanJudson0 -
Robots.txt file question? NEver seen this command before
Hey Everyone! Perhaps someone can help me. I came across this command in the robots.txt file of our Canadian corporate domain. I looked around online but can't seem to find a definitive answer (slightly relevant). the command line is as follows: Disallow: /*?* I'm guessing this might have something to do with blocking php string searches on the site?. It might also have something to do with blocking sub-domains, but the "?" mark puzzles me 😞 Any help would be greatly appreciated! Thanks, Rob
Technical SEO | | RobMay0 -
Two basic questions re. Crawl Diagnostic results
I'm a novice...I've just run my crawl diagnostics and I wonder how important is it to a) Have meta-descriptions on every page, b) Have all titles less than 70 characters? Thanks in advance. Dan.
Technical SEO | | danfk0