Crawl Diagnostics Report
-
I'm a bit concerned about the results I'm getting from the Crawl Diagnostics Report.
I've updated the site with canonical URLs to remove duplicate content, and when I check the pages directly they all display the right values. But the report, which has just finished crawling, is still showing a lot of pages as duplicate content.
Simple example:
Both of them appear in the duplicate content section, although both have the canonical URL set as:
Does each crawl check the entire site from the beginning or just the pages it didn't have a chance to crawl the last time?
This is just one of 333 duplicate content pages, each of which has a canonical URL pointing to the right page.
Can someone please explain?
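For anyone in the same situation, a quick way to confirm what canonical a page is actually serving (independent of what the crawl report says) is to parse it out of the live HTML. This is a minimal sketch using only the Python standard library; any URLs you feed it are your own.

```python
# Minimal sketch: extract the rel="canonical" href from a page's HTML
# using only the standard library's HTMLParser.
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs with lowercased names
        a = dict(attrs)
        if tag == "link" and a.get("rel") == "canonical":
            self.canonical = a.get("href")


def extract_canonical(html_text):
    """Return the canonical URL declared in the page, or None."""
    parser = CanonicalFinder()
    parser.feed(html_text)
    return parser.canonical

# Usage: fetch a page with urllib.request.urlopen(...), decode it, and pass
# the HTML string to extract_canonical() to see what the page really serves.
```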
-
Yep!
That's why I really like the csv files because you can sort stuff and filter things down to specifically what you want to see.
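The sort-and-filter step Kenny describes can also be scripted. A minimal sketch in Python, noting that the file name and column headers ("URL", "Issue Type") are assumptions about the export format, not the exact Moz column names:

```python
# Sketch: filter an exported crawl-diagnostics CSV down to the rows you
# care about. Column names here are assumed, not Moz's exact headers.
import csv


def filter_rows(path, column, value):
    """Return rows from the CSV whose `column` equals `value`."""
    with open(path, newline="") as f:
        return [row for row in csv.DictReader(f) if row.get(column) == value]

# e.g. duplicates = filter_rows("crawl_diagnostics.csv",
#                               "Issue Type", "Duplicate Page Content")
```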
-
Hi Kenny,
Thanks for getting back to me.
So is it just the way it is reported on the page, and not an actual duplicate content problem?
-
Hi Sebastian,
Sorry for the confusion. Our software currently reports those URLs as having both duplicate content and canonical tags. I find that the best way to view this information is by exporting your Crawl Diagnostics CSV file. You can easily locate the export functionality in the upper right of the Crawl Diagnostics page.
Kenny
Related Questions
-
Crawl Diagnostics Summary Problem
We added a robots.txt file to our website, and there are pages blocked by robots.txt. The Crawl Diagnostics Summary page shows no pages blocked by robots.txt. Why?
Moz Pro | iskq
-
Duplicate page report
We ran a CSV spreadsheet of our crawl diagnostics related to duplicate URLs, after waiting 5 days with no response on how Rogerbot can be made to filter. My IT lead tells me the label on the spreadsheet says "duplicate URLs", and that is literally what the spreadsheet is showing: it treats a database ID number as the only valid part of a URL. To replicate, just filter the spreadsheet for any number that you see on the page. For example, filtering for 1793 gives us the following result:
http://truthbook.com/faq/dsp_viewFAQ.cfm?faqID=1793
http://truthbook.com/index.cfm?linkID=1793
http://truthbook.com/index.cfm?linkID=1793&pf=true
http://www.truthbook.com/blogs/dsp_viewBlogEntry.cfm?blogentryID=1793
http://www.truthbook.com/index.cfm?linkID=1793
There are a couple of problems with the above:
1. It gives the www result as well as the non-www result.
2. It sees the print version (&pf=true) as a duplicate, but these are blocked from Google via a noindex header tag.
3. It thinks that different sections of the website with the same ID number (faq / blogs / pages) are the same thing.
In short, this particular report tells us nothing at all. I am trying to get a perspective from someone at SEOmoz on whether he is reading the result correctly or is missing something. Please help. Jim
Moz Pro | jimmyzig
-
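One way to make an export like the one above usable is to normalize the URLs before grouping, so that www/non-www variants and print versions collapse into one key instead of matching on the numeric ID alone. A sketch of that idea using Python's standard library; the helper name and the choice of parameters to drop are assumptions for illustration:

```python
# Sketch: normalize exported URLs so www/non-www variants and the
# print-version flag (&pf=true) collapse into one key. The decision to
# strip only the "pf" parameter is an assumption for this example.
from urllib.parse import urlsplit, parse_qsl, urlencode, urlunsplit


def normalize(url):
    """Return a canonicalized form of `url` for duplicate grouping."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    # Drop the print-version flag but keep all other query parameters,
    # so faqID=1793 and linkID=1793 remain distinct.
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != "pf"]
    return urlunsplit((parts.scheme, host, parts.path, urlencode(query), ""))
```

With this, the two index.cfm variants in the list above map to the same key, while the faq and blog URLs stay separate.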
Crawl Diagnostics - unexpected results
I received my first Crawl Diagnostics report last night on my dynamic ecommerce site. It showed errors on generated URLs which simply are not produced anywhere when running on my live site, only when running on my local development server. It appears that the crawler doesn't think it's running on the live site. For example, http://www.nordichouse.co.uk/candlestick-centrepiece-p-1140.html goes to a Product Not Found page, and therefore duplicate content errors are produced, while http://www.nhlocal.co.uk/candlestick-centrepiece-p-1140.html produces the correct product page and not a Product Not Found page. Any thoughts?
Moz Pro | nordichouse
-
Campaign Crawl Report
Hello, just a quick one: is there any way I can run a crawl report for something in a campaign so I can compare the changes? I know you can do a separate crawl test, but it won't show the differences, and the next crawl date isn't until the 28th.
Moz Pro | Prestige-SEO
-
Is there a way in the Pro campaign to initiate an instant crawl?
I have a campaign that's on a very small website at this point. It's only 9 pages, so a crawl would only take a few minutes. Is there a way to start a quick crawl at my request instead of having to wait a week to see if I fixed the issues it found during the last crawl? I've been hunting around, but I don't see a way.
Moz Pro | mytouchoftech
-
Colors in keyword Difficulty Report
Hi everyone, two quick questions today.
1. How can I find out what the different colors within the Keyword Difficulty Report represent, and where can I see examples of how this information can help us with our data analysis?
2. The second question I have is regarding the Term Extractor. It seems that when I ran a domain it provided the wrong data. For example, it stated that a certain keyword exists a certain number of times within the description and title of the page, but when I looked at the source this was not the case, so it made studying the competition harder. Any suggestions, or has anyone else noticed this? Thanks in advance for all your help.
Moz Pro | DRTBA
-
On ranking reports
Hello. I was wondering why SEOmoz doesn't update ranking reports daily as opposed to weekly. Rankings do change overnight, and since other services do it (e.g. sescout.com) I can't really see why this one doesn't.
Moz Pro | phaistonian
-
Question about when new crawls start
Hi everyone, I'm currently using the trial of SEOmoz and I absolutely love what I'm seeing. However, I have 2 different websites (one has over 10,000 pages and one has about 40 pages). I've noticed that the smaller website is crawled every few days, while the larger site hasn't been crawled in a few days. Although both campaigns state that the sites won't be crawled until next Monday, is there any way to get the crawl to start sooner on the large site? The reason I ask is that I've implemented some changes that will likely decrease the number of pages that are crawled, based on the recommendations on this site. So, I'm excited to see the potential changes. Thanks, Brian
Moz Pro | beeneeb