Crawl Diagnostics Report
-
I'm a bit concerned about the results I'm getting from the Crawl Diagnostics Report.
I've updated the site with canonical URLs to remove duplicate content, and when I check the site it all displays the right values. But the report, which has just finished crawling, is still showing a lot of pages as duplicate content.
Simple example:
Both of them are in the duplicate content section, although both have the canonical URL set as:
Does each crawl check the entire site from the beginning or just the pages it didn't have a chance to crawl the last time?
This is just one of 333 duplicate content pages that have a canonical URL pointing to the right page.
Can someone please explain?
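One way to double-check what a page actually declares, independent of any crawl report, is to pull the canonical link out of its HTML. This is only a minimal sketch using Python's standard library; the sample markup and the example.com URL are invented for illustration and are not the pages from this question.

```python
from html.parser import HTMLParser


class CanonicalParser(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated tokens, so split before matching.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr_map.get("href"):
            self.canonicals.append(attr_map["href"])


def find_canonicals(html):
    """Return every canonical URL declared in the given HTML source."""
    parser = CanonicalParser()
    parser.feed(html)
    return parser.canonicals


# Hypothetical page source, for illustration only.
sample = """
<html><head>
<title>Example</title>
<link rel="canonical" href="https://example.com/widgets/">
</head><body></body></html>
"""
print(find_canonicals(sample))  # ['https://example.com/widgets/']
```

If two URLs that are reported as duplicates both return the same single canonical URL here, the tags themselves are probably in place, and the question becomes how the report treats them.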
-
Yep!
That's why I really like the CSV files: you can sort and filter them down to exactly what you want to see.
-
Hi Kenny,
Thanks for getting back to me.
So is it just the way it's reported on the page, and not an actual duplicate content problem?
-
Hi Sebastian,
Sorry for the confusion. Our software currently reports those URLs as having both duplicate content and canonical tags. I find that the best way to view this information is to export your Crawl Diagnostics CSV file. You can find the export option in the upper right of the Crawl Diagnostics page.
Kenny
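For anyone who would rather script the filtering than do it in a spreadsheet, here is a rough sketch of the idea in Python. The column names (`url`, `duplicate_page_content`, `rel_canonical`) and the sample rows are assumptions made up for this example; the real export will have its own headers, so adjust them to match your file.

```python
import csv
import io

# Hypothetical export: column names and rows are invented for illustration
# and will differ from the real crawl diagnostics CSV.
SAMPLE_CSV = """url,duplicate_page_content,rel_canonical
https://example.com/a,TRUE,https://example.com/a
https://example.com/a?sort=price,TRUE,https://example.com/a
https://example.com/b,FALSE,https://example.com/b
https://example.com/c,TRUE,
"""


def split_duplicates(csv_file):
    """Split rows flagged as duplicates by whether a canonical URL is present."""
    with_canonical, without_canonical = [], []
    for row in csv.DictReader(csv_file):
        # Skip rows that are not flagged as duplicate content.
        if row["duplicate_page_content"].strip().upper() != "TRUE":
            continue
        target = with_canonical if row["rel_canonical"].strip() else without_canonical
        target.append(row["url"])
    return with_canonical, without_canonical


handled, needs_review = split_duplicates(io.StringIO(SAMPLE_CSV))
print("Flagged but canonicalized:", handled)
print("Flagged with no canonical:", needs_review)
```

To run it against a real export, pass a file object instead of the sample string, for example `open("export.csv", newline="")`, where the file name is a placeholder. The second list is the one worth looking at first, since those pages are flagged and have no canonical at all.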