Moz Crawler suddenly reporting 1000s of duplicates (BE.net)
-
In the last 3-4 days we've had several thousand 'duplicate content' warnings appear in our crawl report, 99% of them related to our on-site blog. The blog is BlogEngine.Net, but the pages simply don't exist. The majority seem to be Roger trying quasi-random URLs like:
/?page=410/?page=151
Etc. etc. The blog will present content for these requests, but it is of course the same empty page since there's only unique content for up to /?Page=10 or so.
Two questions:
1. Did something change recently? These blogs have been up for months, and this problem has only come up this week. Did Roger change to become more aggressive lately?
2. Suggested remediation? On one of the blogs I've put no-index, no-follow on any page that has a /?page querystring, and we'll see what effect that has in next week's crawl. However, I'm not sure this will work as per:
http://moz.com/community/q/functionality-of-seomoz-crawl-page-reports
Anyone else had dynamic blogs suddenly blossom into thousands of duplicate content warnings? Google (rightly) ignores these pages completely.
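For anyone wanting to try the same thing, the no-index rule from question 2 could be sketched roughly like this. It is only an illustration in Python, not BlogEngine.Net code: the function name `robots_meta_for` is made up, and matching the parameter case-insensitively is an assumption based on the URLs above being written both as ?page= and ?Page=.

```python
from urllib.parse import urlsplit, parse_qs


def robots_meta_for(url: str) -> str:
    """Return a robots meta tag for any URL carrying a page querystring.

    Hypothetical helper: returns an empty string for URLs without
    a page parameter, so normal posts stay indexable.
    """
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    # Compare keys case-insensitively (assumption: ?page= and ?Page= behave the same).
    has_page = any(key.lower() == "page" for key in query)
    if has_page:
        return '<meta name="robots" content="noindex, nofollow" />'
    return ""
```

Calling `robots_meta_for("/?page=410/?page=151")` returns the meta tag, while a URL without a page parameter returns an empty string.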
-
Hate to bump my own question, but it appears I spoke too soon about no-index,no-follow solving this. The duplicate errors went away for about 5 days, but then yesterday spiked with the same problem. I've confirmed that no-index, no-follow are present on the pages being detected as bad.
As per the best practices document:
http://moz.com/learn/seo/robotstxt
Using meta robots no index no follow is the recommended option:
Block with Meta NoIndex
This tells engines they can visit, but are not allowed to display the URL in results. This is the recommended method
But it apparently isn't working, as evidenced by the new surge of duplicate errors. Is there anything else I can do? I don't want to explicitly block Roger in robots.txt as that seems rather backward. Should Roger be included in the Bad Robots List?
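Confirming that the directives are really present on a flagged page can be done by hand, but a small check along these lines would also work. This is a minimal sketch using only Python's standard library; the class and function names are hypothetical, and it only inspects HTML you have already downloaded (it does not fetch anything or look at HTTP headers).

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None for bare attributes, so default to "".
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for part in attr.get("content", "").split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def robots_directives(html: str) -> set:
    """Return the set of robots directives declared in the given HTML."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.directives
```

If `"noindex"` is in the returned set for a page that still shows up as a duplicate, the tag is being served and the question becomes how the crawler treats it.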
-
Peter -
Thanks for the clarification. I understand the philosophy at hand, and I kind of even understood it before I had asked the question. I'm handling these with a mix of canonical and no-index/no-robot.
Related to that, update:
By marking the superfluous pages no-index/no-follow the error count for the site has diminished by about 10,000 and the warning count by about 28,000 so that seems to be the way to go. The pages that had content are 'low value' in this context, since that content was readily available elsewhere.
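One way a mix of canonical and no-index could be decided per URL is sketched below. This is an assumption about how such a rule might look, not a description of the exact setup used here: `head_tag_for` and `LAST_REAL_PAGE` are made-up names, the cutoff of 10 comes from the "up to /?Page=10 or so" estimate earlier in the thread, and a real canonical link would normally use an absolute URL.

```python
from urllib.parse import urlsplit, parse_qs

# Assumption: the last page number that actually has posts on it.
LAST_REAL_PAGE = 10


def head_tag_for(url: str, last_real_page: int = LAST_REAL_PAGE) -> str:
    """Pick a head tag for a blog URL (hypothetical helper).

    - No page parameter: nothing extra is emitted.
    - Page number within the real range: canonical link to a clean ?page=N URL.
    - Anything else (out of range or malformed): noindex, nofollow.
    """
    parts = urlsplit(url)
    query = {
        key.lower(): values
        for key, values in parse_qs(parts.query, keep_blank_values=True).items()
    }
    if "page" not in query:
        return ""
    raw = query["page"][0]
    if raw.isdecimal() and 1 <= int(raw) <= last_real_page:
        canonical = f"{parts.path or '/'}?page={int(raw)}"
        return f'<link rel="canonical" href="{canonical}" />'
    return '<meta name="robots" content="noindex, nofollow" />'
```

With this rule, `/?Page=3` gets a canonical link to `/?page=3`, while `/?page=410` and `/?page=410/?page=151` both get the noindex tag.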
-
Hi there!
Thanks for writing in with a great question.
We definitely count those dynamic URLs as duplicate content. While we are pretty sure that search engines can figure this stuff out and know which URL to index, it's still considered best practice to canonicalize or otherwise direct crawlers to the original URL (as far as I know. I'm not a professional SEO, so you might be better off asking the Pro Q&A community at www.moz.com/community/q - they are all SEOs like you).
Since some dynamic URL generators can cause problems for crawlers, we do try to be overly-inclusive of these issues rather than overly-exclusive. We want people to know about potential issues with sites, even if they're not really issues in the scheme of the site owner's specific SEO implementation plan.
In sum, we'd rather leave those judgments up to you and, at the same time, provide you with the data you need to make these decisions. I hope this helps explain our thinking here! However, if you think that our crawler might be having issues, and you do not want to post your site URLs here, you could always send us a support ticket at help@moz.com. That way we can examine it a bit further and provide some insights into why our crawler thinks this way!
Hope this helps!
Peter
Moz Help Team.