What is Considered Duplicate Content by Crawlers?
-
I am asking this because I have a couple of site audit tools that I use to crawl a site I work on every week and they are showing duplicate content issues (which I know there is a lot on this site) but some of what is flagged as duplicate content makes no sense.
For example, the following URL's were grouped together as duplicate content:
|
https://www.firefold.com/contact-us
|
| https://www.firefold.com/sale |
|
|
How are these pages duplicate content? I am confused on what site audit tools are considering duplicate content.
Just FYI, this is data from Moz crawl diagnostics but SEMrush site auditor is giving me the same type of data.
Any help would be greatly appreciated.
Ryan
-
Yea I just started working on this site. I haven't used Moz Analytics much so just wanting to see how their crawler crawls pages.
And yes I agree, there are a lot of BIG BIG BIG issues with this site.
I got a large workload over the next few months haha.
-
I would add that there's is no text on any of those three pages - any "text" one would see there is actually just embedded in an image - which is a huge issue for a number of reasons:
- Search engines see that there's no text - a big no-no.
- You're getting practically no SEO value from the content that would be there, even if there isn't much.
- It's heavier this way - which makes load times slower.
I want to clarify that there are many, bigger issues with these pages - but as your question concerns only duplicate content, I'll leave all of that out for the time being. To summarize, Google, Yahoo, and Bing are just seeing some duplicate banners, sidebars, etc. and then some images in the body of your pages. Hence, duplicate content.
-
Thanks for that information.
It makes sense looking at the data and pages from that perspective.
-
Hi Ryan!
Our crawler will flag pages that have at least 90% similarity in the entire source code of the site so not just the body.
The way you want to interpret the report is the contact-us page has 35 duplicates, so "gabe" and "sale" are not dupes of each other in this section but are only each a duplicate of "contact-us". Those URLs might appear with their own duplicates of the same pages further down in the report.
While on the front end the pages do not appear to be similar. The issue is likely with the amount of javascript code on those pages.
Our crawler cannot read javascript so we are likely only able to see the template of the page. Other search tools are probably seeing the same thing as it returns 79% similarity using this tool: http://www.freebulkseotools.com/similar-page-checker-tool.php
I can't provide much insight from a dev perspective but hope this helps!
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Crawler triggering Spam Throttle and creating 4xx errors
Hey Folks, We have a client with an experience I want to ask about. The Moz crawler is showing 4xx errors. These are happening because the crawler is triggering my client's spam throttling. They could increase from 240 to 480 page loads per minute but this could open the door for spam as well. Any thoughts on how to proceed? Thanks! Kirk
Moz Bar | | kbates1 -
Is MOZ any good to analyze an e-commerce site? How come that a cms page can be seen as duplicate content with a category page?
Hi Guys, I've been using Moz for quite a long time now for 2 of my shops. Now I am in the process of launching the second shop and I just don't understand how is it possible that a cms static page (About US) to be seen as a duplicate content with other 96 pages - including product pages and other totally different pages such as delivery information, category pages, returns and so on. Really MOZ?? Is it me or you?? Your help would be much appreciated! Thank you!
Moz Bar | | Sorin_T0 -
Many Duplicate Content Flags
Not sure about you all, but I’m loving the new Moz Site Crawler. However, I was noticing that it is identifying a huge amount of pages as duplicate content. There are about 30,000 pages in this website, with that said we’ve had to make many templates to make the site scalable. Additionally a url rule was lost which caused a significant amount of duplicate pages to be created. I am working through using the moz crawl tool to identify duplicate pages but noticing many pages under “Affected Pages,” are actually unique content pages with initial content that is duplicate. I read that Moz flags any pages with 90% or more content overlapping content or code. My theory for this is that some templates that are too similar, to the point that Moz reads them as duplicative. Has this happened for anyone else? In addition, if Moz is flagging these similar pages as duplicate content, do we surmise that Google bots are having the same issue? We have seen issues with rankings as it pertains to the actual duplicate pages but hadn't experienced issues across the unique pages, they are hyperlocal pages so we are able to see rankings quite easily.
Moz Bar | | HZseo0 -
MozBot Finding Duplicate Pages That Aren't Duplicate
I've been reviewing the technical audits for my campaign in Moz, and noticed I had a number of duplicate content issues that I'm not really sure how to address. When I click on the links of what the duplicates are, they are all different links that have different content/images. Based on what I was seeing other's wrote in the forum, this could be because the code base is really the same between these pages, and many of these were using query parameters (I'm assuming that is why the code is almost exactly the same across these pages), so example: website.com/tags/KEYWORD1?type=KEYWORD2 is a duplicate of website.com/tags/KEYWORD3?type=KEYWORD4 I was reading that I can use that URL Parameters area in google search console, but my search console says that the googlebot isn't experiencing issues, so I wasn't sure if that was the right move. I can't do the canonicals because these pages all have different content on them, and I know duplicate content is a big SEO issue, so I really wasn't sure what my next steps should be. Thanks for the help!
Moz Bar | | amaray4030 -
Duplicate Content on Website with Multiple Locations
Hi there, I've spent hours reading posts on duplicate content and googling this but I'm still not sure what to do. We created a site that has two WP installs for a company with two different locations - the landing page is website.com and links to WP install 1 (website.com/city1), and WP install 2 (website.com/city2). They specifically wanted two different sites so they could be managed by staff at either location. However some of the pages have the same content - ie. services, policies, etc. so all of those are showing errors for duplicate content. All pages have different city-specific URL's and meta-descriptions but that clearly doesn't help. We can't redirect the "duplicate" pages because then it would take the user to the other city's specific site. Is there anything we can do?? Is this going to significantly damage rankings? Thanks kindly for any help you can provide.
Moz Bar | | charlie0071 -
Why is the exact same URL being seen as duplicate and showing an error in my SEO reports
Well, I am still having duplicate page issues. I have a question about one of the errors SEO is giving me when I download a crawl report. I am going to attach a screen shot of part of the report so you can see for yourself, along with explaining it here. SEO shows the list of URL's that it crawled in the report. In this(see attachment) portion of the report it has 321 results for the exact same URL. It also says all of these exact same URL's have received a 404 error. What I want to know is how does it make 321 results for the same URL? And with this error that I don't see when I look at the page? 0hkRDST
Moz Bar | | JoshMaxAmps0 -
Ok, I have a strange duplicate content issue.
Just ran MOZ against my site and about fell of my chair, over 1k pages being flagged as having duplicate content. That's about every page on the site. I downloaded the CSV file and things got even more confusing, The URL's under "URL" and " Duplicate Page Content" area exactly the same. Am I missing something? I would expect the URL under Duplicate Page Content to show a different URL, to the page with the exact same content. Any help would be appreciated, traffic is tanking and I need to get this figured out. Thanks,
Moz Bar | | Trigger130 -
Crwal errors : duplicate content even with canonical links
Hi I am getting some errors for duplicate content errors in my crawl report for some of our products www.....com/brand/productname1.html www.....com/section/productname1.html www.....com/productname1.html we have canonical in the header for all three pages <link rel="canonical" href="www....com productname1.html"=""></link rel="canonical" href="www....com>
Moz Bar | | phes0