Sorting Dupe Content Pages
-
Hi,
I'm no Excel pro, and I'm having a bit of a challenge interpreting the Crawl Diagnostics export .csv file.
I'd like to see at a glance which of my pages (and I have many) are the worst offenders for dupe content, i.e. which have the most "Other URLs" associated with them.
Thanks, would appreciate any advice on how other people are using this data, and/or how Moz recommends doing it.
-
CMC is correct - that's how I do it for larger sites.
- delete all columns except the URL column (col A) and the duplicate pages column (now col B)
- in cell C2, enter this formula: =LEN(B2). It counts the characters in the dupe pages cell
- drag that cell down to the last row
- select all three columns and sort by col C, largest to smallest
Obviously this isn't going to give you an exact number of dupe pages, since URL text strings vary in length, but it does give you a pretty good idea of the worst offenders.
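If you'd rather skip the spreadsheet, the same steps can be sketched in Python. This is only a rough sketch: the column names "URL" and "Duplicate Page Content" are my assumptions (check the header row of your own export), and the sample rows are made up for illustration.

```python
import csv
import io

# Hypothetical column names; check the header row of your own export.
URL_COL = "URL"
DUPES_COL = "Duplicate Page Content"

def rank_by_dupe_length(csv_text):
    """Return (url, length) pairs, longest dupes cell first (same idea as =LEN(B2))."""
    rows = csv.DictReader(io.StringIO(csv_text))
    # A missing or empty dupes cell counts as length 0.
    ranked = [(row[URL_COL], len(row[DUPES_COL] or "")) for row in rows]
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked

# Made-up sample data standing in for the exported file.
sample = (
    "URL,Duplicate Page Content\n"
    "http://example.com/a,http://example.com/a?x=1\n"
    'http://example.com/b,"http://example.com/b?x=1, http://example.com/b?x=2"\n'
    "http://example.com/c,\n"
)

for url, length in rank_by_dupe_length(sample):
    print(url, length)
```

For a real export you would read the file's contents and pass that in instead of `sample`. As with the spreadsheet version, this ranks by character length, so it is still only a rough ordering.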
-
I've found this a little frustrating, too. The display on the web will show the number of duplicate URLs, but the exported spreadsheet does not. It does, however, list all of the duplicate URLs in one cell -- so you could calculate the character length of that cell and then sort by that column, and that would give you a rough ranking.
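If you want something closer to the count the web display shows, one option is to count the URLs in the cell rather than its characters. A minimal sketch, assuming every duplicate URL in the cell is absolute (starts with http:// or https://); how the export separates the URLs isn't stated in this thread, so the sketch avoids depending on a delimiter. The sample data is invented.

```python
import re

# Assumption: each duplicate URL in the cell starts with http:// or https://.
URL_START = re.compile(r"https?://")

def count_urls(cell):
    """Count how many URLs appear in one 'duplicate pages' cell."""
    return len(URL_START.findall(cell or ""))

# Invented example: page URL -> contents of its dupes cell.
dupes_by_page = {
    "http://example.com/a": "http://example.com/a?x=1",
    "http://example.com/b": "http://example.com/b?x=1, https://example.com/b?x=2",
    "http://example.com/c": "",
}

worst_first = sorted(
    dupes_by_page,
    key=lambda page: count_urls(dupes_by_page[page]),
    reverse=True,
)
for page in worst_first:
    print(page, count_urls(dupes_by_page[page]))
```

One caveat: a URL that embeds another full URL (say, in a query string) would be counted twice, so treat the result as a close estimate rather than an exact figure.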