I want to create a report of only the duplicate content pages as a CSV file so I can create a script to canonicalize them.
-
I want to create a report of only the duplicate content pages as a CSV file so I can write a script to canonicalize them. I'd like to get something like:
http://example.com/page1, http://example.com/page2, http://example.com/page3, http://example.com/page4,
Right now I have to open each page under "Issue: Duplicate Page Content", which takes a lot of time.
The same goes for duplicate page titles.
-
Hi nvs.nim,
Could you tell me what you did differently? I also get an empty AF column.
-
Thanks! Because Excel didn't separate the fields correctly, I didn't have column AF. But I've got it now! Thanks a lot!
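For anyone hitting the same Excel problem: a minimal sketch of reading the column by its spreadsheet letter with Python's csv module, which keeps quoted fields containing commas intact. The function names `column_index` and `read_column` are made up for this illustration, and the comma delimiter and single header row are assumptions about the export.

```python
import csv
import io


def column_index(letters):
    """Convert a spreadsheet column label such as 'AF' to a zero-based index."""
    index = 0
    for ch in letters.upper():
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index - 1


def read_column(csv_text, letters, delimiter=","):
    """Return the values of one column, skipping the header row.

    Rows that are too short to contain the column yield an empty string.
    """
    rows = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    next(rows, None)  # skip the header row (assumed to be present)
    i = column_index(letters)
    return [row[i] if i < len(row) else "" for row in rows]
```

With a real export you would pass the file contents and "AF", for example `read_column(open("export.csv", newline="").read(), "AF")`, where `export.csv` is a placeholder filename.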
-
Josh is right: when you export as CSV, there should be a column in the spreadsheet:
duplicate_page_content
This column contains all the URLs that are considered duplicates.
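Since the goal is a script, here is a rough sketch of turning the export into one line per page, followed by its duplicates. Only the `duplicate_page_content` column name comes from this thread. The `URL` column name, the comma separator inside the cell, and the function name `duplicate_groups` are assumptions, so check them against your own export first.

```python
import csv
import io


def duplicate_groups(csv_text, url_column="URL",
                     dup_column="duplicate_page_content", separator=","):
    """Return one list per page: the page URL followed by its duplicates.

    url_column and separator are assumptions about the export format.
    Pages with an empty duplicate cell are skipped.
    """
    groups = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        cell = (row.get(dup_column) or "").strip()
        if not cell:
            continue
        duplicates = [u.strip() for u in cell.split(separator) if u.strip()]
        groups.append([row[url_column].strip()] + duplicates)
    return groups


# Tiny made-up sample standing in for the real export.
sample = (
    "URL,duplicate_page_content\n"
    'http://example.com/page1,"http://example.com/page2,http://example.com/page3"\n'
    "http://example.com/page5,\n"
)

for group in duplicate_groups(sample):
    print(", ".join(group))
```

Splitting on a comma will break URLs that themselves contain commas, so adjust the separator if your export uses a different one.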
-
Yes, it does. In column AF there is a list of Duplicate Page Content URLs.
-
It doesn't tell me what other pages are identical. Only that there are identical pages.
-
Well, SEOmoz Pro does it! Just go to Crawl Diagnostics -> Duplicate Page Content, then click Export as CSV at the top right.