I want to create a report of only the duplicate content pages as a CSV file so I can create a script to canonicalize them.
-
I want to create a report of only the duplicate content pages as a CSV file so I can create a script to canonicalize them. I'd like to get something like this:
http://example.com/page1, http://example.com/page2, http://example.com/page3, http://example.com/page4,
Right now I have to open each page under "Issue: Duplicate Page Content" one by one, and that takes a lot of time.
The same goes for duplicate page titles.
-
Hi nvs.nim,
Could you tell me what you did differently? I also get an empty AF column.
-
Thanks! Because Excel didn't separate the fields correctly, I didn't have column AF. But I've got it now! Thanks a lot!
-
Josh is right: when you export as CSV, there should be a column in the spreadsheet named
duplicate_page_content
This column contains all the URLs that are considered duplicates.
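For the scripting part, here is a minimal Python sketch of how the export could be turned into one line per group of duplicates. It is not Moz's own tooling, just an illustration. Only the duplicate_page_content column name comes from this thread; the "URL" column name, the comma separator inside the cell, and the file names are assumptions you should check against your own export. Reading the file with a real CSV parser also avoids the problem of Excel splitting the fields incorrectly.

```python
import csv


def duplicate_groups(rows, url_col="URL", dup_col="duplicate_page_content", sep=","):
    """Collect groups of duplicate URLs from crawl export rows.

    url_col and sep are assumptions about the export layout; only the
    duplicate_page_content column name is taken from the thread.
    """
    groups = set()
    for row in rows:
        cell = (row.get(dup_col) or "").strip()
        if not cell:
            continue  # no duplicates listed for this page
        urls = {u.strip() for u in cell.split(sep) if u.strip()}
        urls.add(row[url_col].strip())  # include the page itself in its group
        groups.add(tuple(sorted(urls)))  # identical groups collapse into one
    return sorted(groups)


def write_report(groups, out):
    """Write one CSV line per duplicate group to an open text file."""
    writer = csv.writer(out)
    for group in groups:
        writer.writerow(group)


def export_report(in_path, out_path):
    """Read the crawl export at in_path and write the grouped report to out_path."""
    with open(in_path, newline="", encoding="utf-8") as f:
        groups = duplicate_groups(csv.DictReader(f))
    with open(out_path, "w", newline="", encoding="utf-8") as out:
        write_report(groups, out)
```

Calling export_report("crawl_export.csv", "duplicates.csv") (both file names are hypothetical) would write lines in the page1, page2, page3 shape asked for above. Groups are only merged when they contain exactly the same URLs, so if the export lists duplicates asymmetrically you may see overlapping groups that need a further merge step.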
-
Yes, it does: column AF contains a list of the duplicate page content URLs.
-
It doesn't tell me what other pages are identical. Only that there are identical pages.
-
Well, SEOmoz Pro does it! Just go to Crawl Diagnostics -> Duplicate Page Content, then click Export as CSV at the top right.