Duplicate Content

Prop65

I am trying to get a handle on how to fix and control the large amount of duplicate content that keeps showing up in my Moz reports.

The main area where this comes up is for duplicate page content and duplicate title tags ... thousands of them.

I partially understand the source of the problem. My site mixes free content with content that requires a login, and a large number of the duplicates occur at the site login. I think that if I changed my crawl settings so the crawler gets past the login and indexes the paid content, it would lower the number of duplicate pages and help me identify the true duplicates.

Unfortunately, it's not that simple in my case. Last year I ran into a problem when migrating my archives into a new CMS: the app in the CMS that migrated the data truncated a large amount of it, which means I am piecing together my archives of approximately 5,000 articles. Much of that process requires me to keep the former app that manages the articles so I can find where an article was truncated, copy the text that followed the truncation, and complete the article. So far I have restored about half of the archives, which is time-consuming, tedious work.

Does anyone know a more efficient way of identifying and editing duplicate pages and title tags?
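
For the identification half of this question, here is a minimal sketch of one possible approach. It assumes the crawl results can be exported as rows of URL, title, and body text. The keys `url`, `title`, and `body`, the helper names, and the sample URLs are all hypothetical and do not reflect Moz's export format. It groups pages that share a normalized title and pages whose normalized body text is identical:

```python
import hashlib
from collections import defaultdict


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences don't hide duplicates."""
    return " ".join(text.lower().split())


def find_duplicates(pages):
    """Group URLs that share a normalized title or identical normalized body text.

    `pages` is an iterable of dicts with the hypothetical keys "url", "title", "body".
    Returns (dup_titles, dup_bodies); each maps a grouping key to a list of URLs.
    """
    by_title = defaultdict(list)
    by_body = defaultdict(list)
    for page in pages:
        by_title[normalize(page["title"])].append(page["url"])
        # Hash the body so long articles can be compared cheaply.
        digest = hashlib.sha256(normalize(page["body"]).encode("utf-8")).hexdigest()
        by_body[digest].append(page["url"])
    # Keep only groups with more than one URL, i.e. actual duplicates.
    dup_titles = {k: v for k, v in by_title.items() if len(v) > 1}
    dup_bodies = {k: v for k, v in by_body.items() if len(v) > 1}
    return dup_titles, dup_bodies


# Made-up sample rows: two login redirects and one real article.
pages = [
    {"url": "/login?next=/a", "title": "Log In", "body": "Please log in."},
    {"url": "/login?next=/b", "title": "Log In", "body": "Please  log in."},
    {"url": "/articles/1", "title": "Article One", "body": "Full text of article one."},
]
dup_titles, dup_bodies = find_duplicates(pages)
print(dup_titles)
```

Grouping this way would put all the login-page duplicates into one bucket, which makes it easier to set them aside and look at whatever duplicates remain. Exact hashing only catches identical text, so truncated articles would still need a separate comparison.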

Prop65

What you're describing sounds like registration spam, which can be a royal pain. About two years ago, spammers found a vulnerability in my CMS and used bots to inject spam into the registration fields that ask for a user profile. I ended up removing write access for the user profiles and adding reCAPTCHA for all registrations. When it first happened, my site had more than 8,000 bot registrations overnight. Good times!
