Duplicate content issues caused by our CMS
-
Hello fellow mozzers,
Our in-house CMS - which is usually good for SEO purposes, as it gives us full control over directories, filenames, browser titles etc. and so prevents unwieldy / meaningless URLs and generic title tags - seems to have got itself into a bit of a tizz when it comes to one of our clients.
We have tried solving the problem to no avail, so I thought I'd throw it open and see if anyone has a solution, or whether it's just a fault in our CMS.
Basically, the SEs are indexing two identical pages, one ending with a / and the other ending /index.php, for one of our sites (www.signature-care-homes.co.uk).
We have gone through the site and made sure the links all point to just one of these, and have done the same for off-site links, but there is still the duplicate content issue of both versions getting indexed.
We also set up an .htaccess file to redirect to the chosen version, but to no avail, and we're not sure canonical will work for this issue, as / pages should redirect to /index.php anyway - and that's what we can't work out. We have set the .htaccess file to point to index.php, which is what should be happening anyway, but it isn't.
Is there an alternative way of telling the SEs to only look at one of these two versions?
Also, we are currently rewriting the content and changing the structure - will this change the situation we find ourselves in?
-
Hi Nick,
Given that you have tried all of the above, I recommend cutting off the search engines at the source, in your robots.txt.
Once you manually exclude the page in your robots.txt file, the search engines will no longer crawl the page, and after enough time passes it should disappear from the SEs' cache.
Here is a Moz tutorial on how to exclude the page: Robots Exclusion Protocol
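Before relying on an exclusion rule, it can help to check which URLs it actually blocks. The following is a minimal sketch using Python's standard `urllib.robotparser`; the rule set in `ROBOTS_TXT` and the helper name `is_crawlable` are assumptions made for illustration, not the site's real robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule set (an assumption, not the site's actual file):
# block only the /index.php duplicate of the home page.
ROBOTS_TXT = """\
User-agent: *
Disallow: /index.php
"""

SITE = "http://www.signature-care-homes.co.uk"


def is_crawlable(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The duplicate is blocked, the "/" version stays crawlable.
home_duplicate_ok = is_crawlable(ROBOTS_TXT, SITE + "/index.php")
home_ok = is_crawlable(ROBOTS_TXT, SITE + "/")
# Rules are matched as path prefixes, so an index.php inside a
# subdirectory is NOT covered by this one line.
subdir_duplicate_ok = is_crawlable(ROBOTS_TXT, SITE + "/about/index.php")
```

Note that the single `Disallow` line only covers the root duplicate; if the CMS produces an index.php copy in every directory, each would need its own rule or a different approach.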
Just a heads up: you may want to give it a week or so for the SEs to catch up on all the work you have already done to resolve the issue. Then try the above solution.
Good luck!
-
You have redirected the index.php version to the / version and it doesn't work? Sounds like you made an error in your .htaccess file, then. Make sure your redirects are correct and that every index.php URL redirects to the / version, then use the canonical tag to specify the / version as the one you want. Wait a couple of weeks and it should get fixed just fine. If it isn't, you probably didn't set up the 301 redirects properly.
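The mapping described above (every index.php URL 301s to its / version, and the canonical tag names the / version) can be sketched in Python to make the intended behaviour explicit. This is only an illustration of the logic, not the .htaccess rules themselves; the function names `canonical_url`, `redirect_for` and `canonical_tag` are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

INDEX_FILE = "index.php"


def canonical_url(url: str) -> str:
    """Return the '/' form of a URL whose path ends in /index.php."""
    parts = urlsplit(url)
    path = parts.path
    if path.endswith("/" + INDEX_FILE):
        # Drop the file name but keep the trailing slash.
        path = path[: -len(INDEX_FILE)]
    return urlunsplit(
        (parts.scheme, parts.netloc, path, parts.query, parts.fragment)
    )


def redirect_for(url: str):
    """Return (301, target) if the URL should redirect, otherwise None."""
    target = canonical_url(url)
    return (301, target) if target != url else None


def canonical_tag(url: str) -> str:
    """Build the rel=canonical element pointing at the '/' version."""
    return f'<link rel="canonical" href="{canonical_url(url)}" />'
```

For example, `redirect_for("http://www.signature-care-homes.co.uk/index.php")` yields a 301 to the / version, while the / version itself yields `None`, so there is no redirect loop. If the live site answers differently for either URL, that points to the .htaccess rules as the place to look.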