How to resolve Duplicate Page Content issue for root domain & index.html?
-
SEOMoz returns a Duplicate Page Content error for a website's index page, with both domain.com and domain.com/index.html isted seperately. We had a rewrite in the htacess file, but for some reason this has not had an impact and we have since removed it. What's the best way (in an HTML website) to ensure all index.html links are automatically redirected to the root domain and these aren't seen as two separate pages?
-
great code Josh...but , after i saved it on .htaccess , a "?" appeared on the link..
http://www.domain.com/?/example/file.html
Is this ok ? pls advice/
Thank you,
-
You touched on a good point here "We set up our site to utilize a index redirect for all of our sub directories as well, so with this method you simply name your sub directories to match the url path that you desire. Each sub directory has it's own index which you redirect with a variation of the above code. By doing this you can have nice clean url paths like http://www.semclix.com/design/ecommerce/ - and mitigate the duplicate content issue. We hope that this helps."
Too often I see sites where they get the home page right but miss the re-write on the directories.
-
Here's the .htaccess rewrite command that you can use for the index.html redirect -
Options +FollowSymlinks RewriteEngine on
Index Rewrite RewriteRule ^index.(htm|html|php) http://www.amarasoftware.com/ [R=301,L] RewriteRule ^(.*)/index.(htm|html|php) http://www.amarasoftware.com/$1/ [R=301,L]
We set up our site to utilize a index redirect for all of our sub directories as well, so with this method you simply name your sub directories to match the url path that you desire. Each sub directory has it's own index which you redirect with a variation of the above code. By doing this you can have nice clean url paths like http://www.semclix.com/design/ecommerce/ - and mitigate the duplicate content issue. We hope that this helps.
-
I'd check it with some other software too... i.e. Raven Tools free trial or something, that will tell you if there's canonicalization problems... of course I'm not advocating Raven Tools over SEOmoz tools (I'm a member here and not there for good reasons), I just think best to try a few different tests before deciding if it's a problem. There might just be an issue with the SEOmoz campaign tool for the moment, which I'm sure they'll fix as soon as they realise.
Hey, aren't you the tutor I had in my SEC usability course?
-
Unfortunately I can't speak for how SEOmoz handles rewrites like this if it's already crawled the page.
The rewrite rule you're using looks like it's only rewriting the www portion of the URL, not index.html. So alone it wouldn't do anything to solve dupe content issues. (someone please correct me if I'm misreading the rewrite rule)
Here's a link to what I used to write a redirect for index.html on another site.
http://www.webmasterworld.com/forum92/6375.htm
I think it is a fairly safe assumption to make that SEOmoz is smart enough to realize if you're got a redirect in there (providing that its working). I'd still recommend taking a look to see if Google has cached or indexed an index.html version, though.
Edit: my personal, highly technical, acid-test for an index.html redirect is just going there and manually entering the url with index.html on the end, rather than waiting for a recrawl to see if you're heading in the right direction.
-
RewriteEngine on RewriteCond %{HTTP_HOST} ^([a-z.]+)?amarasoftware.com$ [NC] RewriteCond %{HTTP_HOST} !^www. [NC] RewriteRule .? http://www.%1amarasoftware.com%{REQUEST_URI} [R=301,L] Is what I use. In Seomoz this leads to www.amarasoftware.com and index.html so 2 different URL's, both with different incoming links, and a different authority, which has an impact on my ranking if correct. in SEomoz this a returns a duplicate title and meta tags errors. If SEOmoz finds 2 pages instead of one I may assume that Google agrees with this.
-
As you did, I'd normally handle this with a 301 from index.html to the root domain. When you say that it's "not had an impact" do you mean that the SEOmoz dashboard continues to show an error after it re-crawls, or that the search engines are not picking up the redirect?
SEOmoz dashboard does a great job, but I'd check to see how the search engines are actually indexing yourdomain.com/index.html vs. yourdomain.com also. If the search engines are indexing it as you want them to, then I'd be inclined to ignore the dashboard error.
I apologize if this is a stupid question, but I assume you manually checked that the redirect worked?
-
You wish to canonicalize the pages. That is the SEO word which describes exactly what you are trying to achieve.
Above are 5 URLs which can possibly lead to the exact same page. If you add the following HTML in the code then the pages will be canonicalized.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Why Would My Page Have a Higher PA and DA, Links & On-Page Grade & Still Not Rank?
The Search Term is "Alcohol Ink" and our client has a better page authority, domain authority, links to the page, and on-page grade than those in the SERP for spaces 5-10 and we're not even ranked in the top 51+ according to Moz's tracker. The only difference I can see is that our URL doesn't use the exact text like some of the 5-10 do. However, regardless of this, our on-page grade is significantly higher than the rest of them. The one thing I found was that there were two links to the page (that we never asked for) that had a spam score in the low 20's and another in the low 30's. Does anyone have any recommendations on how to maybe get around this? Certainly, a content campaign and linking campaign around this could also help but I'm kind of scratching my head. The client is reputable, with a solid domain age and well recognized in the space so it's not like it's a noob trying to get in out of nowhere.
Intermediate & Advanced SEO | | Omnisye0 -
Sub-domain vs Root domain
I have recently taken over a website (website A) that has a domain authority of 33/100 and is linked to from 39 root domains. I have not yet selected any keywords to target so am currently unsure of ranking positions. However, website A is for a division of a company that has its own separate website (website B) which has a domain authority of 58/100 and over 1000 legitimate linking root domains. I have the option of moving website A to a sub-domain of website B. I also have the option of having website B provide a followed link to website A. So, my question is, for SEO purposes, is my website better off remaining on its own existing domain or is it likely to rank higher as a sub-domain of website B? I am sure there are pros and cons for both options but some opinions would be much appreciated.
Intermediate & Advanced SEO | | BallyhooLtd0 -
Consolidating two different domains to point at same site, duplicate content penalty?
I have two websites that are extremely similar and want to consolidate them into one website by pointing both domain names at one website. is this going to cause any duplicate content penalties by having two different domain names pointing at the same site? Both domains get traffic so i don't want to just discontinue one of the domains.
Intermediate & Advanced SEO | | Ron100 -
Duplicate page title at bottom of page - ok, or bad?
Can I get you experts opinion? A few years ago, we customized our pages to repeat the page title at the bottom of the page. So the page title is in the breadcrumbs at the top, and then it's also at the bottom of the page under all the contents. Here is a sample page: bit.ly/1pYyrUl I attached a screen shot and highlighted the second occurence of the page title. Am worried that this might be keyword stuffing, or over optimizing? Thoughts or advice on this? Thank you so much! ron ZH8xQX6
Intermediate & Advanced SEO | | yatesandcojewelers0 -
Merge content pages together to get one deep high quality content page - good or not !?
Hi, I manage the SEO of a brand poker website that provide ongoing very good content around specific poker tournaments, but all this content is split into dozens of pages in different sections of the website (blog section, news sections, tournament section, promotion section). It seems like today having one deep piece of content in one page has better chance to get mention / social signals / links and therefore get a higher authority / ranking / traffic than if this content was split into dozens of pages. But the poker website I work for and also many other website do generate naturally good content targeting long tail keywords around a specific topic into different section of the website on an ongoing basis. Do you we need once a while to merge those content pages into one page ? If yes, what technical implementation would you advice ? (copy and readjust/restructure all content into one page + 301 the URL into one). Thanks Jeremy
Intermediate & Advanced SEO | | Tit0 -
WMT Index Status - Possible Duplicate Content
Hi everyone. A little background: I have a website that is 3 years old. For a period of 8 months I was in the top 5 for my main targeted keyword. I seemed to have survived the man eating panda but not so sure about the blood thirsty penguin. Anyway; my homepage, along with other important pages, have been wiped of the face of Google's planet. First I got rid of some links that may not have been helping and disavowed them. When this didn't work I decided to do a complete redesign of my site with better content, cleaner design, removed ads (only had 1) and incorporated social integration. This has had no effect at all. I filed a reconsideration request and was told that I have NOT had any manual spam penalties made against me, by the way I never received any warning messages in WMT. SO, what could be the problem? Maybe it's duplicate content? In WMT the Index Status indicates that there are 260 pages indexed. However; I have only 47 pages in my sitemap and when I do a site: search on Google it only retrieves 44 pages. So what are all these other pages? Before I uploaded the redesign I removed all the current pages from the index and cache using the remove URL tool in WMT. I should mention that I have a blog on Blogger that is linked to a subdomain on my hosting account i.e. http://blog.mydomain.co.uk. Are the blog posts counted as pages on my site or on Blogger's servers? Ahhhh this is too complicated lol Any help will be much appreciated! Many thanks, Mark.
Intermediate & Advanced SEO | | Nortski0 -
Page Indexed but not Cached
A section of pages on my site are indexed (I know because they appear in SERPs if I copy and paste a sentence from the content), however according to the text-only cached version of the page they are not being read by Google.Why are they indexed event hough it seems like Google is not reading them..... or is Google in fact reading this text even though it seems like they should not be?Thanks for your assistance.
Intermediate & Advanced SEO | | theLotter0 -
How to remove duplicate content, which is still indexed, but not linked to anymore?
Dear community A bug in the tool, which we use to create search-engine-friendly URLs (sh404sef) changed our whole URL-structure overnight, and we only noticed after Google already indexed the page. Now, we have a massive duplicate content issue, causing a harsh drop in rankings. Webmaster Tools shows over 1,000 duplicate title tags, so I don't think, Google understands what is going on. <code>Right URL: abc.com/price/sharp-ah-l13-12000-btu.html Wrong URL: abc.com/item/sharp-l-series-ahl13-12000-btu.html (created by mistake)</code> After that, we ... Changed back all URLs to the "Right URLs" Set up a 301-redirect for all "Wrong URLs" a few days later Now, still a massive amount of pages is in the index twice. As we do not link internally to the "Wrong URLs" anymore, I am not sure, if Google will re-crawl them very soon. What can we do to solve this issue and tell Google, that all the "Wrong URLs" now redirect to the "Right URLs"? Best, David
Intermediate & Advanced SEO | | rmvw0