How to resolve Duplicate Page Content issue for root domain & index.html?
-
SEOMoz returns a Duplicate Page Content error for a website's index page, with both domain.com and domain.com/index.html isted seperately. We had a rewrite in the htacess file, but for some reason this has not had an impact and we have since removed it. What's the best way (in an HTML website) to ensure all index.html links are automatically redirected to the root domain and these aren't seen as two separate pages?
-
great code Josh...but , after i saved it on .htaccess , a "?" appeared on the link..
http://www.domain.com/?/example/file.html
Is this ok ? pls advice/
Thank you,
-
You touched on a good point here "We set up our site to utilize a index redirect for all of our sub directories as well, so with this method you simply name your sub directories to match the url path that you desire. Each sub directory has it's own index which you redirect with a variation of the above code. By doing this you can have nice clean url paths like http://www.semclix.com/design/ecommerce/ - and mitigate the duplicate content issue. We hope that this helps."
Too often I see sites where they get the home page right but miss the re-write on the directories.
-
Here's the .htaccess rewrite command that you can use for the index.html redirect -
Options +FollowSymlinks RewriteEngine on
Index Rewrite RewriteRule ^index.(htm|html|php) http://www.amarasoftware.com/ [R=301,L] RewriteRule ^(.*)/index.(htm|html|php) http://www.amarasoftware.com/$1/ [R=301,L]
We set up our site to utilize a index redirect for all of our sub directories as well, so with this method you simply name your sub directories to match the url path that you desire. Each sub directory has it's own index which you redirect with a variation of the above code. By doing this you can have nice clean url paths like http://www.semclix.com/design/ecommerce/ - and mitigate the duplicate content issue. We hope that this helps.
-
I'd check it with some other software too... i.e. Raven Tools free trial or something, that will tell you if there's canonicalization problems... of course I'm not advocating Raven Tools over SEOmoz tools (I'm a member here and not there for good reasons), I just think best to try a few different tests before deciding if it's a problem. There might just be an issue with the SEOmoz campaign tool for the moment, which I'm sure they'll fix as soon as they realise.
Hey, aren't you the tutor I had in my SEC usability course?
-
Unfortunately I can't speak for how SEOmoz handles rewrites like this if it's already crawled the page.
The rewrite rule you're using looks like it's only rewriting the www portion of the URL, not index.html. So alone it wouldn't do anything to solve dupe content issues. (someone please correct me if I'm misreading the rewrite rule)
Here's a link to what I used to write a redirect for index.html on another site.
http://www.webmasterworld.com/forum92/6375.htm
I think it is a fairly safe assumption to make that SEOmoz is smart enough to realize if you're got a redirect in there (providing that its working). I'd still recommend taking a look to see if Google has cached or indexed an index.html version, though.
Edit: my personal, highly technical, acid-test for an index.html redirect is just going there and manually entering the url with index.html on the end, rather than waiting for a recrawl to see if you're heading in the right direction.
-
RewriteEngine on RewriteCond %{HTTP_HOST} ^([a-z.]+)?amarasoftware.com$ [NC] RewriteCond %{HTTP_HOST} !^www. [NC] RewriteRule .? http://www.%1amarasoftware.com%{REQUEST_URI} [R=301,L] Is what I use. In Seomoz this leads to www.amarasoftware.com and index.html so 2 different URL's, both with different incoming links, and a different authority, which has an impact on my ranking if correct. in SEomoz this a returns a duplicate title and meta tags errors. If SEOmoz finds 2 pages instead of one I may assume that Google agrees with this.
-
As you did, I'd normally handle this with a 301 from index.html to the root domain. When you say that it's "not had an impact" do you mean that the SEOmoz dashboard continues to show an error after it re-crawls, or that the search engines are not picking up the redirect?
SEOmoz dashboard does a great job, but I'd check to see how the search engines are actually indexing yourdomain.com/index.html vs. yourdomain.com also. If the search engines are indexing it as you want them to, then I'd be inclined to ignore the dashboard error.
I apologize if this is a stupid question, but I assume you manually checked that the redirect worked?
-
You wish to canonicalize the pages. That is the SEO word which describes exactly what you are trying to achieve.
Above are 5 URLs which can possibly lead to the exact same page. If you add the following HTML in the code then the pages will be canonicalized.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
I've got duplicate pages. For example, blog/page/2 is the same as author/admin/page/2\. Is this something I should just ignore, or should I create the author/admin/page2 and then 301 redirect?
I'm going through the crawl report and it says I've got duplicate pages. For example, blog/page/2 is the same as author/admin/page/2/ Now, the author/admin/page/2 I can't even find in WordPress, but it is the same thing as blog/page/2 nonetheless. Is this something I should just ignore, or should I create the author/admin/page2 and then 301 redirect it to blog/page/2?
Intermediate & Advanced SEO | | shift-inc0 -
HELP! How does one prevent regional pages as being counted as "duplicate content," "duplicate meta descriptions," et cetera...?
The organization I am working with has multiple versions of its website geared towards the different regions. US - http://www.orionhealth.com/ CA - http://www.orionhealth.com/ca/ DE - http://www.orionhealth.com/de/ UK - http://www.orionhealth.com/uk/ AU - http://www.orionhealth.com/au/ NZ - http://www.orionhealth.com/nz/ Some of these sites have very similar pages which are registering as duplicate content, meta descriptions and titles. Two examples are: http://www.orionhealth.com/terms-and-conditions http://www.orionhealth.com/uk/terms-and-conditions Now even though the content is the same, the navigation is different since each region has different product options / services, so a redirect won't work since the navigation on the main US site is different from the navigation for the UK site. A rel=canonical seems like a viable option, but (correct me if I'm wrong) it tells search engines to only index the main page, in this case, it would be the US version, but I still want the UK site to appear to search engines. So what is the proper way of treating similar pages accross different regional directories? Any insight would be GREATLY appreciated! Thank you!
Intermediate & Advanced SEO | | Scratch_MM0 -
Dealing with close content - duplicate issue for closed products
Hello I'm dealing with some issues. Moz analyses is telling me that I have duplicate on some of my products pages. My issue is that: Concern very similar products IT products are from the same range Just the name and pdf are different Do you think I should use canonical url ? Or it will be better to rewrite about 80 descriptions (but description will be almost the same) ? Best regards.
Intermediate & Advanced SEO | | AymanH0 -
Redirecting thin content city pages to the state page, 404s or 301s?
I have a large number of thin content city-level pages (possibly 20,000+) that I recently removed from a site. Currently, I have it set up to send a 404 header when any of these removed city-level pages are accessed. But I'm not sending the visitor (or search engine) to a site-wide 404 page. Instead, I'm using PHP to redirect the visitor to the corresponding state-level page for that removed city-level page. Something like: if (this city page should be removed) { header("HTTP/1.0 404 Not Found");
Intermediate & Advanced SEO | | rriot
header("Location:http://example.com/state-level-page")
exit();
} Is it problematic to send a 404 header and still redirect to a category-level page like this? By doing this, I'm sending any visitors to removed pages to the next most relevant page. Does it make more sense to 301 all the removed city-level pages to the state-level page? Also, these removed city-level pages collectively have very little to none inbound links from other sites. I suspect that any inbound links to these removed pages are from low quality scraper-type sites anyway. Thanks in advance!2 -
Duplicate Content
Hi, So I have my great content (that contains a link to our site) that I want to distribute to high quality relevant sites in my niche as part of a link building campaign. Can I distribute this to lots of sites? The reason I ask is that those sites will then have duplicate content to all the other sites I distribute the content to won;t they? I this duplication bad for them and\or us? Thanks
Intermediate & Advanced SEO | | Studio330 -
Does duplicate content penalize the whole site or just the pages affected?
I am trying to assess the impact of duplicate content on our e-commerce site and I need to know if the duplicate content is affecting only the pages that contain the dupe content or does it affect the whole site? In Google that is. But of course. Lol
Intermediate & Advanced SEO | | bjs20100 -
Dropped ranking - Penguin penalty or duplicate content issue?
Just this weekend a page that had been ranking well for a competitive term fell completely out of the rankings. There are two possible causes and I'm trying to figure out which it is, so I can take action. I found out that I had accidentally put a canonical on another page that was for the same page as the one that dropped out of the rankings. If there are two pages with the same canonical tag with different content, will google drop both of them from the index? The other possibility is that this is a result of the recent Penguin update. The page that dropped has a high amount of exact anchor text. As far as I can tell, there were no other pages with any penalties from the Penguin update. One last question: The page completely dropped from the search index. If this were a Penguin issue, would it have dropped out completely,or just been penalized with a drop in position? If this is a result of the conflicting canonical tags, should I just wait for it to reindex, or should I request a reconsideration of the page?
Intermediate & Advanced SEO | | gametv0 -
Duplicate page Content
There has been over 300 pages on our clients site with duplicate page content. Before we embark on a programming solution to this with canonical tags, our developers are requesting the list of originating sites/links/sources for these odd URLs. How can we find a list of the originating URLs? If you we can provide a list of originating sources, that would be helpful. For example, our the following pages are showing (as a sample) as duplicate content: www.crittenton.com/Video/View.aspx?id=87&VideoID=11 www.crittenton.com/Video/View.aspx?id=87&VideoID=12 www.crittenton.com/Video/View.aspx?id=87&VideoID=15 www.crittenton.com/Video/View.aspx?id=87&VideoID=2 "How did you get all those duplicate urls? I have tried to google the "contact us", "news", "video" pages. I didn't get all those duplicate pages. The page id=87 on the most of the duplicate pages are not supposed to be there. I was wondering how the visitors got to all those duplicate pages. Please advise." Note, the CMS does not create this type of hybrid URLs. We are as curious as you as to where/why/how these are being created. Thanks.
Intermediate & Advanced SEO | | dlemieux0