Duplicate Content Resolution Suggestion?
-
SEOmoz tools is saying there is duplicate content for:
What would be the best way to resolve this "error"?
-
Does having the line:
DirectoryIndex index.html
have any use in addition to the lines you posted?
Thanks.
-
Stephen, I agree with the KISS method. Using an .htaccess RewriteCond is not the simplest solution for someone who does not know .htaccess syntax. In an effort to fully answer this, here is the typical code we are referring to:
Options +FollowSymLinks
RewriteEngine on
RewriteCond %{HTTP_HOST} ^yoursite\.com [NC]
RewriteRule (.*) http://www.yoursite.com/$1 [R=301,L]
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.html\ HTTP/
RewriteRule ^index\.html$ http://www.yoursite.com/ [R=301,L]
The first two lines are typical commands to follow symbolic links and to make sure that the rewrite engine is on.
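The first RewriteCond looks at the host, and if it is the bare yoursite.com rather than the www. version, the RewriteRule will 301 redirect the visitor to the http://www. version of your site.
The second RewriteCond looks at whether the visitor explicitly requested /index.html. If so, the RewriteRule will 301 redirect them to the version without the index.html, just www.yoursite.com/. Because %{THE_REQUEST} holds the request line the browser originally sent, the rule does not fire when the server itself serves index.html for a request to /, so there is no redirect loop.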
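To see which URLs the two rewrite rules above would redirect, here is a rough Python sketch that mimics them. It is only an illustration and not how Apache evaluates rules; the function names redirect_target and final_url are made up for this example, and yoursite.com is the same placeholder domain used in the rules.

```python
from urllib.parse import urlsplit, urlunsplit


def redirect_target(url, bare_host="yoursite.com"):
    """Return the URL one 301 hop away, or None if neither rule applies.

    Hypothetical helper that imitates the two rewrite rules; it is not
    part of Apache or any library.
    """
    parts = urlsplit(url)
    host = parts.netloc.lower()
    path = parts.path or "/"

    # Rule 1: bare host -> www host, keeping the path and query string.
    if host == bare_host:
        return urlunsplit(("http", "www." + bare_host, path, parts.query, ""))

    # Rule 2: an explicit request for /index.html -> the site root.
    # The THE_REQUEST pattern only matches when no query string follows.
    if path == "/index.html" and not parts.query:
        return urlunsplit(("http", "www." + bare_host, "/", "", ""))

    return None


def final_url(url, max_hops=5):
    """Follow the simulated redirects until no rule applies."""
    for _ in range(max_hops):
        target = redirect_target(url)
        if target is None:
            return url
        url = target
    return url


print(final_url("http://yoursite.com/index.html"))
```

A request for the bare domain with /index.html takes two hops: first to the www. host, then to the root URL.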
-
I'm a KISS guy. Duplicate content pages should just be handled with a rewrite; then they don't appear in any of your stats, attract links, or spread your like/tweet numbers over multiple pages, and if you are using XML files to keep tabs on your indexing etc., you get a better idea of what's going on on your site.
Also, you have to take into account Facebook likes, page tweets, +1s etc. Does rel canonical work on the social graph data?
Use rel canonical for sort orders etc., but if it's a pure duplicate, then 301.
"Don't link to page X on your site" isn't really a good solution in my eyes; there is too much room for error.
-
Completely agree.
I think I may have been slightly confused by thinking the default for www.mydomain.com/ was not index.html.
-
Yes, going by the original post your impression is correct and they are the same page, but you can't 301 an index.html page to the domain when index.html is the page that shows by default.
You could use an .htaccess RewriteCond, but that could be a little overkill for this situation, where adding a canonical will solve it.
-
I was under the impression that www.mydomain.com/ and www.mydomain.com/index.html were both indexing but are the same page
-
If the index.html is their home page that shows up when just entering the domain, http://www.mydomain.com/,
then what would you 301 to? Are you assuming that it is a site that is using an index.php, index.htm, index.asp, etc.?
PlasticCards stated that there is duplicate content, therefore the index.html page actually exists and should use a canonical.
-
I think in that particular situation I would use 301 as there really isn't a separate use for the /index.html page
-
Or 301 redirect in your .htaccess, and then you don't have to worry about link issues etc.
-
Use a rel='canonical' link and use the non-index.html URL for the href.
Also, don't link to the index.html from anywhere.
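As a rough illustration of the canonical suggestion, the Python sketch below builds the link tag that would go in the page's head section, pointing at the URL without index.html. The function name canonical_link_tag is made up for this example, and dropping the query string and fragment from the href is an assumption of the sketch, not something the answer above specifies.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_link_tag(url):
    """Build a rel="canonical" tag whose href omits a trailing index.html.

    Hypothetical helper for illustration; the query string and fragment
    are dropped, which is an assumption of this sketch.
    """
    parts = urlsplit(url)
    path = parts.path or "/"
    if path.endswith("/index.html"):
        # Keep the trailing slash, strip only the file name.
        path = path[: -len("index.html")]
    href = urlunsplit((parts.scheme, parts.netloc, path, "", ""))
    return '<link rel="canonical" href="%s" />' % escape(href, quote=True)


print(canonical_link_tag("http://www.mydomain.com/index.html"))
```

Since www.mydomain.com/ and www.mydomain.com/index.html serve the same file, the one tag ends up on both URLs and both point at the version without index.html.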