Duplicate Content for index.html
-
In the Crawl Diagnostics Summary, it says that I have two pages with duplicate content which are:
I read in a Dreamweaver tutorial that you should name your home page "index.html" and then let www.mywebsite.com automatically direct the user to index.html. Is this a bug in SEOmoz's crawler, or is it a real problem with my site?
Thank you,
Dan
-
The code should definitely go into the .htaccess file in the website's root directory. However, .htaccess can be weird. A few days ago I ran into a similar issue with a client's website, and I was able to remedy it with a variation of the code.
# index Redirect
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.(php|html|htm|asp)\ HTTP/
RewriteRule ^(([^/]+/)*)index\.(php|html|htm|asp)$ http://yoursite.com/$1 [R=301,L]
If you give me the URL for the site I will take a look at it and let you know what would be feasible.
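For anyone wondering what the RewriteCond line is checking: %{THE_REQUEST} holds the raw request line the browser sent (for example "GET /index.html HTTP/1.1"), so the condition only fires when the visitor literally asked for an index file, not when the server maps "/" to index.html internally. As far as I understand, that is what keeps this kind of redirect from looping. Here is a rough Python sketch of the same test. It assumes the pattern is meant to allow any number of directory levels (the `*` quantifiers), and the function name is made up for illustration.

```python
import re

# Assumed form of the RewriteCond pattern: an HTTP method, a space, then zero or
# more "dir/" segments followed by an index file, then " HTTP/".
THE_REQUEST_PATTERN = re.compile(
    r"^[A-Z]{3,9} /([^/]+/)*index\.(php|html|htm|asp) HTTP/"
)

def is_direct_index_request(request_line):
    """Return True if the raw request line asks for an index file explicitly."""
    return THE_REQUEST_PATTERN.match(request_line) is not None

if __name__ == "__main__":
    for line in (
        "GET /index.html HTTP/1.1",
        "GET /design/business/index.php HTTP/1.1",
        "GET / HTTP/1.1",
    ):
        print(line, "->", is_direct_index_request(line))
```

A request for "/" never matches, so only visitors who typed or followed a link to the index file itself would be redirected.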
-
Hi Daniel, can you share with us the URL of your site? We can take a look at it and give you a more precise answer that way. Thanks!
-
I eventually figured out that your method was a 301 redirect, and I definitely broke my site trying to use the code you posted... haha. It's OK though. I just removed the code and it went back to normal. At first, I was editing the .htaccess file in the public_html folder, which wasn't working. Then I tried the root folder for the site (I created the .htaccess file since it did not exist). Neither of those worked. (I am using Bluehost, so I do not think that I have root access, and I am not sure whether it is a Linux server.)
If there is an easy way to explain what I am doing wrong, please do so. Otherwise, I will use canonical.
Thanks for everything!
-
@Dan
Sorry about the delay in responding; I didn't realize right away that you were asking me a question. Placing the code I provided in my previous answer causes a 301 (permanent) redirect to the original URL. That's what the [R=301,L] portion of the code states: R means redirect, 301 is the HTTP status code returned, and L marks the rule as the last one to process. After reviewing the Matt Cutts video, I realize that I should have asked you if you were operating on a Linux server that you had root access to. We actually use both redirects and canonical tags, since that was recommended by the on-page optimization reports. Heck, Google uses them, I would assume because it's easier for the user to be referred to a single page URL. Obviously, though, if you don't have server header access and are not familiar with .htaccess (you can accidentally break your site), then the canonical solution is appropriate.
-
Josh,
Thanks for your reply. It seems like there are lots of different ways to solve this problem. I just watched this video on Matt Cutts's blog where he discusses his preference for 301 redirects over the rel canonical tag.
Where would you say your solution fits in?
Thanks,
Dan
-
I use the link rel="canonical" tag on all my homepages, pointing to http://www.yoursite.com
-
Oddly enough, I just recently answered this question. The SEOmoz crawler is correct: without a redirect, both versions of the page are accessible in your browser.
To resolve this issue, simply redirect index.html to the root URL by placing the following code in the .htaccess file in your root directory.
Options +FollowSymLinks
RewriteEngine on
# Index Rewrite
RewriteRule ^index\.(htm|html|php)$ http://www.yoursite.com/ [R=301,L]
RewriteRule ^(.*)/index\.(htm|html|php)$ http://www.yoursite.com/$1/ [R=301,L]
You can do the same with the index file in any subdirectories you create by placing a .htaccess file in those subdirectories and using variations of the above code. This is how you create nice, tight URLs without the duplicate content issue, like http://www.semclix.com/design/business/
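If it helps to see what those two rules do to a given path, here is a small Python sketch that mimics them. It is only an approximation for checking the patterns, not how Apache evaluates rules. The escaped dot and the `$` anchor are assumptions of this sketch, the host is the placeholder from the answer, and the function name is made up. In a root-level .htaccess the path is matched without its leading slash, which is why the sample paths have none.

```python
import re

BASE = "http://www.yoursite.com/"  # placeholder host used in the answer above

# Patterns modelled on the two RewriteRule lines (dot escaped, end anchored).
ROOT_INDEX = re.compile(r"^index\.(htm|html|php)$")
SUBDIR_INDEX = re.compile(r"^(.*)/index\.(htm|html|php)$")

def redirect_target(path):
    """Return the 301 target for a requested path, or None if no rule matches."""
    if ROOT_INDEX.match(path):
        return BASE
    match = SUBDIR_INDEX.match(path)
    if match:
        # $1 in the RewriteRule corresponds to the captured directory part.
        return BASE + match.group(1) + "/"
    return None

if __name__ == "__main__":
    for path in ("index.html", "design/business/index.html", "about.html"):
        print(path, "->", redirect_target(path))
```

Pages that are not index files return None here, meaning neither rule would touch them.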
-
It is a problem which you need to fix. You need to canonicalize your pages.
Those are all various URLs which most likely lead to the same web page. I say "most likely" because these URLs can actually lead to different pages.
You need to tell crawlers and search engines how you organize your site. There are several ways to achieve canonicalization. The method I prefer is to add the following line of code to each page:
The URL provided should be the preferred URL for your page.
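The line of code itself is not shown in this post. Assuming it is the usual rel="canonical" link element placed in the page head, here is a hedged Python sketch that builds such a tag and points any index file at its directory URL. The list of index file names, the function names, and the choice to keep the query string are all assumptions for illustration.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

# Assumed set of file names that a server treats as the directory index.
INDEX_NAMES = ("index.html", "index.htm", "index.php")

def canonical_url(url):
    """Map .../index.html style URLs to the bare directory URL."""
    parts = urlsplit(url)
    path = parts.path or "/"
    head, _, tail = path.rpartition("/")
    if tail in INDEX_NAMES:
        path = head + "/"
    # Keep the query string, drop any fragment.
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, ""))

def canonical_tag(url):
    """Build the link element that would go inside the page's head section."""
    href = escape(canonical_url(url), quote=True)
    return '<link rel="canonical" href="%s" />' % href

if __name__ == "__main__":
    print(canonical_tag("http://www.yoursite.com/index.html"))
```

Both http://www.yoursite.com/ and http://www.yoursite.com/index.html end up with the same tag, which is the point of canonicalizing.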