Duplicate Content for index.html
-
In the Crawl Diagnostics Summary, it says that I have two pages with duplicate content: www.mywebsite.com and www.mywebsite.com/index.html.
I read in a Dreamweaver tutorial that you should name your home page "index.html" and then you can let www.mywebsite.com automatically direct the user to index.html. Is this a bug in SEOmoz's crawler, or is it a real problem with my site?
Thank you,
Dan
-
The code should definitely go in the .htaccess file in the website's root directory. However, .htaccess can be weird: a few days ago I ran into a similar issue with a client's website, and I was able to remedy it with a variation of the code.
# index Redirect
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.(php|html|htm|asp)\ HTTP/
RewriteRule ^(([^/]+/)*)index\.(php|html|htm|asp)$ http://yoursite.com/$1 [R=301,L]
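To see which requests a rule like this would redirect, the two patterns can be exercised outside Apache. The following is a minimal Python sketch, not the server's actual behavior. It assumes the patterns carry `*` quantifiers and an escaped dot (the form this snippet usually takes) and that the .htaccess file sits in the document root, so the leading slash is stripped before the RewriteRule pattern is tested. The function name `redirect_target` is made up for illustration.

```python
import re

# Mirrors the RewriteCond pattern: the raw request line must name an index file.
REQUEST_LINE = re.compile(r"^[A-Z]{3,9} /([^/]+/)*index\.(php|html|htm|asp) HTTP/")

# Mirrors the RewriteRule pattern, tested against the path without its leading slash.
INDEX_PATH = re.compile(r"^(([^/]+/)*)index\.(php|html|htm|asp)$")


def redirect_target(request_line, base="http://yoursite.com/"):
    """Return the 301 target for a raw request line, or None if no redirect applies."""
    if not REQUEST_LINE.match(request_line):
        return None
    # A request line looks like "GET /path HTTP/1.1"; the path is the second field.
    path = request_line.split(" ")[1].lstrip("/")
    match = INDEX_PATH.match(path)
    if match is None:
        return None
    # Group 1 is the directory part (possibly empty), which becomes $1 in the rule.
    return base + match.group(1)
```

For example, `redirect_target("GET /index.html HTTP/1.1")` gives `http://yoursite.com/`, while a request for `/` itself returns `None`. Matching on the raw request line is the usual reason for checking THE_REQUEST: the server's own internal lookup of index.html for `/` does not change that line, so it is not redirected again.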
If you give me the URL for the site I will take a look at it and let you know what would be feasible.
-
Hi Daniel, can you share with us the URL of your site? We can take a look at it and give you a more precise answer that way. Thanks!
-
I eventually figured out that your method was a 301 redirect, and I definitely broke my site trying to use the code you posted, haha. It's OK though. I just removed the code and it went back to normal. At first, I was editing the .htaccess file in the public_html folder, which wasn't working. Then I tried the root folder for the site (I created the .htaccess file since it did not exist). Neither of those worked. (I am using Bluehost, so I do not think that I have root access, and I am not sure whether it is a Linux server.)
If there is an easy way to explain what I am doing wrong, please do so. Otherwise, I will use canonical.
Thanks for everything!
-
@Dan
Sorry about the delay in responding; I didn't realize right away that you were asking me a question. Placing the code I provided in my previous answer will cause a 301 permanent redirect to the original URL. That's actually what the
[R=301,L]
portion of the code states: R means redirect, 301 is the HTTP status code that is returned, and L marks this as the last rule to process. After reviewing the Matt Cutts video, I realize that I should have asked you if you were operating on a Linux server that you had root access to. We actually utilize both redirects and canonical tags, since that was recommended by the on-page optimization reports. Heck, Google uses them, I would assume because it's easier for the user to be referred to a single page URL. Obviously, though, if you don't have server header access and are not familiar with .htaccess (you can accidentally break your site), then the canonical solution is appropriate.
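Once a rule like this is in place, it is worth checking that index.html really answers with a 301 and the expected Location header. Here is a minimal Python sketch using the standard library's http.client, which does not follow redirects on its own. The helper names and the example host are assumptions for illustration, and nothing here is specific to SEOmoz or Bluehost.

```python
import http.client


def fetch_status(host, path, port=80):
    """Send a HEAD request and return (status code, Location header or None)."""
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def is_canonical_redirect(status, location, expected):
    """True only for a permanent (301) redirect that points at the expected URL."""
    return status == 301 and location == expected


# Example usage (hypothetical host, requires network access):
# status, location = fetch_status("www.yoursite.com", "/index.html")
# print(is_canonical_redirect(status, location, "http://www.yoursite.com/"))
```

A 200 for both `/` and `/index.html` would mean both versions are still reachable, which is the situation the crawler reports as duplicate content.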
-
Josh,
Thanks for your reply. It seems like there are lots of different ways to solve this problem. I just watched this video on Matt Cutts's blog where he discusses his preference for 301 redirects over the rel canonical tag.
Where would you say your solution fits in?
Thanks,
Dan
-
I use the link rel canonical tag on all my homepages, pointing to http://www.yoursite.com
-
Oddly enough, I just recently answered this question. The SEOmoz crawler is correct, because without a redirect you will be able to access both versions of the page in your browser.
To resolve this issue, simply redirect index.html to the root URL by placing the following code in the .htaccess file in your root directory.
Options +FollowSymlinks
RewriteEngine on
# Index Rewrite
RewriteRule ^index.(htm|html|php) http://www.yoursite.com/ [R=301,L]
RewriteRule ^(.*)/index.(htm|html|php) http://www.yoursite.com/$1/ [R=301,L]
You can also do the same with the index file in any subdirectories that you might create, by simply placing an .htaccess file in those subdirectories and using variations of the above code. This is how you create nice tight URLs, without the duplicate content issue, that look like http://www.semclix.com/design/business/
-
It is a problem which you need to fix. You need to canonicalize your pages.
Those are all various URLs which most likely lead to the same web page. I say "most likely" because these URLs can actually lead to different pages.
You need to tell crawlers and search engines how you organize your site. There are several ways to achieve canonicalization. The method I prefer is to add the following line of code to the head section of each page:
<link rel="canonical" href="http://www.yoursite.com/" />
The URL provided should be the preferred URL for your page.
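One way to confirm that a page declares its preferred URL is to parse the HTML and read the `<link rel="canonical">` element. This is a minimal Python sketch using the standard library's html.parser. The class name, the helper, and the sample page are made up for illustration, and real crawlers may apply additional rules.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values and may be missing entirely.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values and attributes.get("href"):
            self.canonicals.append(attributes["href"])


def find_canonical(html):
    """Return a list of canonical URLs declared in the given HTML text."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


# Hypothetical page used only to demonstrate the helper.
SAMPLE_PAGE = """<html><head>
<title>Home</title>
<link rel="canonical" href="http://www.yoursite.com/" />
</head><body></body></html>"""
```

Running `find_canonical` on the HTML served at both the root URL and index.html should return the same single URL if the tag is set up consistently.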