Duplicate Content for index.html
-
In the Crawl Diagnostics Summary, it says that I have two pages with duplicate content which are:
I read in a Dreamweaver tutorial that you should name your home page "index.html" and then you can let www.mywebsite.com automatically direct the user to index.html. Is this a bug in SEOmoz's crawler or is it a real problem with my site?
Thank you,
Dan
-
The code should definitely go into the .htaccess file in the website's root directory. However, .htaccess can be weird: a few days ago I ran into a similar issue with a client's website, and I was able to remedy it with a variation of the code.
# Index redirect
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.(php|html|htm|asp)\ HTTP/
RewriteRule ^(([^/]+/)*)index\.(php|html|htm|asp)$ http://yoursite.com/$1 [R=301,L]
If you give me the URL for the site I will take a look at it and let you know what would be feasible.
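For anyone who wants to see what the RewriteCond above is looking at before touching a live .htaccess, here is a minimal Python sketch. It assumes the directory group in the pattern is meant to repeat (`([^/]+/)*`) and that the dot after "index" is a literal dot. It only imitates the pattern with Python's `re` module; it is not how Apache itself evaluates the rule. `%{THE_REQUEST}` holds the raw request line the browser sent (for example `GET /index.html HTTP/1.1`), so the condition only fires when the visitor literally asked for an index file. The name `asked_for_index` is made up for illustration.

```python
import re

# Python rendering of the RewriteCond pattern. Apache writes a literal
# space as "\ "; in a Python regex a plain space does the same job.
# Assumption: the directory group repeats zero or more times.
THE_REQUEST_PATTERN = re.compile(
    r"^[A-Z]{3,9} /([^/]+/)*index\.(php|html|htm|asp) HTTP/"
)


def asked_for_index(request_line):
    """Return True if the raw request line explicitly names an index file."""
    return THE_REQUEST_PATTERN.match(request_line) is not None


if __name__ == "__main__":
    for line in (
        "GET /index.html HTTP/1.1",
        "GET / HTTP/1.1",
        "GET /design/business/index.php HTTP/1.1",
        "GET /about.html HTTP/1.1",
    ):
        print(line, "->", asked_for_index(line))
```

A request for the bare `/` does not match, so the redirect would apply only to URLs where index.html (or a sibling) was typed or linked explicitly.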
-
Hi Daniel, can you share with us the URL of your site? We can take a look at it and give you a more precise answer that way. Thanks!
-
I eventually figured out that your method was a 301 redirect, and I definitely broke my site trying to use the code you posted... haha. It's OK though. I just removed the code and it went back to normal. At first, I was editing the .htaccess file in the public_html folder, which wasn't working. Then I tried the root folder for the site (I created the .htaccess file since it did not exist). Neither of those worked. (I am using Bluehost, so I do not think that I have root access, and I am not sure if it is a Linux server or not.)
If there is an easy way to explain what I am doing wrong, please do so. Otherwise, I will use canonical.
Thanks for everything!
-
@Dan
Thanks for your reply. It seems like there are lots of different ways to solve this problem. I just watched this video on Matt Cutts's blog where he discusses his preference for 301 redirects over the rel canonical tag.
Where would you say your solution fits in?
Sorry about the delay in responding; I didn't realize right away that you were asking me a question. The code I provided in my previous answer causes a 301 permanent redirect to the original URL. That's what the
[R=301,L]
portion of the code states: R means redirect, 301 is the status code returned, and L marks this as the last rule to apply. After reviewing the Matt Cutts video, I realize that I should have asked whether you were on a Linux server that you had root access to. We actually use both redirects and canonical tags, since that was recommended by the on-page optimization reports. Heck, Google uses them, I would assume because it's easier for the user to be referred to a single page URL. Obviously, though, if you don't have server header access and are not familiar with .htaccess (you can accidentally break your site), then the canonical solution is appropriate.
-
Josh,
Thanks for your reply. It seems like there are lots of different ways to solve this problem. I just watched this video on Matt Cutts's blog where he discusses his preference for 301 redirects over the rel canonical tag.
Where would you say your solution fits in?
Thanks,
Dan
-
I use the link rel tag on all my homepages, pointing to http://www.yoursite.com
-
Oddly enough, I just recently answered this question. The SEOmoz crawler is correct, because without a redirect you will be able to access both versions of the page in your browser.
To resolve this issue, simply rewrite index.html to the root URL by placing the following code in the .htaccess file in your root directory.
Options +FollowSymLinks
RewriteEngine on
# Index Rewrite
RewriteRule ^index\.(htm|html|php) http://www.yoursite.com/ [R=301,L]
RewriteRule ^(.*)/index\.(htm|html|php) http://www.yoursite.com/$1/ [R=301,L]
You can also do the same with the index file in any subdirectories that you might create, by simply placing an .htaccess file in those subdirectories and using variations of the above code. This is how you create nice tight URLs without the duplicate content issue, such as http://www.semclix.com/design/business/
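To get a feel for which URLs the two RewriteRule lines above would send where, without risking a live site, the mapping can be imitated in a few lines of Python. This is only a rough sketch under some assumptions: the dots after "index" are treated as literal dots, the leading slash is stripped before matching (as mod_rewrite does for rules in an .htaccess file), and `redirect_target` is a made-up helper name. It does not reproduce Apache's full behaviour, only the pattern matching.

```python
import re

SITE = "http://www.yoursite.com"  # placeholder host used in this thread

# (pattern, target template) pairs mirroring the two RewriteRule lines.
# "\1" plays the role of "$1" in the Apache substitution.
RULES = [
    (re.compile(r"^index\.(htm|html|php)"), SITE + "/"),
    (re.compile(r"^(.*)/index\.(htm|html|php)"), SITE + r"/\1/"),
]


def redirect_target(path):
    """Return the URL a request path would be 301-redirected to, or None."""
    relative = path.lstrip("/")  # .htaccess rules see the path without "/"
    for pattern, template in RULES:
        match = pattern.match(relative)
        if match:
            return match.expand(template)  # first matching rule wins, like [L]
    return None


if __name__ == "__main__":
    for path in ("/index.html", "/design/business/index.html", "/about.html"):
        print(path, "->", redirect_target(path))
```

Paths that do not name an index file return `None`, meaning they would be served as usual.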
-
It is a problem which you need to fix. You need to canonicalize your pages.
Those are all various URLs which most likely lead to the same web page. I say "most likely" because these URLs can actually lead to different pages.
You need to tell crawlers and search engines how you organize your site. There are several ways to achieve canonicalization. The method I prefer is to add the following line of code to each page:
The URL provided should be the preferred URL for your page.
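One way to confirm that a page actually declares its preferred URL is to look for a `<link rel="canonical">` element in its HTML. The sketch below assumes that this is the kind of tag meant here, and it uses the placeholder address http://www.yoursite.com/ from elsewhere in this thread; `CanonicalFinder` and `find_canonical` are hypothetical names. It relies only on Python's standard `html.parser` module.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> element seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower()
        if tag == "link" and rel == "canonical" and self.canonical is None:
            self.canonical = attributes.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Example page whose head declares the root URL as the preferred version.
SAMPLE_PAGE = """<html><head>
<title>Home</title>
<link rel="canonical" href="http://www.yoursite.com/" />
</head><body></body></html>"""

if __name__ == "__main__":
    print(find_canonical(SAMPLE_PAGE))
```

If index.html and the root URL both serve this same page, both would report the same canonical URL, which is the signal crawlers use to treat them as one page.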