How to fix duplicate content for homepage and index.html
-
Hello,
I know this probably gets asked quite a lot but I haven't found a recent post about this in 2018 on Moz Q&A, so I thought I would check in and see what the best route/solution for this issue might be. I'm always really worried about making any (potentially bad/wrong) changes to the site, as it's my livelihood, so I'm hoping someone can point me in the right direction.
Moz, SEMRush and several other SEO tools are all reporting that I have duplicate content for my homepage and index.html (same identical page).
According to Moz, my homepage (without index.html) has PA 29 and index.html has PA 15. They are both showing Status 200. I read that you can either do a 301 redirect or add rel=canonical.
I currently have a 301 set up from my http to my https page and don't have rel=canonical added anywhere on the site. What is the best and safest way these days to get rid of the duplicate content and merge my non-index and index.html homepages together? I read that both 301 and canonical pass on link juice, but I don't know what the best route for me is given what I said above.
Thank you for reading, any input is greatly appreciated!
-
OK, Paul, I hear what you are saying. It's a very open and obvious diss.
I'm not sure what you are saying makes any difference to the argument that the canonical way here is not the way to go. I was explaining in the simplest way, I would not want, and I'm sure you would not want either, a live page like this - the home page, live and canonicalised.
(It's a given that the canonical works like a 301, passing link juice to the preferred version.)
So thanks but it makes no difference - delete & 301 every time.
Google is heightening its distrust of canonicals - the new Search Console tool reveals which pages are the preferred canonical and it's something of a surprise to SEOs!
If you feel like playing top trumps again then why not PM me? - it's so much better and the uninitiated do not need to see it!
Cheers Nigel
-
A proper canonical tag does a lot more than "just be telling Google not to rank it". When used properly (i.e. on pages that truly do contain the same content), the canonicalised page passes its ranking signals back to the canonical source.
I agree with Kristina - while a 301 would be preferable (it's a directive, while canonical tags are taken as suggestions), a canonical tag would be vastly better than not doing anything about the issue. At least until the dev can get the problem with the 301-redirect properly resolved.
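As a stopgap of the kind described above, the canonical hint can also be sent as an HTTP response header rather than a tag in the page's HTML. This is a minimal sketch, not a tested fix for this particular site: it assumes Apache 2.4 with mod_headers enabled, and https://www.example.com/ is a placeholder for the real homepage URL (neither is confirmed anywhere in this thread).

```apache
<IfModule mod_headers.c>
    # Match only the root homepage, whether requested as "/" or "/index.html".
    # The <If> block needs Apache 2.4 (an assumption about this server).
    <If "%{REQUEST_URI} =~ m#^/(index\.html)?$#">
        # Placeholder URL: replace with the single preferred homepage address.
        Header set Link "<https://www.example.com/>; rel=\"canonical\""
    </If>
</IfModule>
```

Like the tag, the header is only a suggestion to search engines, so it would make sense to drop it for /index.html once the 301 is working.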
Paul
-
It's best practice to redirect, but if that's not an option, the canonical route should help the problem a lot! You'll probably lose some link equity with this route, but it should clear up duplicate content issues from Google's side.
-
Hi Dre
If you just do a canonical then the page will still be live; you will just be telling Google not to rank it. Best practice is to remove it altogether and 301. It is bad practice to have more than one version of your home page (or any page) live!
Regards Nigel
-
Thank you so much for all the responses. So it sounds like 301 redirect through htaccess is the way to go. What is the difference between using the 301 through htaccess vs using rel=canonical in my case? Does the 301 provide better link juice vs rel=canonical or is canonical just not applicable in this case? Thanks for all the replies and helpful suggestions again!
EDIT: I spoke to my developer (who is hosting and maintaining my site now). He said he tried to do a 301 through htaccess, but it seems to be crashing the site (and trust me, he is very good at what he does). Part of the problem is that my site is VERY old (originally built about 10 years ago and NOT updated once since). He has been slowly updating and cleaning up the site, and he will try to figure out why the 301 is crashing the site and not working. In the meantime, how safe is it to use rel=canonical instead of a 301?
Thanks again!
-
Hi dre
Your site really shouldn't be generating an index.html in the first place, but if it is, you must make sure that there is a 301 in the htaccess file sending all traffic to the single homepage URL. As Lynn correctly points out, this will be a permanent redirect.
It is very simple to do. Both versions are treated as separate pages (just as http and https are), so you are essentially showing a duplicate site to Google, and your rankings will be terrible until you change it.
Regards Nigel
-
Hello there,
You can use a .htaccess URL rewrite to redirect every index.html URL to its directory URL. Here are the rewrite rules:
RewriteEngine On
# Redirect /index.html to /
RewriteRule ^index\.html$ / [R=301,L]
# Redirect /some/dir/index.html to /some/dir/
RewriteRule ^(.*)/index\.html$ /$1/ [R=301,L]
Once you have added these rules, you should also fix all your internal links so that they point to the URL without index.html.
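One caveat with the rules above: on servers where "/" is served by internally mapping it to index.html (the usual DirectoryIndex setup), a plain index.html rule in .htaccess can fire again on that internal lookup and redirect in a loop. That is one common reason a redirect like this appears to break a site, though it is only a guess as to what happened here. A minimal sketch of a loop-safe variant, assuming mod_rewrite is enabled and the file sits in the site's document root:

```apache
RewriteEngine On

# Only act when the visitor literally requested ".../index.html".
# THE_REQUEST holds the original request line (e.g. "GET /index.html HTTP/1.1")
# and is not altered by internal rewrites, so the internal DirectoryIndex
# lookup for "/" does not trigger the redirect again.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/(?:[^?\s]*/)?index\.html[?\s]
# $1 is empty for /index.html, or "some/dir/" for /some/dir/index.html.
RewriteRule ^(.*/)?index\.html$ /$1 [R=301,L]
```

It is safer to try this with R=302 first and switch to R=301 once it behaves as expected, since browsers cache permanent redirects aggressively.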
Hope this helps,
Joseph Yap
-
"I currently have a 301 setup for my http to https page" - great! You should also check whether your inner pages redirect from their HTTP versions to HTTPS too.
index.html should redirect to the main version of the homepage with a 301 Permanent Redirect.
-
Google treats the HTTP and HTTPS versions of a page as two separate URLs. Since the content is the same on both versions, Googlebot considers it duplicate content. Adding a canonical URL will solve this problem. If you have any doubts, feel free to ask.