How to fix duplicate content for homepage and index.html
-
Hello,
I know this probably gets asked quite a lot but I haven't found a recent post about this in 2018 on Moz Q&A, so I thought I would check in and see what the best route/solution for this issue might be. I'm always really worried about making any (potentially bad/wrong) changes to the site, as it's my livelihood, so I'm hoping someone can point me in the right direction.
Moz, SEMRush and several other SEO tools are all reporting that I have duplicate content for my homepage and index.html (same identical page).
According to Moz, my homepage (without index.html) has PA 29 and index.html has PA 15. They are both showing Status 200. I read that you can either do a 301 redirect or add rel=canonical.
I currently have a 301 set up for my http to https page and don't have any rel=canonical added to the site/page. What is the best and safest way these days to get rid of the duplicate content and merge my non-index and index.html homepages together? I read that both 301 and canonical pass on link juice, but I don't know what the best route for me is given what I said above.
Thank you for reading, any input is greatly appreciated!
-
OK, Paul, I hear what you are saying. It's a very open and obvious diss.
I'm not sure what you are saying makes any difference to the argument that the canonical is not the way to go here. I was explaining it in the simplest way: I would not want, and I'm sure you would not want either, a second copy of the home page that is both live and canonicalised.
(It's a given that the canonical works like a 301, passing link juice to the preferred version.)
So thanks but it makes no difference - delete & 301 every time.
Google is heightening its distrust of canonicals - the new Search Console tool reveals which pages are the preferred canonical and it's something of a surprise to SEOs!
If you feel like playing top trumps again then why not PM me? - it's so much better and the uninitiated do not need to see it!
Cheers Nigel
-
A proper canonical tag does a lot more than "just be telling Google not to rank it". When used properly (i.e. on pages that truly do contain the same content), the canonicalised page passes its ranking signals back to the canonical source.
I agree with Kristina - while a 301 would be preferable (it's a directive, while canonical tags are taken as suggestions), a canonical tag would be vastly better than not doing anything about the issue. At least until the dev can get the problem with the 301-redirect properly resolved.
Paul
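If the canonical tag is used as a stopgap, it helps to confirm that it is really present on the duplicate page. Below is a minimal Python sketch, not anything from the original posters, that looks for a `<link rel="canonical">` element in a page's HTML using only the standard library. The `SAMPLE` markup, the `www.example.com` domain, and the `find_canonical` helper are all assumptions made up for illustration.

```python
from html.parser import HTMLParser
from typing import Optional


class _CanonicalFinder(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> it sees."""

    def __init__(self) -> None:
        super().__init__()
        self.href: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.href is not None:
            return
        attr = dict(attrs)
        # rel can hold several space-separated tokens, so split before testing.
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens:
            self.href = attr.get("href")


def find_canonical(html: str) -> Optional[str]:
    """Return the canonical URL declared in `html`, or None if there is none."""
    finder = _CanonicalFinder()
    finder.feed(html)
    finder.close()
    return finder.href


# Hypothetical markup for /index.html pointing at the preferred homepage URL.
SAMPLE = """<html><head>
<title>Home</title>
<link rel="canonical" href="https://www.example.com/">
</head><body>...</body></html>"""

print(find_canonical(SAMPLE))
```

In this sketch the tag on index.html points at the homepage URL without index.html, which is the version the thread wants to keep.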
-
It's best practice to redirect, but if that's not an option, the canonical route should help the problem a lot! You'll probably lose some link equity with this route, but it should clear up duplicate content issues from Google's side.
-
Hi Dre
If you just do a canonical then the page will still be live; you will just be telling Google not to rank it. Best practice is to remove it altogether and 301. It is bad practice to have more than one version of your home page (or any page) live!
Regards Nigel
-
Thank you so much for all the responses. So it sounds like 301 redirect through htaccess is the way to go. What is the difference between using the 301 through htaccess vs using rel=canonical in my case? Does the 301 provide better link juice vs rel=canonical or is canonical just not applicable in this case? Thanks for all the replies and helpful suggestions again!
EDIT: I spoke to my developer (who is hosting and maintaining my site now). He said he tried to do a 301 through htaccess, but it seems to be crashing the site (and trust me, he is very good at what he does). Part of the problem is that my site is VERY old (originally built about 10 years ago and NOT updated once since). He has been slowly updating and cleaning up the site, and he will try to figure out why the 301 is crashing the site and not working, but in the meantime, how safe is it to use rel=canonical instead of a 301?
Thanks again!
-
Hi dre
Your site really shouldn't be generating an index.html in the first place, but if it is, you must make sure that there is a 301 in the htaccess file sending all traffic to the single homepage URL. As Lynn correctly points out, this will be a permanent redirect.
It is very simple to do. Both versions are treated as separate pages (just as http and https are), so you are essentially showing a duplicate site to Google, and your rankings are likely to suffer until you change it.
Regards Nigel
-
Hello there,
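You can use a .htaccess URL rewrite to remove index.html from your URLs. Here are the rewrite rules:
RewriteEngine On
# Only redirect when the client itself asked for index.html; without this condition,
# the internal DirectoryIndex lookup can trigger a redirect loop.
RewriteCond %{THE_REQUEST} \s/+(.*/)?index\.html[\s?]
RewriteRule ^(.*/)?index\.html$ /$1 [R=301,L]
Once you have added these rules, you should also fix all your internal links so that they link to the URL without index.html.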
Hope this helps,
Joseph Yap
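To see which requested paths a rule like this is meant to catch, here is a rough Python mirror of the index.html pattern. It is only a sketch for reasoning about the regular expression: the `redirect_target` helper is hypothetical, and it does not reproduce how Apache actually evaluates rewrite rules.

```python
import re
from typing import Optional

# Same idea as the .htaccess pattern: an optional directory prefix followed by
# a literal "index.html". The dot is escaped so it cannot match other characters.
INDEX_HTML = re.compile(r"^(.*/)?index\.html$")


def redirect_target(path: str) -> Optional[str]:
    """Return the 301 target for `path`, or None if the pattern does not match.

    `path` is given without a leading slash, which is how rules placed in a
    root .htaccess file see the request.
    """
    match = INDEX_HTML.match(path)
    if match is None:
        return None
    return "/" + (match.group(1) or "")


for requested in ["index.html", "blog/index.html", "about.html", "indexAhtml"]:
    print(requested, "->", redirect_target(requested))
```

The last sample path shows why the dot should be escaped: an unescaped `.` would also match a name such as `indexAhtml`.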
-
"I currently have a 301 setup for my http to https page" - great! You should also check whether your inner pages redirect from their HTTP versions to HTTPS too.
index.html should redirect to the main version of the homepage with a 301 Permanent Redirect.
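One way to run these checks without an SEO tool is a small script that requests each URL variant and compares the response with what is expected. The following is a minimal Python sketch under stated assumptions: `www.example.com` is a placeholder domain, the helper names (`expected_target`, `verdict`, `head`, `report`) are invented for this example, and the preferred version is assumed to be the https URL without index.html. Query strings are ignored.

```python
import http.client
from typing import Iterable, Optional, Tuple
from urllib.parse import urljoin, urlsplit


def expected_target(url: str) -> str:
    """Return the single URL this variant should end up at: https, no index.html."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if path.endswith("/index.html"):
        path = path[: -len("index.html")]
    return f"https://{parts.netloc}{path}"


def verdict(url: str, status: int, location: Optional[str]) -> str:
    """Judge one response: the preferred URL should be 200, any other variant a 301 to it."""
    target = expected_target(url)
    parts = urlsplit(url)
    if f"{parts.scheme}://{parts.netloc}{parts.path or '/'}" == target:
        return "ok" if status == 200 else f"expected 200, got {status}"
    # A Location header may be relative, so resolve it against the requested URL.
    resolved = urljoin(url, location) if location else None
    if status == 301 and resolved == target:
        return "ok"
    return f"expected 301 to {target}, got {status} -> {resolved}"


def head(url: str) -> Tuple[int, Optional[str]]:
    """Send one HEAD request without following redirects."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        conn.request("HEAD", parts.path or "/")
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def report(urls: Iterable[str]) -> None:
    """Print one line per URL with the result of the check."""
    for url in urls:
        status, location = head(url)
        print(f"{url}: {verdict(url, status, location)}")
```

Calling `report(["http://www.example.com/", "https://www.example.com/index.html", "https://www.example.com/"])` with your own domain would print one line per variant. Note that this sketch expects a single hop, so a chain such as http to https to the index-free URL is reported as a mismatch even though it ends in the right place.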
-
Google considers HTTP and HTTPS to be two separate protocols. Since the content is the same on both versions, Googlebot treats it as duplicate content. Adding a canonical URL will solve this problem. If you have any doubts, feel free to ask.