Duplicate content issue: index.html vs. non-index.html
-
Hi
I have an issue. In my client's profile, I found that the "index.html" URLs are mostly more authoritative than the non-"index.html" ones, and that the www version is more authoritative than the non-www one. The problem is that I also find the opposite situation, where the non-"index.html" URLs are more authoritative than "index.html", or the non-www version is more authoritative than www.
My logic would tell me to still redirect the non-"index.html" to "index.html". Am I right?
And in the cases where I find the opposite happening, does it matter if I still redirect the non-"index.html" to "index.html"?
The same question for www vs non-www versions?
Thank you
-
Yes, I like using rewrites in an .htaccess file, which is covered in the links above.
-
I fixed the two URLs.
In this case domain.com/index.html is the page served at domain.com/.
Do you mean to use mod_rewrite and create a 301 redirect from domain.com/index.html to domain.com/?
Thank you for your time.
-
Hi Taysir, first of all you should get an overview of what duplicate content is. These two cover the canonical problems with www: "Solving the canonical problems with www" and "Duplicate Content Issues in www & non www". I hope that solves your query.
-
It's very likely that the "index.html" version is more authoritative because you're using it in internal links. The problem is that this often creates a duplication issue: you refer to the root (non-index.html) version in inbound links, social, etc. (and people tend to link to and bookmark the root version), but then link internally to "index.html", so Google will end up indexing both.
If the authority is coming from internal links, and you:
(1) Switch the internal links to the root ("/")
(2) 301-redirect "index.html" to the root ("/")
...you shouldn't lose any authority, as you'll have re-routed it by doing step (1). You'll also consolidate your signals and be better off all-around, IMO.
Kane's right, though - it's a bit tough to tell without knowing the specifics.
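The two steps above can be sketched in code. This is only an illustration in Python, not something taken from the thread: `point_links_at_root` is a hypothetical helper, and the regex assumes plain quoted `href` attributes with no query string. It covers step (1); step (2) would still be a 301 set up on the server.

```python
import re

# Matches href values that are exactly "index.html" or end in "/index.html".
# Group 1: the 'href=' part, group 2: the quote character, group 3: the folder path.
_INDEX_HREF = re.compile(
    r'(href\s*=\s*)(["\'])((?:[^"\']*/)?)index\.html\2',
    re.IGNORECASE,
)

def point_links_at_root(html: str) -> str:
    """Rewrite internal links so they point at the folder URL, not index.html."""
    def fix(match: re.Match) -> str:
        prefix, quote, path = match.groups()
        # A bare "index.html" has no folder part, so link to the current folder.
        return f"{prefix}{quote}{path or './'}{quote}"
    return _INDEX_HREF.sub(fix, html)

if __name__ == "__main__":
    page = '<a href="/index.html">Home</a> <a href="/shop/index.html">Shop</a>'
    print(point_links_at_root(page))
```

Links such as `/myindex.html` are left alone because the pattern only matches a file named exactly `index.html`.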
-
Redirecting the authoritative link to the less authoritative URL is not ideal.
However, in my opinion, being consistent with URLs throughout the site takes precedence.
Implementing 301 redirects will indicate that there has been a permanent relocation of that page's content, and you will get most of the link value from the authoritative link. That said, if you feel comfortable emailing the person who created that authoritative link, it's worth a little effort to ask them to change it, but if it's a hassle to do so, don't push it.
-
How to redirect domain.com/index.html to domain.com/index.html?
Those two URLs are the same, so there is nothing to change. If you wanted to redirect domain.com/index.html to domain.com/ then you would do so with 301 redirects. Here's a guide on getting started:
http://www.seomoz.org/learn-seo/redirection
http://www.seomoz.org/blog/url-rewrites-and-301-redirects-how-does-it-all-work
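To make the redirect itself concrete, here is a minimal sketch of the rule in Python. It is not the .htaccess approach the guides describe, just the same logic written as WSGI middleware; the name `redirect_index_to_root` is made up for the example, and it assumes the wrapped application already serves the page at the folder URL ("/").

```python
INDEX_SUFFIX = "/index.html"

def redirect_index_to_root(app):
    """WSGI middleware: answer requests for .../index.html with a 301 to .../"""
    def middleware(environ, start_response):
        path = environ.get("PATH_INFO", "/")
        if path.endswith(INDEX_SUFFIX):
            # Drop "index.html" but keep the trailing slash of the folder.
            target = path[: -len("index.html")]
            query = environ.get("QUERY_STRING", "")
            if query:
                target += "?" + query
            start_response("301 Moved Permanently", [("Location", target)])
            return [b""]
        # Anything else, including "/", goes to the real application.
        return app(environ, start_response)
    return middleware
```

Only explicit requests for index.html are redirected, so the folder URL never redirects back to itself.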
-
I personally would rewrite & redirect everything using the 2nd option above.
Can you explain to me how to do that, please?
How to redirect domain.com/index.html to domain.com?
Thanks
-
Thank you for your detailed answer, but one more thing: does it matter if I redirect a more authoritative link to a weaker one for the benefit of staying consistent, and vice versa?
Let's say I redirect a non-index.html URL to an index.html one, and vice versa, for the sake of consistency?
-
You should stick with one format across the site:
-
domain.com/index.html and domain.com/subfolder/index.html
**OR**
I typically choose the second option because it is agnostic of CMS or file type, and it looks better in my opinion. I would not mix the two across the site because it causes a confusing user experience.
So, to answer your questions directly:
My logic would tell me to still redirect the non-"index.html" to "index.html". Am I right?
No, not necessarily. Since you've told us there are examples where the .html version is more authoritative and examples where it isn't, it's impossible for us to say which is the better choice. I personally would rewrite & redirect everything using the 2nd option above.
**The same question for www vs non www versions?**
I believe that WWW vs non-WWW is less important. You could decide based upon which format has more links or which one has been historically used. Consistency (using the same across the entire site), proper 301 redirects, and proper rel canonical tags are your priorities here.
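As a rough illustration of what "consistency" means in practice, the sketch below maps every variant of a URL to one canonical form and builds the matching rel canonical tag. It is an assumption-laden example: `CANONICAL_HOST`, `canonical_url` and `canonical_tag` are hypothetical names, and it arbitrarily assumes the www version and the folder-style URLs were chosen.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

# Assumption for this example: the www version is the one being kept.
CANONICAL_HOST = "www.domain.com"
KNOWN_HOSTS = {"domain.com", "www.domain.com"}

def canonical_url(url: str) -> str:
    """Return the single preferred form of a URL on this site."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host in KNOWN_HOSTS:
        host = CANONICAL_HOST
    path = parts.path or "/"
    if path.endswith("/index.html"):
        # Keep the folder's trailing slash, drop the file name.
        path = path[: -len("index.html")]
    # Fragments never reach the server, so they are dropped.
    return urlunsplit((parts.scheme, host, path, parts.query, ""))

def canonical_tag(url: str) -> str:
    """Build the <link rel="canonical"> tag for a page URL."""
    return f'<link rel="canonical" href="{escape(canonical_url(url), quote=True)}" />'

if __name__ == "__main__":
    print(canonical_url("http://domain.com/index.html"))
    print(canonical_tag("http://domain.com/subfolder/index.html"))
```

If a requested URL differs from `canonical_url(url)`, that is a candidate for a 301 to the canonical form; the tag covers any variants that cannot be redirected.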
-