Duplicate Content Issue
-
Why are URLs with .html or index.php at the end annoying to search engines? I heard they can create duplicate content, but I have no idea why. Could someone explain why that is so?
Thank you
-
Using this as an example: if Google showed duplicate content for http://www.domain.com/seo.html and http://www.domain.com/seo, you would want a 301 redirect in .htaccess so that anyone accessing the http://www.domain.com/seo.html version of the page is automatically sent to http://www.domain.com/seo.
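You can use the following in your .htaccess file. The pattern has no leading slash because rules in .htaccess match the path relative to the directory, and the trailing ? drops any query string.
RewriteEngine On
RewriteRule ^seo\.html$ http://www.domain.com/seo? [R=301,NC,L]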
The benefit of having no file extension (.html) at the end of your URLs is that you can change the underlying framework of your website, or even the programming language, without needing to add redirects each time you change (provided the website structure remains the same).
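To make the duplicate problem concrete, here is a minimal Python sketch (not from the original answer) that collapses the usual variants of a URL path into one form. The function name `canonical_path` and the lists of index files and extensions are assumptions for illustration only; a real site would apply its own rules.

```python
import posixpath

# Hypothetical lists for illustration; adjust to the site's own setup.
INDEX_FILES = {"index.html", "index.htm", "index.php"}
EXTENSIONS = (".html", ".htm", ".php")


def canonical_path(path: str) -> str:
    """Map URL path variants that serve the same page to one path."""
    head, tail = posixpath.split(path)
    # /index.html and /index.php are the same page as the bare directory.
    if tail.lower() in INDEX_FILES:
        return head if head.endswith("/") else head + "/"
    # /seo.html and /seo.htm are the same page as /seo.
    for ext in EXTENSIONS:
        if tail.lower().endswith(ext) and len(tail) > len(ext):
            return posixpath.join(head, tail[: -len(ext)])
    return path


variants = ["/seo", "/seo.html", "/seo.htm"]
# All three variants collapse to a single path.
print({canonical_path(p) for p in variants})  # prints {'/seo'}
```

A search engine sees each variant as its own URL unless a redirect or canonical hint tells it they are the same page, which is why you want only one of them to answer with a 200.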
-
If the same page is reachable at both domain.com/seo and domain.com/seo.htm, that can be considered duplicate content. With or without the extension (better without), there should be only one version of the URL, so that the link juice passing through is not split. Hope this helps, Vahe
-
Agreed
-
There is also the problem with http://www.example.com and http://example.com - they also generate duplicate content.
What should you do?
First - solve the http://www.example.com vs http://example.com issue
Edit your .htaccess file (if you don't have one, create it). Inside this file you need to add:
RewriteEngine On
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
(where example.com is your website)
Second - solve the index.html duplicate content
You need to include the following tag in the index.html file, inside the <head> section:
<link rel="canonical" href="http://www.example.com/" />
This tag tells Google to ignore the index.html URL and focus on www.example.com, so you avoid the duplicate content.
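One way to check that the tag is actually in place is to parse the page and look for it. The following is a minimal Python sketch using only the standard library, assuming the tag is a rel="canonical" link element in the page head; the class name `CanonicalFinder`, the helper `find_canonical`, and the sample page source are hypothetical.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Only the first canonical link is kept.
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel:
            self.canonical = attributes.get("href")


def find_canonical(html: str):
    """Return the canonical URL declared in the HTML, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical page source for index.html.
page = """<html><head>
<title>Home</title>
<link rel="canonical" href="http://www.example.com/" />
</head><body></body></html>"""

print(find_canonical(page))  # prints http://www.example.com/
```

If the function returns None for index.html, or a URL other than the one you want indexed, the hint is missing or pointing at the wrong version.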
I hope I could help you.