Duplicate content issue
-
Hi everyone,
I have an issue determining what type of duplicate content I have.
www.example.com/index.php?mact=Calendar,m57663,default,1&m57663return_id=116&m57663detailpage=&m57663year=2011&m57663month=6&m57663day=19&m57663display=list&m57663return_link=1&m57663detail=1&m57663lang=en_GB&m57663returnid=116&page=116
Since I am not a coding expert, to me it looks like URL parameter duplicate content. Is it?
At the same time, "return_id" makes me think it is session id duplicate content. I am confused about how to tell the different types of duplicate content apart, even after reading the SEOmoz article about it: http://www.seomoz.org/learn-seo/duplicate-content.
Could someone help me on how to recognize different types of duplicate content?
Thank you!
-
Thank you guys for being so helpful!!:)
-
Hello Jeff, I would like to say first that lots of sites have duplicate content problems. For the most part, this is not a huge issue. When search engines find duplicate content, they choose one of the pages to list in the index and ignore the others. This assumes, of course, that the duplication is not so bad that the search engine would want to ban you. That can happen if a review of your situation leads them to believe that you are deliberately trying to rank multiple times for the same search terms.
Here is an article on how to fix duplicate content problems:
http://www.seomoz.org/blog/duplicate-content-in-a-post-panda-world
-
Let me try.
1. The answer to your first question is that it only matters if you're trying to figure out how to handle it programmatically. In this case you might have to ask the developer whether this is being done by a session id. To me it looks more like a URL parameter, but without a live example I wouldn't know. Could you provide the website in question? If not, try visiting the website once, clear your cache, then visit again and see if the number after "return_id" changes. If it changes, that is a session id. If it stays the same, have a friend visit the website in the same manner and compare the numbers: if theirs is different, there's a good chance that this is a session id; if it is the same for everyone, it is an ordinary URL parameter.
Either way, "return_id" is technically a URL parameter, even if a session id is what sets its value.
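If you save the URL from each visit, the comparison described above can be done with a few lines of Python. This is only a rough sketch: the `changed_params` helper is a made-up name, the two URLs are shortened versions of the one in the question with `http://` added as an assumption, and the second one stands in for whatever URL you (or your friend) actually get on the second visit.

```python
from urllib.parse import urlsplit, parse_qs


def changed_params(url_a, url_b):
    """Return the names of query parameters whose values differ between two URLs."""
    query_a = parse_qs(urlsplit(url_a).query, keep_blank_values=True)
    query_b = parse_qs(urlsplit(url_b).query, keep_blank_values=True)
    return sorted(
        name
        for name in query_a.keys() | query_b.keys()
        if query_a.get(name) != query_b.get(name)
    )


# Hypothetical captures from two separate visits (shortened for readability).
first_visit = (
    "http://www.example.com/index.php"
    "?mact=Calendar,m57663,default,1&m57663return_id=116&page=116"
)
second_visit = (
    "http://www.example.com/index.php"
    "?mact=Calendar,m57663,default,1&m57663return_id=116&page=116"
)

# An empty list means no parameter changed between the two visits.
print(changed_params(first_visit, second_visit))
```

A parameter that shows up in the result on every fresh visit is behaving like a session id; one that never shows up is a plain URL parameter.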
2. The second question is still a bit vague, so let me see if I have it right. Are you asking how to treat the duplicate content once you know what is causing it? If so, then follow these rules.
If the content changes significantly in the presence of the session id or parameter, then this is not duplicate content. If the content does not change, do the following:
- Make sure to use rel canonical pointing to the root URL. In your example that would be: www.example.com/index.php?mact=Calendar
- Set the URL parameters in Google's and Bing's webmaster tools so that the parameter is treated correctly.
- When the parameter or session id is present, add the noindex, follow robots tag. This will allow the bots to spider through and pass on link juice in the event that someone links to your parameter versions.
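To make the first and third rules concrete, here is a rough Python sketch of how a page template could work out which tags to print. It assumes the canonical URL keeps only the module name from `mact`, as in the example above; `canonical_url` and `head_tags` are hypothetical helper names, and a real CMS will have its own way of doing this.

```python
from html import escape
from urllib.parse import parse_qs, urlsplit, urlunsplit


def canonical_url(url):
    """Drop every parameter except the module name at the start of mact."""
    parts = urlsplit(url)
    params = parse_qs(parts.query, keep_blank_values=True)
    # "Calendar,m57663,default,1" -> "Calendar" (assumed to identify the page)
    module = params.get("mact", [""])[0].split(",")[0]
    query = f"mact={module}" if module else ""
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


def head_tags(url):
    """Return the <head> tags suggested above for the requested URL."""
    canonical = canonical_url(url)
    tags = [f'<link rel="canonical" href="{escape(canonical)}">']
    if url != canonical:
        # Parameter version: keep it out of the index but let bots follow links.
        tags.append('<meta name="robots" content="noindex, follow">')
    return tags


requested = (
    "http://www.example.com/index.php"
    "?mact=Calendar,m57663,default,1&m57663return_id=116&page=116"
)
for tag in head_tags(requested):
    print(tag)
```

The parameter handling in the webmaster tools (the second rule) is set in their interfaces, so it is not shown here.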
I think you have a larger issue, which is that your website's code is using index.php to generate all of the pages; in the example, that is the calendar. This is a common mistake that programmers make, since they work to do things as quickly and efficiently as possible. It's far easier to keep all of the code in one file than to create several different dynamic files that work with each other.
If you don't have the ability to break this down and generate separate pages, you might be able to use URL rewrites to make browsers and bots think the URLs are actually different.
-
Thank you for your answers, but I guess I didn't formulate my question properly.
My 1st question was: What kind of duplicate content is it?
- session id
- or url parameter
My second question is: How do you differentiate them? What do you look at to tell whether duplicate content is caused by a session id or by a URL parameter?
-
You can determine whether you have duplicate content in several ways. Search Google for site:example.com and see how many of your pages Google knows about. Also, when you are on a page with one of these long URLs, open the source code and see if the page has a rel="canonical" tag. In your case that would be the best way to signal to robots that this is the same page as your index.php page.
Also, you can try Xenu. It is a good, fast program for checking your site for duplicates.
Hope it helps. You can share your website so we can take a look.
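If you would rather not read through the page source by hand, a small Python script can look for the tag. This is just a sketch using the standard library: `CanonicalFinder` and `find_canonical` are names I made up, and the sample HTML is invented for illustration. You would paste in the real source of your page instead.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values and self.canonical is None:
            self.canonical = attributes.get("href")


def find_canonical(html_source):
    """Return the canonical URL declared in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html_source)
    return finder.canonical


# Invented sample; replace with the source of the page you are checking.
sample = """
<html><head>
<title>Calendar</title>
<link rel="canonical" href="http://www.example.com/index.php?mact=Calendar" />
</head><body></body></html>
"""
print(find_canonical(sample))
```

If it prints None for the long URL, the page is not telling robots which version to prefer.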
-
Hi Jeff,
index.php is the same as index.php?something=something&anotherthing=somethingelse
Each page should have a different URL, like index.php and page.php, instead of always using index.php.