Duplicate content http:// something .com and http:// something .com/
-
Hi,
I've just got a crawl report for a new WordPress blog with the Suffusion theme and the Yoast WordPress SEO plugin, and it reports duplicate content for:
http:// something .com
and
http:// something .com/
I just can't figure out how to handle this. Can I add a redirect for .com/ to .com in htaccess?
Any help is appreciated!
By the way, the rel canonical tag value is **http:// something .com/** for both.
-
Also remember canonicalization:
SEO advice: url canonicalization
by MATT CUTTS on JANUARY 4, 2006 in GOOGLE/SEO
(I got my power back!)
Before I start collecting feedback on the Bigdaddy data center, I want to talk a little bit about canonicalization, www vs. non-www, redirects, duplicate urls, 302 “hijacking,” etc. so that we’re all on the same page.
Q: What is a canonical url? Do you have to use such a weird word, anyway?
A: Sorry that it’s a strange word; that’s what we call it around Google. Canonicalization is the process of picking the best url when there are several choices, and it usually refers to home pages. For example, most people would consider these the same urls:
www.example.com
example.com/
www.example.com/index.html
example.com/home.asp
But technically all of these urls are different. A web server could return completely different content for all the urls above. When Google “canonicalizes” a url, we try to pick the url that seems like the best representative from that set.
Q: So how do I make sure that Google picks the url that I want?
A: One thing that helps is to pick the url that you want and use that url consistently across your entire site. For example, don’t make half of your links go to http://example.com/ and the other half go to http://www.example.com/ . Instead, pick the url you prefer and always use that format for your internal links.
Q: Is there anything else I can do?
A: Yes. Suppose you want your default url to be http://www.example.com/ . You can make your webserver so that if someone requests http://example.com/, it does a 301 (permanent) redirect to http://www.example.com/ . That helps Google know which url you prefer to be canonical. Adding a 301 redirect can be an especially good idea if your site changes often (e.g. dynamic content, a blog, etc.).
Q: If I want to get rid of domain.com but keep www.domain.com, should I use the url removal tool to remove domain.com?
A: No, definitely don’t do this. If you remove one of the www vs. non-www hostnames, it can end up removing your whole domain for six months. Definitely don’t do this. If you did use the url removal tool to remove your entire domain when you actually only wanted to remove the www or non-www version of your domain, do a reinclusion request and mention that you removed your entire domain by accident using the url removal tool and that you’d like it reincluded.
Q: I noticed that you don’t do a 301 redirect on your site from the non-www to the www version, Matt. Why not? Are you stupid in the head?
A: Actually, it’s on purpose. I noticed that several months ago but decided not to change it on my end or ask anyone at Google to fix it. I may add a 301 eventually, but for now it’s a helpful test case.
Q: So when you say www vs. non-www, you’re talking about a type of canonicalization. Are there other ways that urls get canonicalized?
A: Yes, there can be a lot, but most people never notice (or need to notice) them. Search engines can do things like keeping or removing trailing slashes, trying to convert urls with upper case to lower case, or removing session IDs from bulletin board or other software (many bulletin board software packages will work fine if you omit the session ID).
Q: Let’s talk about the inurl: operator. Why does everyone think that if inurl:mydomain.com shows results that aren’t from mydomain.com, it must be hijacked?
A: Many months ago, if you saw someresult.com/search2.php?url=mydomain.com, that would sometimes have content from mydomain. That could happen when the someresult.com url was a 302 redirect to mydomain.com and we decided to show a result from someresult.com. Since then, we’ve changed our heuristics to make showing the source url for 302 redirects much more rare. We are moving to a framework for handling redirects in which we will almost always show the destination url. Yahoo handles 302 redirects by usually showing the destination url, and we are in the middle of transitioning to a similar set of heuristics. Note that Yahoo reserves the right to have exceptions on redirect handling, and Google does too. Based on our analysis, we will show the source url for a 302 redirect less than half a percent of the time (basically, when we have strong reason to think the source url is correct).
Q: Okay, how about supplemental results. Do supplemental results cause a penalty in Google?
A: Nope.
Q: I have some pages in the supplemental results that are old now. What should I do?
A: I wouldn’t spend much effort on them. If the pages have moved, I would make sure that there’s a 301 redirect to the new location of pages. If the pages are truly gone, I’d make sure that you serve a 404 on those pages. After that, I wouldn’t put any more effort in. When Google eventually recrawls those pages, it will pick up the changes, but because it can take longer for us to crawl supplemental results, you might not see that update for a while.
That’s about all I can think of for now. I’ll try to talk about some examples of 302’s and inurl: soon, to help make some of this more concrete.
http://www.ragepank.com/articles/3/preventing-duplicate-content/
Hope I was of help,
Thomas Von Zickell
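For reference, the non-www to www 301 described in the quoted post could look roughly like this in .htaccess. This is only a sketch: it assumes Apache with mod_rewrite enabled, and example.com is a placeholder for the real domain.

```apache
# Sketch only: example.com is a placeholder, not the poster's domain.
RewriteEngine On

# Match requests that arrive on the bare (non-www) hostname.
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]

# Send a permanent (301) redirect to the same path on the www hostname.
# In .htaccess the matched path has no leading slash, so it is re-added here.
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```

If the non-www form is the preferred one, the condition and the target would simply be swapped.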
-
thanks!
Can somebody please also clarify exactly what should be in the second line:
As eyepaq wrote: RewriteRule ^(.+)/$ [%{HTTP_HOST}...] [R=301,L]
Should I insert something in/after "[%{HTTP_HOST}...]"?
-
After RewriteEngine, if I'm not wrong.
-
Should I keep the existing wordpress rewrite? If I keep it, should I then place your code before or after?
# BEGIN WordPress
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
# END WordPress
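On the before-or-after question, one possible layout is sketched below. It keeps the WordPress rules and puts the custom redirect above them, because the WordPress catch-all rewrites every non-file, non-directory request to /index.php and stops there, so a rule placed after it would not see the original URL. The redirect target is an assumption (the one quoted in this thread is truncated), and WordPress may rewrite whatever sits between its BEGIN and END markers, so the custom rule stays outside them.

```apache
# Custom rules: placed ABOVE the WordPress block so they run first.
RewriteEngine On
# Skip real directories; Apache normally adds the trailing slash back to those.
RewriteCond %{REQUEST_FILENAME} !-d
# Assumed target: same host, same path without the final slash.
RewriteRule ^(.+)/$ http://%{HTTP_HOST}/$1 [R=301,L]

# BEGIN WordPress
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
# END WordPress
```

If the WordPress permalink structure ends in a slash, WordPress may redirect back to the slashed URL and cause a loop, so this is worth testing on a single URL first.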
-
Hi,
Google is pretty good at understanding that the trailing-slash version of a URL is the same as the non-trailing-slash version, so you are safe on that side.
Even if the crawler flags this as an issue, it's not something you should focus on.
However, if you want to play it by the book, you can use .htaccess to 301 redirect one version to the other.
Below is some sample code:
#get rid of trailing slashes
RewriteCond %{HTTP_HOST} ^(www\.)?example\.com$ [NC]
RewriteRule ^(.+)/$ [%{HTTP_HOST}...] [R=301,L]
Hope it helps.
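Since the redirect target in the sample is cut off, here is one way the complete rule might read. Treat it as a guess at the intent rather than the original code: the http://%{HTTP_HOST}/$1 target is an assumption, and example.com is a placeholder.

```apache
# Sketch only: the target URL is an assumption, not the original poster's rule.
RewriteEngine On

# Only act on this site (with or without www).
RewriteCond %{HTTP_HOST} ^(www\.)?example\.com$ [NC]

# "(.+)/" needs at least one character before the final slash, so this
# matches paths such as "about/" but never the site root.
RewriteRule ^(.+)/$ http://%{HTTP_HOST}/$1 [R=301,L]
```

Note that this leaves the home page alone. For the bare domain, a browser or crawler requests the path "/" whether or not the slash was typed, so .com and .com/ reach the server as the same request and there is nothing for a redirect to act on.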