Duplicate content http:// something .com and http:// something .com/
-
Hi,
I've just got a crawl report for a new WordPress blog running the Suffusion theme and the Yoast WordPress SEO plugin, and it flags duplicate content for:
http:// something .com
and
http:// something .com/
I just can't figure out how to handle this. Can I add a redirect from .com/ to .com in .htaccess?
Any help is appreciated!
By the way, the rel canonical tag value is **http:// something .com/** for both.
-
Also remember the canonicalization advice:
SEO advice: url canonicalization
by MATT CUTTS on JANUARY 4, 2006 in GOOGLE/SEO
(I got my power back!) Before I start collecting feedback on the Bigdaddy data center, I want to talk a little bit about canonicalization, www vs. non-www, redirects, duplicate urls, 302 “hijacking,” etc. so that we’re all on the same page.
Q: What is a canonical url? Do you have to use such a weird word, anyway?
A: Sorry that it’s a strange word; that’s what we call it around Google. Canonicalization is the process of picking the best url when there are several choices, and it usually refers to home pages. For example, most people would consider these the same urls:
www.example.com
example.com/
www.example.com/index.html
example.com/home.asp
But technically all of these urls are different. A web server could return completely different content for all the urls above. When Google “canonicalizes” a url, we try to pick the url that seems like the best representative from that set.
Q: So how do I make sure that Google picks the url that I want?
A: One thing that helps is to pick the url that you want and use that url consistently across your entire site. For example, don’t make half of your links go to http://example.com/ and the other half go to http://www.example.com/ . Instead, pick the url you prefer and always use that format for your internal links.
Q: Is there anything else I can do?
A: Yes. Suppose you want your default url to be http://www.example.com/ . You can make your webserver so that if someone requests http://example.com/, it does a 301 (permanent) redirect to http://www.example.com/ . That helps Google know which url you prefer to be canonical. Adding a 301 redirect can be an especially good idea if your site changes often (e.g. dynamic content, a blog, etc.).
Q: If I want to get rid of domain.com but keep www.domain.com, should I use the url removal tool to remove domain.com?
A: No, definitely don’t do this. If you remove one of the www vs. non-www hostnames, it can end up removing your whole domain for six months. Definitely don’t do this. If you did use the url removal tool to remove your entire domain when you actually only wanted to remove the www or non-www version of your domain, do a reinclusion request and mention that you removed your entire domain by accident using the url removal tool and that you’d like it reincluded.
Q: I noticed that you don’t do a 301 redirect on your site from the non-www to the www version, Matt. Why not? Are you stupid in the head?
A: Actually, it’s on purpose. I noticed that several months ago but decided not to change it on my end or ask anyone at Google to fix it. I may add a 301 eventually, but for now it’s a helpful test case.
Q: So when you say www vs. non-www, you’re talking about a type of canonicalization. Are there other ways that urls get canonicalized?
A: Yes, there can be a lot, but most people never notice (or need to notice) them. Search engines can do things like keeping or removing trailing slashes, trying to convert urls with upper case to lower case, or removing session IDs from bulletin board or other software (many bulletin board software packages will work fine if you omit the session ID).
Q: Let’s talk about the inurl: operator. Why does everyone think that if inurl:mydomain.com shows results that aren’t from mydomain.com, it must be hijacked?
A: Many months ago, if you saw someresult.com/search2.php?url=mydomain.com, that would sometimes have content from mydomain. That could happen when the someresult.com url was a 302 redirect to mydomain.com and we decided to show a result from someresult.com. Since then, we’ve changed our heuristics to make showing the source url for 302 redirects much more rare. We are moving to a framework for handling redirects in which we will almost always show the destination url. Yahoo handles 302 redirects by usually showing the destination url, and we are in the middle of transitioning to a similar set of heuristics. Note that Yahoo reserves the right to have exceptions on redirect handling, and Google does too. Based on our analysis, we will show the source url for a 302 redirect less than half a percent of the time (basically, when we have strong reason to think the source url is correct).
Q: Okay, how about supplemental results. Do supplemental results cause a penalty in Google?
A: Nope.
Q: I have some pages in the supplemental results that are old now. What should I do?
A: I wouldn’t spend much effort on them. If the pages have moved, I would make sure that there’s a 301 redirect to the new location of pages. If the pages are truly gone, I’d make sure that you serve a 404 on those pages. After that, I wouldn’t put any more effort in. When Google eventually recrawls those pages, it will pick up the changes, but because it can take longer for us to crawl supplemental results, you might not see that update for a while.
That’s about all I can think of for now. I’ll try to talk about some examples of 302's and inurl: soon, to help make some of this more concrete.
http://www.ragepank.com/articles/3/preventing-duplicate-content/
Hope I was of help,
Thomas Von Zickell
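For anyone who wants to try the non-www to www 301 redirect described in the quoted post, here is a minimal .htaccess sketch. It is not from the quoted article. It assumes Apache with mod_rewrite enabled, a .htaccess file in the site root, and uses example.com purely as a placeholder for your own domain.

```apache
# Sketch only: example.com is a placeholder, swap in your own hostname.
RewriteEngine On

# Match requests whose Host header is the bare (non-www) hostname.
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]

# Send a permanent (301) redirect to the same path on the www hostname.
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```

If you prefer the non-www version as canonical, the same idea should work in reverse: match the www hostname in the condition and redirect to the bare one.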
-
thanks!
Can somebody please also clarify exactly what should be in the second line:
As eyepaq wrote: RewriteRule ^(.+)/$ [%{HTTP_HOST}...] [R=301,L]
Should I insert something in/after "[%{HTTP_HOST}...]"?
-
After RewriteEngine, if I'm not wrong.
-
Should I keep the existing wordpress rewrite? If I keep it, should I then place your code before or after?
# BEGIN WordPress
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
# END WordPress
-
Hi,
Google is pretty good at understanding that the trailing-slash version is the same as the non-trailing-slash version, so you are safe on that side.
Even if the crawler says this is an issue, it's not something you should focus on.
However, if you want to play by the book, you can use .htaccess to 301 redirect one version to the other.
Below is some sample code:
# get rid of trailing slashes
RewriteCond %{HTTP_HOST} ^(www\.)?example\.com$ [NC]
RewriteRule ^(.+)/$ [%{HTTP_HOST}...] [R=301,L]
Hope it helps.
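On the placement question raised earlier in the thread, a rough sketch of how a trailing-slash redirect could sit alongside the WordPress block is shown here. The redirect target in the sample above is elided, so the `/$1` target below is an assumption of mine and not necessarily what was originally posted. The sketch puts the custom rule before the WordPress block, because the catch-all `RewriteRule . /index.php [L]` would otherwise handle the request first, and because WordPress may rewrite whatever sits between its BEGIN and END markers.

```apache
# Hypothetical sketch: the /$1 target is assumed, not the elided one from the thread.
RewriteEngine On

# Strip the trailing slash from any non-root path that is not a real directory.
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)/$ /$1 [R=301,L]

# BEGIN WordPress
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
# END WordPress
```

Two caveats. The pattern `^(.+)/$` needs at least one character before the slash, so it never touches the home page. A request for the bare hostname and a request for the hostname plus `/` both reach the server with the path `/`, so there is nothing to redirect between them. Also, if the WordPress permalink structure ends in a slash, WordPress may redirect back to the slashed URL, so the permalink setting and this rule would need to agree to avoid a redirect loop.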