Duplicate Content with a Joomla Multi-Language Website
-
Dear SEOmoz Community,
I am running a multi-language Joomla website (www.siam2nite.com) with 2 active languages.
The first and primary language is English; the second is Thai. Most of the content (articles, event descriptions, ...) is in English only.
What we did is a Thai translation of the navigation bars, headers, titles, etc. (a translation of all Joomla language files). Those texts are static and only help users navigate and understand our site in Thai.
Now I am facing a problem with duplicate content. Let's take our Q&A component as an example.
The URL structure looks like this:
English - www.siam2nite.com/en/questions/
Thai - www.siam2nite.com/th/questions/
Every question asked creates two URLs, one for each language. The content itself (user questions & answers) is identical on both URLs; only the GUI language is different. If you take a look at this question you will understand what I mean:
ENGLISH VERSION:
http://www.siam2nite.com/en/questions/where-to-celebrate-halloween-in-bangkok
THAI VERSION:
http://www.siam2nite.com/th/questions/where-to-celebrate-halloween-in-bangkok
As you can see, each page has a unique title (H1) and introduction text in the correct language (same for the menu, buttons, etc.), but the questions and answers are only available in one language.
Now my question:
I guess Google will see these pages as duplicate content. How should I proceed with this problem:
- put all Thai links (/th/questions/) in robots.txt and block them
or
- set a canonical tag pointing to the English versions?
I am not sure whether Google will still index the Thai titles and introduction texts if I set a canonical tag (they have important Thai keywords in them).
Would really appreciate your help on this.
Regards,
Menelik
-
Hi John
Sorry for my late response ;-(
Thank you very much for your help. I added a rel=alternate tag on the Thai pages as well. So far it looks good - no duplicate content.
Regards,
Menelik
-
The Google Webmaster Tools setup sounds right to me!
You should set rel=alternate on all pages, in both directions, not just on the English pages. That way, if Google wants to return a Thai page to an English searcher, it'll know to reference the English page. This is the setup Google recommends in their help documentation.
Don't worry about a new sitemap for the /th/ pages. Your current setup should be fine.
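For reference, here is a minimal Python sketch of the reciprocal setup described above. It assumes the language code is always the first path segment (as in /en/... and /th/...). The function name `alternate_links` and the `LANGS` tuple are made up for illustration; the real change would live in the Joomla template header.

```python
from urllib.parse import urlsplit, urlunsplit

# Language codes as they appear in the site's URL prefixes (assumption).
LANGS = ("en", "th")


def alternate_links(url: str) -> list[str]:
    """Return one hreflang <link> tag per language version of `url`."""
    parts = urlsplit(url)
    # "/en/magazine" -> ["", "en", "magazine"]
    segments = parts.path.split("/")
    if len(segments) < 2 or segments[1] not in LANGS:
        raise ValueError(f"no language prefix in {url!r}")

    tags = []
    for lang in LANGS:
        segments[1] = lang  # swap only the language prefix
        href = urlunsplit(parts._replace(path="/".join(segments)))
        tags.append(f'<link rel="alternate" hreflang="{lang}" href="{href}">')
    return tags


for tag in alternate_links("http://www.siam2nite.com/th/magazine"):
    print(tag)
```

The English and the Thai URL both produce the same pair of tags, so each page points to the other and also lists itself.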
-
Hi John
Thank you very much for your answer. I did not know about the rel=alternate tag until today.
Following your advice, I modified the Joomla header, and now on every English page (/en/...) there is a rel=alternate link to the Thai version.
For example:
http://www.siam2nite.com/en/magazine now has the following tag:
<link href="http://www.siam2nite.com/th/magazine" hreflang="th" rel="alternate">
Regarding the webmaster help (the link you mentioned): I do not need to set a tag on the Thai pages targeting the English ones, correct? Just one rel=alternate on the English pages should be enough?
I tried to follow your advice with Google Webmaster Tools as well. My current configuration looks like this:
My old, already existing site:
1. Site: www.siam2nite.com (no geo-targeting)
Today I created a new one:
2. Site: www.siam2nite.com/th/ (geo-targeting: Thailand)
Is this the setup you meant in your answer?
I did not submit a sitemap for the 2nd site, as all links (Thai and English) are already included in the sitemap I use on the 1st site. Should I split my old sitemap and submit one for each site, containing only the links for that language?
Thank you very much for your kind support - really appreciate it.
-
The proper way to handle this is with rel=alternate hreflang tags. These tell Google that the pages are versions of the same content for different languages. See http://support.google.com/webmasters/bin/answer.py?hl=en&answer=189077 for more info. You can place the link tags in the head of each page, or do it in your sitemap.
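To illustrate the sitemap option, here is a rough Python sketch that writes each URL with `xhtml:link` alternates for every language version. The `build_sitemap` helper and its input format (one dict of language code to URL per page) are assumptions for the example, not something Joomla provides.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_sitemap(groups: list[dict[str, str]]) -> str:
    """Build a sitemap where every URL lists all of its language alternates.

    Each item in `groups` maps a language code to the URL of that version.
    """
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("xhtml", XHTML_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")

    for group in groups:
        for loc in group.values():
            url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
            ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
            # Every version lists the full set of alternates, itself included.
            for lang, href in group.items():
                ET.SubElement(
                    url,
                    f"{{{XHTML_NS}}}link",
                    {"rel": "alternate", "hreflang": lang, "href": href},
                )
    return ET.tostring(urlset, encoding="unicode")


print(build_sitemap([
    {
        "en": "http://www.siam2nite.com/en/questions/",
        "th": "http://www.siam2nite.com/th/questions/",
    },
]))
```

With this approach the page templates stay untouched; the trade-off is that the sitemap has to be regenerated whenever a page is added in either language.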
Other things you can do to help search engines get it right are to set up a profile in Google Webmaster Tools for each of the directories (or at least for the Thai one) and set the geotargeting. Bing prefers that you set the country and language on each page (see here).
If you block the pages with robots.txt or use canonical tags, you're telling Google to keep those pages out of the SERPs: robots.txt stops them from being crawled, and a canonical tag points Google at the English URL instead. It sounds like you want the Thai pages to appear in Thai results and the English pages in English SERPs, so I wouldn't do that.
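To see what the robots.txt option from the question would actually block, here is a small Python check using the standard library's `urllib.robotparser`. The rule text is a hypothetical example of what such a block could look like, and `blocked_by` is just a helper name for this sketch.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules matching the first option in the question.
RULES = """\
User-agent: *
Disallow: /th/questions/
"""


def blocked_by(rules: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `rules` forbid `agent` from crawling `url`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return not parser.can_fetch(agent, url)


base = "http://www.siam2nite.com"
slug = "/questions/where-to-celebrate-halloween-in-bangkok"
for lang in ("en", "th"):
    url = f"{base}/{lang}{slug}"
    print(url, "blocked" if blocked_by(RULES, url) else "crawlable")
```

The Thai URL is blocked together with its Thai title and introduction text, which are exactly the parts that carry the Thai keywords.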