I have a ton of "duplicated content", "duplicated titles" in my website, solutions?
-
Hi, and thanks in advance. I have a JomSocial site with 1,000 users.
It is highly customized, and as a result of that customization some pages have five or more different URLs pointing to the same page.
Google has already indexed 16,000 links, and the crawl report shows a lot of duplicated content.
These links are important for some of the functionality, are dynamically created, and will keep growing. My developers offered to create rules in the robots file so that a big part of these links don't get indexed, but a Google Webmaster Tools post says the following:
"Google no longer recommends blocking crawler access to duplicate content on your website, whether with a robots.txt file or other methods. If search engines can't crawl pages with duplicate content, they can't automatically detect that these URLs point to the same content and will therefore effectively have to treat them as separate, unique pages. A better solution is to allow search engines to crawl these URLs, but mark them as duplicates by using the rel="canonical" link element, the URL parameter handling tool, or 301 redirects. In cases where duplicate content leads to us crawling too much of your website, you can also adjust the crawl rate setting in Webmaster Tools."
Here is an example of the links:
http://anxietysocialnet.com/profile/edit-profile/salocharly
http://anxietysocialnet.com/salocharly/profile
http://anxietysocialnet.com/profile/preferences/salocharly
http://anxietysocialnet.com/profile/salocharly
http://anxietysocialnet.com/profile/privacy/salocharly
http://anxietysocialnet.com/profile/edit-details/salocharly
http://anxietysocialnet.com/profile/change-profile-picture/salocharly
So the question is: is this really that bad? What are my options? Is it really a good solution to set rules in robots.txt so that big chunks of the site don't get indexed? Is there any other way I can resolve this?
Thanks again!
Salo
-
Use the canonical.
Don't use robots.txt.
When you block pages, link juice still flows to those pages and is lost. You can use a meta tag with noindex,follow instead; that way at least the links are still followed and the link juice is returned.
But use the canonical; that is what it's for.
The best thing, though, is not to have the duplicates at all. CMS sites often lead to a mess.
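To make the canonical suggestion concrete, here is a rough Python sketch of the mapping logic, not JomSocial code. It assumes the preferred URL is /profile/<username> and that the action names are exactly the ones in the example links from the question; `PROFILE_ACTIONS`, `canonical_profile_url` and `canonical_link_tag` are made-up names for illustration. The real site would do this in its own templates or plugin.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

# Assumption: these are the action segments seen in the question's examples.
PROFILE_ACTIONS = {
    "edit-profile", "preferences", "privacy",
    "edit-details", "change-profile-picture",
}

# The fallback mentioned above: keep the page out of the index
# but let crawlers follow its links.
NOINDEX_FOLLOW_TAG = '<meta name="robots" content="noindex,follow">'


def canonical_profile_url(url):
    """Map a profile URL variant to /profile/<username>, or return None."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    username = None
    if len(segments) == 2 and segments[0] == "profile":
        if segments[1] not in PROFILE_ACTIONS:
            username = segments[1]          # /profile/<user>
    elif len(segments) == 2 and segments[1] == "profile":
        username = segments[0]              # /<user>/profile
    elif (len(segments) == 3 and segments[0] == "profile"
          and segments[1] in PROFILE_ACTIONS):
        username = segments[2]              # /profile/<action>/<user>
    if username is None:
        return None
    # Drop query string and fragment so every variant collapses to one URL.
    return urlunsplit((parts.scheme, parts.netloc, f"/profile/{username}", "", ""))


def canonical_link_tag(url):
    """Return the <link rel="canonical"> tag for a page, or None."""
    canonical = canonical_profile_url(url)
    if canonical is None:
        return None
    return f'<link rel="canonical" href="{escape(canonical, quote=True)}">'
```

Every variant of a profile page would then print the same tag in its head, so the crawler can still reach all of them but is told which one to credit.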
-
Duplicate content caused by having the same content on different URLs is still a big problem with Joomla. What I would recommend is implementing canonical tags on your site. This would allow all pages to be crawled and indexed, but the 'SEO metrics' would all be attributed to one single URL. That wouldn't strictly solve the duplicate content problem (301 redirects could, for example, but I'm not sure they would apply in your situation), but it would solve most of the SEO-related fallout caused by all the duplicate content.
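Before choosing between canonical tags and 301s, it can help to see how many crawled URLs would collapse into each target. The following is only a sketch in Python: `PROFILE_RE` and `group_by_canonical` are hypothetical names, and the pattern assumes the three URL shapes from the question (/profile/<user>, /profile/<action>/<user>, /<user>/profile), with /profile/<user> as the preferred form.

```python
import re
from collections import defaultdict

# Assumption: only these three URL shapes exist for profile pages.
PROFILE_RE = re.compile(
    r"^(?P<base>https?://[^/]+)/"
    r"(?:profile/(?:[a-z-]+/)?(?P<u1>[^/]+)|(?P<u2>[^/]+)/profile)/?$"
)


def group_by_canonical(urls):
    """Group URLs under the /profile/<username> URL they would point to.

    URLs that do not match the profile pattern are skipped.
    """
    groups = defaultdict(list)
    for url in urls:
        match = PROFILE_RE.match(url)
        if match is None:
            continue
        username = match.group("u1") or match.group("u2")
        canonical = f"{match.group('base')}/profile/{username}"
        groups[canonical].append(url)
    return {target: sorted(variants) for target, variants in groups.items()}
```

Fed the URL list from a crawl export, each key is a candidate canonical (or 301 target) and each value is the set of variants that would be consolidated under it.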
-
Yes, it really is that bad.
And you answered your own question with regard to fixing it: rel canonical. That is your best bet, and it is really not that hard to implement (not sure about JomSocial, though). Google pretty much explains the downsides of the other options.
Other solutions would be JomSocial-specific, and I have no experience with that platform. Perhaps someone else who does can chime in on that.