How to remove hundreds of duplicate pages
-
Hi - while i was checking duplicate links, am finding hundreds of duplicates pages :-
-
having undefined after domain name and before sub page url
-
having /%5C%22/ after domain name and before the sub page url
-
Due to Pagination limits
Its a joomla site - http://www.mycarhelpline.com
Any suggestions - shall we use:-
-
301 redirect
-
leave these as standdstill
-
and what to do of pagination pages (shall we create a separate title tag n meta description of every pagination page as unique one)
thanks
-
-
Okay, I took a look at the plugin Ben recommended, and took another look at the SH404SEF one. The free one Ben recommended (http://extensions.joomla.org/extensions/site-management/sef/1063) looks like it can help out with some duplicate content - but what I recommend doing is getting the SH404SEF here http://anything-digital.com/sh404sef/features.html because it allows for setting up canonical tags and also gives you the option to add the rel=next feature to your paginated pages, which is one of your problem areas.
One thing I noticed though is that it specifically states it "automatically adds canonical tags to non-html pages" - so that means it will apply it automatically to Joomla's defaul pdf view, etc. While this is helpful, it may not solve the full issue of your duplicate pages with the undefined and "/%5C%22/" issue.
It does however state that it "removes duplicate URLs" - how it identifies and removes these, I am not sure. You may want to try it out because it is useful for other optimization tasks - or contact the owner for more information.
If the tool doesn't recognize and remove the duplicate pages caused by /undefined/ and "/%5C%22/" then you should disallow crawling of these in your robots.txt file. While you are in your robots.txt file you should remove the /images/ because you want those to be crawled - Joomla adds that in by default.
Because a lot of these pages have already been crawled, you should do a 301 on the duplicate pages to their matching page. This sounds like it will be a long process - this may be aided by the sh404sef plugin - not sure.
I just want to also add that I am in no way affiliated with any of these plugins.
-
The only way to solve the duplication error you are getting is to make the URL's distinct. Googlebot comes to your site and looks at the URL's and if they are not distinct it may not index them very well. I understand your site is showing up fine in the SERP's so this may be one of those items you place on a lower priority until later.
I think R.May knows Joomla so I'll refer to him on how to accomplish this but it may be worth it to make the adjustment. You may find the end result of making your page URL more distinct will actually increase your current SERPs. Just a thought.
Other than that. If your site isn't hurting and the only thing you are concerned about is the report in SEOmoz then I would move on and just make a mental note of it for later.
-
Hi ben - changing url is not well required as the site is getting good serp, however - the duplicacy issue to saveguard us from any future issue - is what we seek for
-
Hi - thanks for replying
-
For the dynamic url - Yes - at the initial start - it was missed and as on its not reqd somehow - as the pages are getting indexed well and good in SERP
-
For pagination - Where we needs this is like in our used car section, discount section& news section where multiple pages are created. shall we create separate title & meta description for every pagination page. is it ideally reqd ?
http://www.mycarhelpline.com/index.php?option=com_usedcar&view=category&Itemid=3
- 'undefined' & /%5C%22/ is coming as per report of SEOmoz and is almost on every page of site (except of home page) with the dynamic url after domain name are preceded with these 2 strings as per moz report
how to get this corrected - want to be preventive from this duplicacy n avoid getting a hit in future even if its going well now -
-
-
I'm not a Joomla expert but to make your URL's search engine friendly you are going to need to add an extension like this. That will allow you to make more distinct URLs that will not be considered "duplicate" anymore.
-
Joomla has soo many dup content issues, you have to know Joomla really well to avoid most of them. The biggest issue is you didn't enable the SEF URLs from the start and left the default index.php?option=com on most of them, which stuffs your URLs full of ugly parameters.
You can still enable this in your global options and with a quick edit to htaccess - but it will change all of your current URLs and you will need to 301 all of them, so that isn't a great option unless you are really suffering - and depending on if you are using J 1.6 or under, this is a time consuming nasty process. Also this is unlikely to get rid of any existing duplicate pages - but may make dealing with them and finding them easier.
I don't see the specific examples you posted though, where are you seeing "undefined" and "%5C%22/ " ?
You should implement rel=canonical on the correct version of each page. I recommend SH404SEF which is a Joomla plugin and makes this process easier - but it isn't free. I don't know of a good free plugin that does this, and Joomla's templates make doing this manually difficult.
Looking at it quickly, I also didn't notice any articles that were paginated, but you should try to follow the rel="next" and rel="prev" for paginated pages. This is likely something you will have to edit your Joomla core files to do.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Web accessibility - High Contrast web pages, duplicate content and SEO
Hi all, I'm working with a client who has various URL variations to display their content in High Contrast and Low Contrast. It feels like quite an old way of doing things. The URLs look like this: domain.com/bespoke-curtain-making/ - Default URL
Intermediate & Advanced SEO | | Bee159
domain.com/bespoke-curtain-making/?style=hc - High Contrast page
domain.com/bespoke-curtain-making/?style=lc - Low Contrast page My questions are: Surely this content is duplicate content according to a search engine Should the different versions have a meta noindex directive in the header? Is there a better way of serving these pages? Thanks.0 -
What is considered duplicate content?
Hi, We are working on a product page for bespoke camper vans: http://www.broadlane.co.uk/campervans/vw-campers/bespoke-campers . At the moment there is only one page but we are planning add similar pages for other brands of camper vans. Each page will receive its specifically targeted content however the 'Model choice' cart at the bottom (giving you the choice to select the internal structure of the van) will remain the same across all pages. Will this be considered as duplicate content? And if this is a case, what would be the ideal solution to limit penalty risk: A rel canonical tag seems wrong for this, as there is no original item as such. Would an iFrame around the 'model choice' enable us to isolate the content from being indexed at the same time than the page? Thanks, Celine
Intermediate & Advanced SEO | | A_Q0 -
Base copy on 1 page, then adding a bit more for another page - potential duplicate content. What to do?
Hi all, We're creating a section for a client that is based on road trips - for example, New York to Toronto. We have a 3 day trip, a 5 day trip, a 7 day trip and a 10 day trip. The 3 day trip is the base, and then for the 5 day trip, we add another couple of stops, for the 7 day trip, we add a couple more stops and then for the 10 day trip, there might be two or three times the number of stops of the initial 3 day trip. However, the base content is similar - you start at New York, you finish in Toronto, you likely go through Niagara on all trips. It's not exact duplicate content, but it's similar content. I'm not sure how to look after it? The thoughts we have are:1) Use canonical tags 3,5,7 day trips to the 10 day trip.
Intermediate & Advanced SEO | | digitalhothouse
2) It's not exactly duplicate content, so just go with the content as it is We don't want to get hit by any penalty for duplicate content so just want to work out what you guys think is the best way to go about this. Thanks in advance!0 -
YouTube Page
Hi All, I am new here but already I can see that SEOmoz is a great place for SEO 🙂 I need advice... We have one client that have 100.000 views per day on their YouTube channel! Now they have about 15.000 per day and ask us what we can do with SEO for their YouTube channel. Thanks for help! All The Best, Sanel
Intermediate & Advanced SEO | | FighterSpirit0 -
Ecommerce: remove duplicate product pages or use rel=canonical
Say we have a white-widget that is in our white widget collection and also in our wedding widget collection. Currently, we have 3 different URLs for that product (white-widgets/white-widget and wedding-widgets/white-widget and all-widgets/white-widget).We are automatically generating a rel=canonical tag for those individual collection product pages that canonical the original product page (/all-widgets/white-widget). This guide says that is the structure Zappos uses and says "There is an elegance to this approach. However, I would re-visit it today in light of changes in the SEO world."
Intermediate & Advanced SEO | | birchlore
I noticed that Zappos, and many other shops now actually just link back to the parent product page (e.g. If I am in wedding widget section and click on the widget, I go to all-products/white-widget instead of wedding-widgets/white-widget).So my question is:Should we even have these individual product URLs or just get rid of them altogether? My original thought was that it would help SEO for search term "white wedding widget" to have a product URL wedding-widget/white-widget but we won't even be taking advantage of that by using rel=canonical anyway.0 -
Google is displaying my pages path instead of URLS (Pages name)
Does anyone knows why Google is displaying my pages path instead of the URL in the search results, i discoverd that while am searching using a keyword of mine then i copied the link http://www.smarttouch.me/services-saudi/web-services/web-design and found all related results are the same, could anyone one tell me why is that and is it really differs? or the URL display is more important than the Path display for SEO!
Intermediate & Advanced SEO | | ali8810 -
How to get around Google Removal tool not removing redirected and 404 pages? Or if you don't know the anchor text?
Hello! I can’t get squat for an answer in GWT forums. Should have brought this problem here first… The Google Removal Tool doesn't work when the original page you're trying to get recached redirects to another site. Google still reads the site as being okay, so there is no way for me to get the cache reset since I don't what text was previously on the page. For example: This: | http://0creditbalancetransfer.com/article375451_influencial_search_results_for_.htm | Redirects to this: http://abacusmortgageloans.com/GuaranteedPersonaLoanCKBK.htm?hop=duc01996 I don't even know what was on the first page. And when it redirects, I have no way of telling Google to recache the page. It's almost as if the site got deindexed, and they put in a redirect. Then there is crap like this: http://aniga.x90x.net/index.php?q=Recuperacion+Discos+Fujitsu+www.articulo.org/articulo/182/recuperacion_de_disco_duro_recuperar_datos_discos_duros_ii.html No links to my site are on there, yet Google's indexed links say that the page is linking to me. It isn't, but because I don't know HOW the page changed text-wise, I can't get the page recached. The tool also doesn't work when a page 404s. Google still reads the page as being active, but it isn't. What are my options? I literally have hundreds of such URLs. Thanks!
Intermediate & Advanced SEO | | SeanGodier0 -
Should the sitemap include just menu pages or all pages site wide?
I have a Drupal site that utilizes Solr, with 10 menu pages and about 4,000 pages of content. Redoing a few things and we'll need to revamp the sitemap. Typically I'd jam all pages into a single sitemap and that's it, but post-Panda, should I do anything different?
Intermediate & Advanced SEO | | EricPacifico0