How to remove hundreds of duplicate pages
-
Hi - while i was checking duplicate links, am finding hundreds of duplicates pages :-
-
having undefined after domain name and before sub page url
-
having /%5C%22/ after domain name and before the sub page url
-
Due to Pagination limits
Its a joomla site - http://www.mycarhelpline.com
Any suggestions - shall we use:-
-
301 redirect
-
leave these as standdstill
-
and what to do of pagination pages (shall we create a separate title tag n meta description of every pagination page as unique one)
thanks
-
-
Okay, I took a look at the plugin Ben recommended, and took another look at the SH404SEF one. The free one Ben recommended (http://extensions.joomla.org/extensions/site-management/sef/1063) looks like it can help out with some duplicate content - but what I recommend doing is getting the SH404SEF here http://anything-digital.com/sh404sef/features.html because it allows for setting up canonical tags and also gives you the option to add the rel=next feature to your paginated pages, which is one of your problem areas.
One thing I noticed though is that it specifically states it "automatically adds canonical tags to non-html pages" - so that means it will apply it automatically to Joomla's defaul pdf view, etc. While this is helpful, it may not solve the full issue of your duplicate pages with the undefined and "/%5C%22/" issue.
It does however state that it "removes duplicate URLs" - how it identifies and removes these, I am not sure. You may want to try it out because it is useful for other optimization tasks - or contact the owner for more information.
If the tool doesn't recognize and remove the duplicate pages caused by /undefined/ and "/%5C%22/" then you should disallow crawling of these in your robots.txt file. While you are in your robots.txt file you should remove the /images/ because you want those to be crawled - Joomla adds that in by default.
Because a lot of these pages have already been crawled, you should do a 301 on the duplicate pages to their matching page. This sounds like it will be a long process - this may be aided by the sh404sef plugin - not sure.
I just want to also add that I am in no way affiliated with any of these plugins.
-
The only way to solve the duplication error you are getting is to make the URL's distinct. Googlebot comes to your site and looks at the URL's and if they are not distinct it may not index them very well. I understand your site is showing up fine in the SERP's so this may be one of those items you place on a lower priority until later.
I think R.May knows Joomla so I'll refer to him on how to accomplish this but it may be worth it to make the adjustment. You may find the end result of making your page URL more distinct will actually increase your current SERPs. Just a thought.
Other than that. If your site isn't hurting and the only thing you are concerned about is the report in SEOmoz then I would move on and just make a mental note of it for later.
-
Hi ben - changing url is not well required as the site is getting good serp, however - the duplicacy issue to saveguard us from any future issue - is what we seek for
-
Hi - thanks for replying
-
For the dynamic url - Yes - at the initial start - it was missed and as on its not reqd somehow - as the pages are getting indexed well and good in SERP
-
For pagination - Where we needs this is like in our used car section, discount section& news section where multiple pages are created. shall we create separate title & meta description for every pagination page. is it ideally reqd ?
http://www.mycarhelpline.com/index.php?option=com_usedcar&view=category&Itemid=3
- 'undefined' & /%5C%22/ is coming as per report of SEOmoz and is almost on every page of site (except of home page) with the dynamic url after domain name are preceded with these 2 strings as per moz report
how to get this corrected - want to be preventive from this duplicacy n avoid getting a hit in future even if its going well now -
-
-
I'm not a Joomla expert but to make your URL's search engine friendly you are going to need to add an extension like this. That will allow you to make more distinct URLs that will not be considered "duplicate" anymore.
-
Joomla has soo many dup content issues, you have to know Joomla really well to avoid most of them. The biggest issue is you didn't enable the SEF URLs from the start and left the default index.php?option=com on most of them, which stuffs your URLs full of ugly parameters.
You can still enable this in your global options and with a quick edit to htaccess - but it will change all of your current URLs and you will need to 301 all of them, so that isn't a great option unless you are really suffering - and depending on if you are using J 1.6 or under, this is a time consuming nasty process. Also this is unlikely to get rid of any existing duplicate pages - but may make dealing with them and finding them easier.
I don't see the specific examples you posted though, where are you seeing "undefined" and "%5C%22/ " ?
You should implement rel=canonical on the correct version of each page. I recommend SH404SEF which is a Joomla plugin and makes this process easier - but it isn't free. I don't know of a good free plugin that does this, and Joomla's templates make doing this manually difficult.
Looking at it quickly, I also didn't notice any articles that were paginated, but you should try to follow the rel="next" and rel="prev" for paginated pages. This is likely something you will have to edit your Joomla core files to do.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
SEO Impact & Google Impact On Removing Product From Category Page for Ecommerce Site
Hello Experts, For my Ecommerce site previously I was showing products at category pages i.e. first all subcategories name after that list all products of all subcateogries. That also approx per category 500 products via load more feature. My query is now I am planning to show products only at Product Listing Page and not on Category pages so what will be SEO impact and how google will treat this? Thanks!
Intermediate & Advanced SEO | | Johny123450 -
I've got duplicate pages. For example, blog/page/2 is the same as author/admin/page/2\. Is this something I should just ignore, or should I create the author/admin/page2 and then 301 redirect?
I'm going through the crawl report and it says I've got duplicate pages. For example, blog/page/2 is the same as author/admin/page/2/ Now, the author/admin/page/2 I can't even find in WordPress, but it is the same thing as blog/page/2 nonetheless. Is this something I should just ignore, or should I create the author/admin/page2 and then 301 redirect it to blog/page/2?
Intermediate & Advanced SEO | | shift-inc0 -
404 Pages. Can I change it to do this without getting penalized ? I want to lower our bounce rate from these pages to encourage the user to continue on the site
Hi All, We have been streaming our site and got rid of thousands of pages for redundant locations (Basically these used to be virtual locations where we didn't have a depot although we did deliver there and most of them was duplicate/thin content etc ). Most of them have little if any link value and I didn't want to 301 all of them as we already have quite a few 301's already We currently display a 404 page but I want to improve on this. Current 404 page is - http://goo.gl/rFRNMt I can get my developer to change it, so it will still be a 404 page but the user will see the relevant category page instead ? So it will look like this - http://goo.gl/Rc8YP8 . We could also use Java script to show the location name etc... Would be be okay ? or would google see this as cheating. basically I want to lower our bounce rates from these pages but still be attractive enough for the user to continue in the site and not go away. If this is not a good idea, then any recommendations on improving our current 404 would be greatly appreciated. thanks Pete
Intermediate & Advanced SEO | | PeteC120 -
Duplicate Page Content - Shopify
Moz reports that there are 1,600+ pages on my site (Sportiqe.com) that qualify as Duplicate Page Content. The website sells licensed apparel, causing shirts to go into multiple categories (ie - LA Lakers shirts would be categorized in three areas: Men's Shirts, LA Lakers Shirts and NBA Shirts)It looks like "tags" are the primary cause behind the duplicate content issues: // Collection Tags_Example: : http://www.sportiqe.com/collections/la-clippers-shirts (Preferred URL): http://www.sportiqe.com/collections/la-clippers-shirts/la-clippers (URL w/ tag): http://sportiqe.com/collections/la-clippers-shirts/la-clippers (URL w/ tag, w/o the www.): http://sportiqe.com/collections/all-products/clippers (Different collection, w/ tag and same content)// Blog Tags_Example: : http://www.sportiqe.com/blogs/sportiqe/7902801-dispatch-is-back: http://www.sportiqe.com/blogs/sportiqe/tagged/elias-fundWould it make sense to do 301 redirects for the collection tags and use the Parameter Tool in Webmaster Tools to exclude blog post tags from their crawl? Or, is there a possible solution with the rel=cannonical tag?Appreciate any insight from fellow Shopify users and the Moz community.
Intermediate & Advanced SEO | | farmiloe0 -
On Page vs Off Page - Which Has a Greater Effect on Rankings?
Hi Mozzers, My site will be migrating to a new domain soon, and I am not sure how to spend my time. Should I be optimizing our content for keywords, improving internal linking, and writing new content - or should I be doing link building for our current domain (or the new one)? Is there a certain ratio that determines rankings which can help me prioritize these to-dos?, such as 70:30 in favor of link-building? Thanks for any help you can offer!
Intermediate & Advanced SEO | | Travis-W0 -
How to remove duplicate content, which is still indexed, but not linked to anymore?
Dear community A bug in the tool, which we use to create search-engine-friendly URLs (sh404sef) changed our whole URL-structure overnight, and we only noticed after Google already indexed the page. Now, we have a massive duplicate content issue, causing a harsh drop in rankings. Webmaster Tools shows over 1,000 duplicate title tags, so I don't think, Google understands what is going on. <code>Right URL: abc.com/price/sharp-ah-l13-12000-btu.html Wrong URL: abc.com/item/sharp-l-series-ahl13-12000-btu.html (created by mistake)</code> After that, we ... Changed back all URLs to the "Right URLs" Set up a 301-redirect for all "Wrong URLs" a few days later Now, still a massive amount of pages is in the index twice. As we do not link internally to the "Wrong URLs" anymore, I am not sure, if Google will re-crawl them very soon. What can we do to solve this issue and tell Google, that all the "Wrong URLs" now redirect to the "Right URLs"? Best, David
Intermediate & Advanced SEO | | rmvw0 -
Pricing Page vs. No Pricing Page
There are many SEO sites out there that have an SEO Pricing page, IMO this is BS. A SEO company cannot give every person the same quote for diffirent keywords. However, this is something we are currently debating. I don't want a pricing page, because it's a page full of lies. My coworker thinks it is a good idea, and that users look for a pricing page. Suggestions? If I had to build one (which I am debating against) is it better to just explain why pricing can be tricky? or to BS them like most sites do?
Intermediate & Advanced SEO | | SEODinosaur0 -
Removing pages from index
Hello, I run an e-commerce website. I just realized that Google has "pagination" pages in the index which should not be there. In fact, I have no idea how they got there. For example, www.mydomain.com/category-name.asp?page=3434532
Intermediate & Advanced SEO | | AlexGop
There are hundreds of these pages in the index. There are no links to these pages on the website, so I am assuming someone is trying to ruin my rankings by linking to the pages that do not exist. The page content displays category information with no products. I realize that its a flaw in design, and I am working on fixing it (301 none existent pages). Meanwhile, I am not sure if I should request removal of these pages. If so, what is the best way to request bulk removal. Also, should I 301, 404 or 410 these pages? Any help would be appreciated. Thanks, Alex0