How to remove Duplicate content due to url parameters from SEOMoz Crawl Diagnostics
-
Hello all
I'm currently getting back over 8000 crawl errors for duplicate content pages . Its a joomla site with virtuemart and 95% of the errors are for parameters in the url that the customer can use to filter products.
Google is handling them fine under webmaster tools parameters but its pretty hard to find the other duplicate content issues in SEOMoz with all of these in the way.
All of the problem parameters start with
?product_type_
Should i try and use the robot.txt to stop them from being crawled and if so what would be the best way to include them in the robot.txt
Any help greatly appreciated.
-
Hi Tom
It took a while but I got there in the end. I was using joomla 1.5 and I downloaded a component called "tag meta" which allows you to insert tags including the canonical tag on specific urls or more importantly urls which begin in a certain way. Now how you use it depends on how your sef urls are set up or what sef component you are using but you can put a canonical tag on every url in a section that has view-all-products in it.
So in one of my examples I put a canonical tag pointing to /maternity-tops.html (my main category page for that section) on every url that began with /maternity-tops/view-all-products
I hope this if of help to you. It takes a bit of playing around with but it worked for me. The component also has fairly good documentation.
Regards
Damien
-
Damien,
Are you able to explain how you were able to do this within virtuemart?
Thanks
Tom
-
So leave the 5 pages of dresses as they are because they are all original but have the canonical tag on all of the filter parameters pointing to Page 1 of dresses.
Thank you for your help Alan
-
It should be on all versions of the page, all pointing to the one version.
Search engines will then see all as one page
-
Hi Alan
Thanks for getting back to me so fast. I'm slightly confused on this so an example might help One of the pages is http://www.funkybumpmaternity.com/Maternity-Dresses.html.
There are 5 pages of dresses with options on the left allowing you to narrow that down by color, brand, occasion and style. Every time you select an option on combination of options on the left for example red it will generate a page with only red dresses and a url of http://www.funkybumpmaternity.com/Maternity-Dresses/View-all-products.html?product_type_1_Colour[0]=Red&product_type_1_Colour_comp=find_in_set_any&product_type_id=1
The options available are huge which I believe is why i'm getting so many duplicate content content issues on SEOMoz pro. Google is handling the parameters fine.
How should I implement the canonical tag? Should I have a tag on all filter pages referencing page 1 of the dresses? Should pages 2-5 have the tag on them? If so would this mean that the dresses on these pages would not be indexed?
-
This sounds more like a case for a canonical tag,
dont exculed with robots.txt this is akin to cutting off your arm, because you have a spliter in your finger.
When you exclude use robots, link juce passing though links to these pages is lost.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Duplicate Page Content, Indexing and Rel Canonical Just DOUBLED! Need Advice to Fix
Last Friday (Penguin 5/2.1) my website shot way off the grid and I noticed in my MOZ PRO Campaign dashboard that all of the following just doubled in numbers on my website: duplicate page content, Google indexing, and rel canonicals. I also noticed that some of my pages, images, tags and categories now added a /page/2/ or a -2. I just changed noindex for tags, but indexing for media, pages, posts, and categories. I'm currently using All In One SEO for a plugin. Any advice would be much appreciated as I'm stuck on the issue. relconical.png Duplicate-Page-Content.png [Duplicate Content II](Duplicate Content II) index1.png
Moz Pro | | CelebrityPersonalTrainer0 -
Crawl report - duplicate page title/content issue
When the crawl report is finished, it is saying that there are duplicate content/page titles issues. However there is a canonical tag that is formatted correctly so just wondered if this was a bug or if anyone else was having the same issues? For example, I'm getting a error warning for this page http://www.thegreatgiftcompany.com/categories/categories_travel?sort=name_asc&searchterm=&page=1&layout=table
Moz Pro | | KarlBantleman0 -
Crawl Diagnostics returning duplicate content based on session id
I'm just starting to dig into crawl diagnostics and it is returning quite a few errors. Primarily, the crawl is indicating duplicate content (page titles, meta tags, etc), because of a session id in the URL. I have set-up a URL parameter in Google Webmaster Tools to help Google recognize the existence of this session id. Is there any way to tell the SEOMoz spider the same thing? I'd like to get rid of these errors since I've already handled them for the most part.
Moz Pro | | csingsaas0 -
"Issue: Duplicate Page Content " in Crawl Diagnostics - but sample pages are not related to page indicated with duplicate content
In the crawl diagnostics for my campaign, the duplicate content warnings have been increasing, but when I look at the sample pages that SEOMoz says have duplicate content, they are completely different pages from the page identified. They have different Titles, Meta Descriptions and HTML content and often are different types of pages, i.e. product page appearing as having duplicate content vs. a category page. Anyone know what could be causing this?
Moz Pro | | EBCeller0 -
Crawl Diagnostics Report Lacks Information
When I look at the crawl diagnostics, SEOMoz tells me there are 404 errors. This is understandable, because some pages were removed. What this report doesn't tell me is how those pages were discovered. This is a very important piece of information, because it would tell me there are links pointing to those pages, either internal or external. I believe the internal links have been removed. If the report told me how if found the link, I would be able to take immediate action. Without that information, I have to go so a lot of investigation. And when you have a million pages, that isn't easy. Some possibilities: The crawler remembered the page from the previous crawl. There was a link from an index page - i.e. it is in the database still There was an individual link from another story - so now there are broken links Ditto, but it in on a static index page The link was from an external source - I need to make a redirect Am I missing something, or is this a feature the SEO Moz crawler doesn't have yet? What can I do (other than check all my pages) to discover this?
Moz Pro | | loopyal0 -
Duplicate content & canonicals
Hi, Working on a website for a company that works in different european countries. The setup is like this: www.website.eu/nl
Moz Pro | | nvs.nim
www.website.eu/be
www.website.eu/fr
... You see that every country has it's own subdir, but NL & BE share the same language, dutch... The copywriter wrote some unique content for NL and for BE, but it isn't possible to write unique for every product detail page because it's pretty technical stuff that goes into those pages. Now we want to add canonical tags to those identical product pages. Do we point the canonical on the /be products to /nl products or visa versa? Other question regarding SEOmoz: If we add canonical tags to x-pages, do they still appear in the Crawl Errors "duplicate page content", or do we have to do our own math and just do "duplicate page content" minus "Rel canonical" ?0 -
SEOMOZ reports the statistics.
SEOMOZ reports the Statistics, but where do i manage & improve??? Simply Statistics is all about SEOMOZ??
Moz Pro | | webicers0