How to remove duplicate content due to URL parameters from SEOMoz Crawl Diagnostics
-
Hello all
I'm currently getting back over 8000 crawl errors for duplicate content pages. It's a Joomla site with VirtueMart, and 95% of the errors are for parameters in the URL that the customer can use to filter products.
Google is handling them fine under Webmaster Tools parameters, but it's pretty hard to find the other duplicate content issues in SEOMoz with all of these in the way.
All of the problem parameters start with
?product_type_
Should I try to use robots.txt to stop them from being crawled, and if so, what would be the best way to include them in robots.txt?
Any help greatly appreciated.
-
Hi Tom
It took a while but I got there in the end. I was using Joomla 1.5 and I downloaded a component called "tag meta", which allows you to insert tags, including the canonical tag, on specific URLs or, more importantly, on URLs which begin in a certain way. How you use it depends on how your SEF URLs are set up or what SEF component you are using, but you can put a canonical tag on every URL in a section that has view-all-products in it.
So in one of my examples I put a canonical tag pointing to /maternity-tops.html (my main category page for that section) on every URL that began with /maternity-tops/view-all-products.
I hope this is of help to you. It takes a bit of playing around with, but it worked for me. The component also has fairly good documentation.
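To make the prefix rule concrete, here is a minimal Python sketch of the idea. It is not how the "tag meta" component works internally. The rule table, the `canonical_tag` function name, the `www.example.com` host and the case-insensitive matching are all assumptions made for illustration.

```python
from html import escape
from urllib.parse import urlsplit

# Hypothetical rule table: URL path prefix -> canonical path for that section.
CANONICAL_RULES = {
    "/maternity-tops/view-all-products": "/maternity-tops.html",
}


def canonical_tag(url, rules=CANONICAL_RULES):
    """Return a canonical <link> tag for url, or None if no prefix rule matches."""
    parts = urlsplit(url)
    # Assumption: prefixes are matched case-insensitively against the path only,
    # so the query string (the filter parameters) is ignored.
    path = parts.path.lower()
    for prefix, target in rules.items():
        if path.startswith(prefix):
            href = f"{parts.scheme}://{parts.netloc}{target}"
            return f'<link rel="canonical" href="{escape(href, quote=True)}" />'
    return None


if __name__ == "__main__":
    filtered = ("http://www.example.com/maternity-tops/view-all-products.html"
                "?product_type_1_Colour[0]=Red&product_type_id=1")
    print(canonical_tag(filtered))
```

Every filtered URL in the section gets the same tag pointing at the main category page, while URLs outside the section get no tag from this rule.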
Regards
Damien
-
Damien,
Can you explain how you did this within VirtueMart?
Thanks
Tom
-
So I should leave the 5 pages of dresses as they are, because they are all original, but put the canonical tag on all of the filter-parameter pages, pointing to page 1 of dresses.
Thank you for your help, Alan.
-
It should be on all versions of the page, all pointing to the one version.
Search engines will then see them all as one page.
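If you want to check which crawled URLs are really the same page with different filters, a rough Python sketch like this can group them. It assumes every filter parameter starts with `product_type_` (the prefix given in the question). The function names are made up for illustration. The stripped URL is only a grouping key, not necessarily the URL you would pick as the canonical target.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

FILTER_PREFIX = "product_type_"  # parameter prefix named in the question


def strip_filter_params(url, prefix=FILTER_PREFIX):
    """Return url with every query parameter whose name starts with prefix removed."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if not name.startswith(prefix)
    ]
    # The fragment is dropped; crawlers do not treat it as part of the page URL.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def group_variants(urls):
    """Group URLs that differ only by filter parameters under one stripped URL."""
    groups = defaultdict(list)
    for url in urls:
        groups[strip_filter_params(url)].append(url)
    return dict(groups)


if __name__ == "__main__":
    base = "http://www.funkybumpmaternity.com/Maternity-Dresses/View-all-products.html"
    crawled = [
        base + "?product_type_1_Colour[0]=Red&product_type_1_Colour_comp=find_in_set_any&product_type_id=1",
        base + "?product_type_id=1",
    ]
    for key, variants in group_variants(crawled).items():
        print(key, len(variants))
```

Each group with more than one entry is a set of filter variations that should all carry the same canonical tag. Whatever is left over in a crawl export after grouping is where the other duplicate content issues are.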
-
Hi Alan
Thanks for getting back to me so fast. I'm slightly confused on this, so an example might help. One of the pages is http://www.funkybumpmaternity.com/Maternity-Dresses.html.
There are 5 pages of dresses, with options on the left allowing you to narrow that down by color, brand, occasion and style. Every time you select an option or combination of options on the left, for example red, it generates a page with only red dresses and a URL of http://www.funkybumpmaternity.com/Maternity-Dresses/View-all-products.html?product_type_1_Colour[0]=Red&product_type_1_Colour_comp=find_in_set_any&product_type_id=1
The number of available options is huge, which I believe is why I'm getting so many duplicate content issues on SEOMoz Pro. Google is handling the parameters fine.
How should I implement the canonical tag? Should I have a tag on all filter pages referencing page 1 of the dresses? Should pages 2-5 have the tag on them? If so, would this mean that the dresses on these pages would not be indexed?
-
This sounds more like a case for a canonical tag.
Don't exclude them with robots.txt; that is akin to cutting off your arm because you have a splinter in your finger.
When you exclude pages with robots.txt, the link juice passing through links to those pages is lost.