Drupal infinite URL depth? SEOMOZ treating as duplicate content
-
I'm monitoring a subdirectory of my site on SEOMOZ but with catastrophic results. It's finding infinite duplicate content e.g.www.example.co.uk/product/samples/product/product/productand so on...
The website is running on Drupal. Do you have any ideas on how I can solve this?
-
I'm having this same issue with a new drupal site. Does anyone know the underlying cause and how to fix it.
Would any relative path cause this?
Thanks.
-
Can you list the modules you're running? What e-Commerce module are you running?
-
I'm not a Drupal expert, but it sounds like you may have some kind of relative path that's getting perpetuated. Robots.txt could help as a patch, but I'd definitely want to solve the crawl problem, as this could spin out into other problems.
Have you tried a desktop crawler, like Xenu or Screaming Frog? Sorry, it's tough to diagnose without seeing the actual site, but it's almost got to be a relative path that's causing "/product" to keep being added to links.
-
Yes, anything deeper would also be blocked.
-
Thanks Scott, this is really helpful.
Out of interest, would disallowing '/product/samples/product' automatically stop the bots from indexing all the pages underneath this, too such as '/product/samples/product/product/product/'?
-
Try adding something like this to your robots.txt file:
User-agent: rogerbot
Disallow: /product/samples/product/
Disallow: /product/samples2/product1/
Disallow: /product/samples3/product4/etc...
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Duplicate Footer Content Issue
Please check given screenshot URL. As per the screenshot we are using highlighted content through out the website in the footer section of our website (https://www.mastersindia.co/) . So, please tell us how Google will treat this content. Will Google count it as duplicate content or not? What is the solution in case if the Google treat it as duplicate content. Screenshot URL: https://prnt.sc/pmvumv
Technical SEO | | AnilTanwarMI0 -
Duplicate Content in Wordpress.com
Hi Mozers! I have a client with a blog on wordpress.com. http://newsfromtshirts.wordpress.com/ It just had a ranking drop because of a new Panda Update, and I know it's a Dupe Content problem. There are 3900 duplicate pages, basically because there is no use of noindex or canonical tag, so archives, categories pages are totally indexed by Google. If I could install my usual SEO plugin, that would be a piece of cake, but since Wordpress.com is a closed environment I can't. How can I put a noindex into all category, archive and author peges in wordpress.com? I think this could be done by writing a nice robot.txt, but I am not sure about the syntax I shoud use to achieve that. Thank you very much, DoMiSol Rossini
Technical SEO | | DoMiSoL0 -
Duplicate content
I'm getting an error showing that two separate pages have duplicate content. The pages are: | Help System: Domain Registration Agreement - Registrar Register4Less, Inc. http://register4less.com/faq/cache/11.html 1 27 1 Help System: Domain Registration Agreement - Register4Less Reseller (Tucows) http://register4less.com/faq/cache/7.html | These are both registration agreements, one for us (Register4Less, Inc.) as the registrar, and one for Tucows as the registrar. The pages are largely the same, but are in fact different. Is there a way to flag these pages as not being duplicate content? Thanks, Doug.
Technical SEO | | R4L0 -
Issue: Duplicate Pages Content
Hello, Following the setting up of a new campaign, SEOmoz pro says I have a duplicate page content issue. It says the follwoing are duplicates: http://www.mysite.com/ and http://www.mysite.com/index.htm This is obviously true, but is it a problem? Do I need to do anything to avoid a google penalty? The site in question is a static html site and the real page only exsists at http://www.mysite.com/index.htm but if you type in just the domain name then that brings up the same page. Please let me know what if anything I need to do. This site by the way, has had a panda 3.4 penalty a few months ago. Thanks, Colin
Technical SEO | | Colski0 -
How do I fix duplicate content with the home page?
This is probably SEO 101, but I'm unsure what to do here... Last week my weekly crawl diagnostics were off the chart because http:// was not resolving to http://www...fixed that but now it's saying I have duplicate content on: http://www.......com http://www.......com/index.php How do I fix this? Thanks in advance!
Technical SEO | | jgower0 -
Seomoz is showing duplicate page content for my wordpress blog
Hi Everyone, My seomoz crawl diagnostics is indicating that I have duplicate content issues in the wordpress blog section of my site located at: http://www.cleversplash.com/blog/ What is the best strategy to deal with this? Is there a plugin that can resolve this? I really appreciate your help guys. Martin
Technical SEO | | RogersSEO0 -
Duplicate content
I am getting flagged for duplicate content, SEOmoz is flagging the following as duplicate: www.adgenerator.co.uk/ www.adgenerator.co.uk/index.asp These are obviously meant to be the same path so what measures do I take to let the SE's know that these are to be considered the same page. I have used the canonical meta tag on the Index.asp page.
Technical SEO | | IPIM0 -
URL Duplicate Content Issues (Website Transition)
Hey guys, I just transitioned my website and I have a question. I have built up all the link juice around my old url styles. To give you some clarity: My old CMS rendered links like this: www.example.com/sweatbands My new CMS renders links like this: www.example.com/sweatbands/ My new CMS's auto-sitemap also generates them with the slash on the end. Also throughout the website the CMS links to them with the slash at the end and i link to them without the slash (because it's what i am used to). I have the canonical without the slash. Should I just 301 to the version with the slash before google crawls again? I'm worried that i'll lose all the trust and ranking i built up to the one without the slash. I rank very high for certain keywords and some pages house a large portion of our traffic. What a mess! Help! 🙂
Technical SEO | | Hyrule0