Webmaster Crawl errors caused by Joomla menu structure.
-
Webmaster Tools is reporting crawl errors for pages that do not exist because of how my Joomla menu system works. For example, I have a menu item named "Service Area" that holds 3 sub-items, but there is no actual page for Service Area. This results in a URL like domainDOTcom/service-area/service-page.html
Because the Service Area menu item is constructed in a way that tells the bot it is a link, I am getting a 404 error saying it can't find domainDOTcom/service-area/ (the link is to "javascript:;"). Note that the error doesn't say domainDOTcom/service-area/javascript:; it just says /service-area/
What is the best way to handle this? Can I do something in robots.txt to tell the bot that /service-area/ itself should be ignored but any page under /service-area/ is good to go? Should I just mark the errors as fixed, since it's not really a 404 a human will encounter, or is it best to somehow explain this to the bot? I was advised on the Google forums to try this, but I'm nervous about it.
Disallow: /service-area/*
Allow: /service-area/summerlin-pool-service.
Allow: /service-area/north-las-vegas
Allow: /service-area/centennial-hills-pool-service
I tried a 301 redirect of /service-area to the home page, but then it pulls that out of the URL and my landing pages become 404s.
http://www.lvpoolcleaners.com/
Thanks for any advice!
Derrick
-
No problem Derrick, my pleasure.
Tom
-
Wow,
Tom, thank you for the amazingly complete and well-articulated response. You, kind sir, are an interwebs Rock Star!
-
Hi Derrick,
If you wish to use robots.txt, you could simply use:
Allow: /service-area/*
Disallow: /service-area/
This will allow access to any child of /service-area/ but not /service-area/.
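As a rough way to sanity-check rules like these, here is a minimal Python sketch of the precedence Google documents for its crawler: the longest matching rule path wins, and Allow wins a tie. The helper names `pattern_to_regex` and `is_allowed` are made up for illustration, and other crawlers may resolve conflicts differently. Under that assumption, the wildcard pair above appears to leave the bare /service-area/ crawlable, because `/service-area/*` also matches it and is the longer rule; an end-anchored `Disallow: /service-area/$` is the variant that singles out the bare directory.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern ('*' wildcard, optional trailing '$') into a regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and let each '*' match any run of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_allowed(rules, path: str) -> bool:
    """rules is a list of (directive, pattern) pairs.

    Assumed precedence: the longest matching pattern wins, and Allow wins a tie.
    A path that matches no rule is allowed.
    """
    best = None  # (pattern length, is_allow)
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), directive.lower() == "allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]


# The wildcard pair from this thread.
wildcard_rules = [("Allow", "/service-area/*"), ("Disallow", "/service-area/")]

# Hypothetical alternative that anchors the Disallow to the bare directory.
anchored_rules = [("Allow", "/service-area/"), ("Disallow", "/service-area/$")]

for path in ["/service-area/", "/service-area/north-las-vegas"]:
    print(path, is_allowed(wildcard_rules, path), is_allowed(anchored_rules, path))
```

Whichever variant you pick, it is worth confirming the result with the robots.txt tester in Webmaster Tools before relying on it.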
You could redirect this page to your homepage if you wished, and to stop children of this page being redirected you could use RedirectMatch instead of the Redirect directive and use a simple regular expression to only redirect if the URI ends with /service-area/, like this:
RedirectMatch 301 /service-area/?$ http://www.lvpoolcleaners.com/
The $ at the end tells Apache to redirect only if the URI ends with that pattern, and the ? after the trailing / makes the slash optional, so the redirect happens with or without it.
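To see which paths that pattern would catch, here is a small Python sketch. It assumes Python's `re.search` behaves like Apache's unanchored regex match for a pattern this simple; `would_redirect` is a hypothetical helper for illustration, not part of Apache.

```python
import re

# Same pattern as in the RedirectMatch line above.
PATTERN = re.compile(r"/service-area/?$")


def would_redirect(path: str) -> bool:
    """Return True if the URL path ends with /service-area or /service-area/."""
    return PATTERN.search(path) is not None


for path in [
    "/service-area",
    "/service-area/",
    "/service-area/summerlin-pool-service.html",
]:
    print(path, would_redirect(path))
```

Because the pattern is not anchored at the start, a path such as /foo/service-area/ would also match; adding a leading ^ would restrict it to the top-level directory.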
But perhaps the simplest solution to this problem would be to make your /service-area/ link point to '#', if the Joomla menu will allow it. This appends an empty anchor to the URL; it will not refresh or redirect the page, and anchors in URLs are not counted as duplicate URLs.
For human usability this would be the nicest way to interact with the menu, as you don't want a visitor being interrupted mid-way through their buying cycle by being sent back to the homepage when they didn't ask for it.
Hope that helps!