Need Help With Robots.txt on Magento eCommerce Site
-
Hello, I'm having difficulty getting my robots.txt file configured properly. I'm getting error emails from Google products stating they can't view our products because they are being blocked, and this past week, in my SEO dashboard, the number of URLs receiving search traffic dropped by almost 40%.
Is there anyone who can offer assistance with a good template robots.txt file for a Magento eCommerce website?
The one I am currently using was found here: e-commercewebdesign.co.uk/blog/magento-seo/magento-robots-txt-seo.php - however, it now seems to be causing these problems with Google.
I also searched and found this thread: http://www.magentocommerce.com/wiki/multi-store_set_up/multiple_website_setup_with_different_document_roots#the_root_folder_robots.txt_file - but I felt I should get some additional help on properly configuring a robots.txt file for a Magento site.
Thanks in advance for any help. Please, let me know if you need more info to provide assistance.
-
You'd better back up your DB before doing that. Anyway, take a look at this Magento Connect extension: http://www.magentocommerce.com/magento-connect/MageWorx.com/extension/2852/seo-suite-enterprise#overview
or this one (it's by the same company):
http://www.mageworx.com/seo-suite-pro-magento-extension.html
-
Thank you very much. We'll give that a shot and see how it goes. What started us tinkering with the robots.txt file in the first place is that Bing Shopping told us it couldn't crawl our product images. In addition, our PDF files for product specs and manuals are all stored within the media folder. Do you have a suggestion for this? I'm thinking we should get rid of "Disallow: /media/" and replace it with the following (what do you think?):
Disallow: /media/aitmanufacturers/
Disallow: /media/bigtom_media/
Disallow: /media/css/
Disallow: /media/downloadable/
Disallow: /media/easybanner/
Disallow: /media/geoip/
Disallow: /media/icons/
Disallow: /media/import/
Disallow: /media/js/
Disallow: /media/productsfeed/
Disallow: /media/sales/
Disallow: /media/tmp/
Disallow: /media/UPS/
-
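Before deploying a rule set like this, it can help to simulate how Google-style pattern matching would treat your URLs. Below is a minimal Python sketch (a simplification: it checks Disallow rules only, ignoring Allow rules and longest-match precedence, and the sample paths are illustrative assumptions, not from your install):

```python
import re

# Proposed per-subfolder rules (replacing a blanket "Disallow: /media/")
DISALLOW = [
    "/media/css/",
    "/media/downloadable/",
    "/media/import/",
    "/media/js/",
    "/media/tmp/",
]

def rule_to_regex(pattern: str) -> "re.Pattern":
    """Convert a Google-style robots.txt path pattern to a regex:
    '*' matches any run of characters; a trailing '$' anchors the end."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = "".join(".*" if ch == "*" else re.escape(ch) for ch in pattern)
    return re.compile("^" + body + ("$" if anchored else ""))

def is_blocked(path: str) -> bool:
    """True if any Disallow pattern matches the path from the left."""
    return any(rule_to_regex(rule).match(path) for rule in DISALLOW)

# Product images would stay crawlable; internal folders would stay blocked.
print(is_blocked("/media/catalog/product/cache/1/image/widget.jpg"))  # False
print(is_blocked("/media/tmp/scratch.dat"))                           # True
```

With the per-subfolder approach, anything under /media/ not explicitly listed (such as product image caches or your spec PDFs) becomes crawlable again, so double-check the list covers everything you do want hidden.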
Hello,
Below is what I use. You need to have mod_rewrite enabled if you are going to disallow index.php, and even then it's still risky; that may be part of your issue. Robots.txt is important, but you need to know what you are doing, especially when disallowing as much as that UK site does.
Tyler
User-agent: *
Disallow: /*?
Disallow: /*.js$
Disallow: /*.css$
Disallow: /checkout/
Disallow: /catalogsearch/
Disallow: /review/
Disallow: /app/
Disallow: /downloader/
Disallow: /images/
Disallow: /js/
Disallow: /lib/
Disallow: /media/
Disallow: /*.php$
Disallow: /pkginfo/
Disallow: /report/
Disallow: /skin/
Disallow: /var/
Disallow: /customer/
Disallow: /enable-cookies/
Sitemap: http://domain.com/sitemap.xml
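To see what those wildcard rules actually block, here's a minimal Python sketch of Google-style robots.txt matching (a simplification, assuming Disallow rules only with no Allow/longest-match precedence; the sample URLs are hypothetical). Note that the blanket `/media/` rule also hides product images, which is the Bing Shopping problem mentioned above:

```python
import re

# Subset of the rules above, including the wildcard patterns
DISALLOW = ["/*?", "/*.js$", "/*.css$", "/*.php$", "/checkout/", "/media/"]

def rule_to_regex(pattern: str) -> "re.Pattern":
    """Convert a Google-style robots.txt path pattern to a regex:
    '*' matches any run of characters; a trailing '$' anchors the end."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = "".join(".*" if ch == "*" else re.escape(ch) for ch in pattern)
    return re.compile("^" + body + ("$" if anchored else ""))

def is_blocked(path: str) -> bool:
    """True if any Disallow pattern matches the path from the left."""
    return any(rule_to_regex(rule).match(path) for rule in DISALLOW)

print(is_blocked("/some-product.html"))              # False: clean rewritten URL
print(is_blocked("/some-product.html?sid=1"))        # True: /*? catches query strings
print(is_blocked("/index.php"))                      # True: /*.php$ needs mod_rewrite URLs
print(is_blocked("/media/catalog/product/img.jpg"))  # True: blanket /media/ hides images too
```

This is why the mod_rewrite warning matters: if clean URLs aren't enabled and pages are reached via index.php, the `/*.php$` rule can block real content, and `/media/` blocks product images unless you switch to per-subfolder rules.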