WordPress error
-
In Google Webmaster Tools I'm getting a Severe Health Warning regarding our robots.txt file, reading:
User-agent: *
Crawl-delay: 20
User-agent: 008
Disallow: /
I'm wondering how I can fix this and stop it happening again.
The site was hacked about 4 months ago but I thought we'd managed to clear things up.
Colin
-
This will be my first post on SEOmoz, so bear with me.
The way I understand it is that robots read the robots.txt file from top to bottom, and once they find a rule that applies to them they stop reading and begin crawling. So basically, a robots.txt written as:
User-agent: *
Disallow:
Crawl-delay: 20
User-agent: 008
Disallow: /
would not have the desired result, as user-agent 008 would first read the top group:
User-agent: *
Disallow:
Crawl-delay: 20
and then begin crawling your site, as it is first being told that all user-agents are disallowed from nothing; in other words, every page and directory may be crawled.
The corrected way to write this would be:
User-agent: 008
Disallow: /
User-agent: *
Disallow:
Crawl-delay: 20
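If it helps to see the grouping behaviour spelled out, below is a toy matcher in PHP. It's purely illustrative and only handles the directives in these examples (real parsers also handle Allow lines, wildcards, and longest-match rules). For what it's worth, modern parsers pick the group that names a crawler specifically regardless of where it sits in the file, so the grouping matters more than the ordering; putting the specific group first just makes the intent obvious.

```php
<?php
// Toy robots.txt matcher -- just to make the grouping behaviour concrete.
// A crawler picks the single group that applies to it and obeys only the
// rules inside that group; rules in other groups are irrelevant to it.

function parse_groups(string $txt): array {
    $groups = [];
    $g = null;
    $collecting = false; // still reading consecutive User-agent lines?
    foreach (preg_split('/\R/', $txt) as $line) {
        $line = trim($line);
        if ($line === '' || $line[0] === '#') {
            continue;
        }
        [$field, $value] = array_map('trim', explode(':', $line, 2) + [1 => '']);
        $field = strtolower($field);
        if ($field === 'user-agent') {
            if (!$collecting) {              // a new group starts here
                if ($g !== null) { $groups[] = $g; }
                $g = ['agents' => [], 'disallow' => []];
                $collecting = true;
            }
            $g['agents'][] = strtolower($value);
        } elseif ($g !== null) {
            $collecting = false;
            if ($field === 'disallow' && $value !== '') {
                $g['disallow'][] = $value;   // an empty Disallow: blocks nothing
            }
        }
    }
    if ($g !== null) { $groups[] = $g; }
    return $groups;
}

function pick_group(array $groups, string $agent): ?array {
    foreach ($groups as $g) {                // a group naming the agent wins...
        if (in_array(strtolower($agent), $g['agents'], true)) {
            return $g;
        }
    }
    foreach ($groups as $g) {                // ...otherwise fall back to "*"
        if (in_array('*', $g['agents'], true)) {
            return $g;
        }
    }
    return null;                             // no group at all: crawl freely
}

$robots = "User-agent: 008\nDisallow: /\n\nUser-agent: *\nDisallow:\nCrawl-delay: 20\n";
var_dump(pick_group(parse_groups($robots), '008')['disallow']);       // ["/"]
var_dump(pick_group(parse_groups($robots), 'Googlebot')['disallow']); // []
```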
-
Hi Peter,
I've tested the robots.txt file in Webmaster Tools and it now seems to be working as it should, and Google appears to be seeing the same file as I have on the server.
I'm afraid this side of things isn't my area of expertise, so it's been a bit of a minefield.
I've taken out a subscription with sucuri.net and taken various other steps that hopefully will help with security. But who knows?
Thanks,
Colin
-
Google is seeing the same robots.txt content (in GWT) that you show in the physical file, right? I just want to make sure that, when the site was hacked, no changes were made that are showing different versions of files to Google. It sounds like that's not the case here, but it definitely can happen.
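If you ever want to spot-check for that kind of cloaking, one rough approach is to fetch the file twice with different user-agent strings and compare the responses. A minimal PHP sketch (the URL is just your live robots.txt; note that determined cloaking often keys off Googlebot's IP ranges too, so a clean result here isn't conclusive):

```php
<?php
// Fetch robots.txt as a normal browser and again while claiming to be
// Googlebot, then compare. Hacked sites sometimes serve different
// content depending on the requesting user-agent.
$url = 'http://nile-cruises-4u.co.uk/robots.txt';
$agents = [
    'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36',
    'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)',
];
foreach ($agents as $ua) {
    $ctx = stream_context_create(['http' => ['user_agent' => $ua]]);
    echo "--- fetched as: {$ua} ---\n";
    echo file_get_contents($url, false, $ctx), "\n";
}
```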
-
The blog isn't showing now, and my hosts say that the index.php file is missing from the directory, but I can see it.
Strange.
Have contacted them again to see what the problem can be.
Bit of a wasted Saturday!
-
Thanks Keith. Just contacting our hosts.
Nightmare!
-
Looks like a 403 permissions problem; that's a server-side error... Make sure you have the correct permissions set on the blog folder in IIS. Personally, I always host on Linux...
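If you want to rule out "file genuinely missing" versus "permissions/handler problem", a throwaway script like the one below (a hypothetical check.php dropped into the blog folder) shows whether PHP itself can see and read index.php:

```php
<?php
// Hypothetical check.php -- drop it in the blog folder and load it in a
// browser. If PHP reports the file exists but the server still 403s,
// you're looking at an IIS permissions/handler issue, not a missing file.
$file = __DIR__ . '/index.php';
var_dump(file_exists($file));   // does PHP see the file at all?
var_dump(is_readable($file));   // can it be read by the PHP process?
echo get_current_user(), "\n";  // owner of the running script, for reference
```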
-
Mind you, the whole blog is now showing an error message and can't be viewed, so it looks like an afternoon of trial and error!
-
Thanks very much Keith. I've just edited the file as suggested.
I see the error, but as I am the web guy, I can't figure out how to get rid of it.
I think it might be a plugin that's causing it, so I'm going to disable them and re-enable them one at a time.
I've just PM'd you by the way.
Thanks for your help Keith.
Colin
-
Use this:
User-agent: *
Disallow: /blog/wp-admin/
Disallow: /blog/wp-includes/
Sitemap: http://nile-cruises-4u.co.uk/sitemap.xml
An FYI: you have the following error on your blog:
Warning: is_readable() [function.is-readable]: open_basedir restriction in effect. File(D:\home\nile-cruises-4u.co.uk\wwwroot\blog/wp-content/plugins/D:\home\nile-cruises-4u.co.uk\wwwroot\blog\wp-content\plugins\websitedefender-wordpress-security/languages/WSDWP_SECURITY-en_US.mo) is not within the allowed path(s): (D:\home\nile-cruises-4u.co.uk\wwwroot) in D:\home\nile-cruises-4u.co.uk\wwwroot\blog\wp-includes\l10n.php on line 339
Get your web guy to look at that; it appears at the top of every blog page for me...
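I haven't seen that plugin's source, but a doubled path like that is the classic symptom of passing an absolute path as the third argument of load_plugin_textdomain(), which WordPress expects to be relative to the plugins directory (so it prepends the plugins path to a string that already contains it). A guess at what's going on, with the text domain inferred from the .mo filename:

```php
<?php
// Hypothetical reconstruction -- the domain name and paths are inferred
// from the warning text, not taken from the plugin's actual source.

// Likely buggy call inside the plugin: dirname(__FILE__) is an absolute
// Windows path, so WordPress builds plugins-dir + absolute-path, which is
// exactly the doubled File(...) string in the warning.
load_plugin_textdomain('WSDWP_SECURITY', false, dirname(__FILE__) . '/languages');

// Probable fix: a path relative to wp-content/plugins instead.
load_plugin_textdomain('WSDWP_SECURITY', false, dirname(plugin_basename(__FILE__)) . '/languages');
```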
Hope that helps,
Keith
-
Thanks Keith.
Only part of our site is WordPress-based. Would it be a problem to use the example you kindly suggested?
-
I gave you an example of a basic robots.txt file that I use on one of my WordPress sites above; I would suggest using that for now.
I would not bother messing around with crawl-delay in robots.txt; as Peter said above, there are better ways to achieve this... Plus, I doubt you need it anyway.
Google normally caches the robots.txt info for about 24 hours in my experience... so it's possible the old cached version is still being used by Google.
-
Hi Guys,
Thanks so much for your help. As you say Troy, that's definitely not what I want.
I assumed when we were hacked (twice in 8 months) that it might have been a competitor, as we are in a very competitive niche. I might be very wrong there, but we have certainly lost our top ranking on Google.co.uk for our main key phrases and are now at about position 7 for the same key phrases after about 3 years at number 1.
So when I saw on Google Webmaster Tools yesterday that we had a severe health warning and that Googlebot was being prevented from crawling our site, I thought it might be the after-effects of the hack.
Today, even though I changed the robots.txt file yesterday, GWT is showing 1,000 pages with errors (285 Access Denied and 719 Not Found) and this message: Googlebot is blocked from http://nile-cruises-4u.co.uk/
I've just tested the robots.txt via GWT and now get this message:
Allowed
Detected as a directory; specific files may have different restrictions
So maybe the pages will be accessible to Googlebot shortly and the Access Denied message will disappear. I've changed the robots.txt file to:
User-agent: *
Crawl-delay: 20
But should I change it to a better version? Sorry guys, I'm an online travel agent and not great on coding and really techie stuff. Although I'm learning pretty quickly about the bad stuff! I seem to have a few problems getting this sorted and wonder if this is part of why our page position is dropping?
-
I would simplify your robots.txt to read something like:
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
Sitemap: http://www.your-domain.com/sitemap.xml
-
That's odd: "008" appears to be the user-agent for "80legs", a custom crawler platform. I'm seeing it in other robots.txt files.
-
I'm not 100% sure what he's seeing, but when I plug his robots.txt into the robots analysis tool, I get this back:
Googlebot blocked by line 5: Disallow: /
Detected as a directory; specific files may have different restrictions
However, when I gave the top "User-agent: *" its own "Disallow:" line, it seemed to fix the problem. It's like the tool didn't understand that the "Disallow: /" was meant only for the 008 user-agent?
-
Not honestly sure what user-agent "008" is, but that seems harmless. Why the crawl-delay? There are better ways to handle that than robots.txt, if a crawler is giving you trouble.
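For example, blocking or throttling at the server is the usual route. As a quick WordPress-level stopgap you could do something like the sketch below in a small mu-plugin; matching on "80legs" in the user-agent string is an assumption, so check your access logs for what the crawler actually sends first:

```php
<?php
// Sketch of a WordPress mu-plugin that refuses a troublesome crawler.
// Assumption: the "008" crawler identifies itself with "80legs" in its
// user-agent string -- verify against your access logs before relying on it.
add_action('init', function () {
    $ua = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
    if (stripos($ua, '80legs') !== false) {
        status_header(403); // turn the request away outright
        exit;
    }
});
```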
Was there a specific message/error in GWT?
-
I think, if you have a robots.txt reading what you show above:
User-agent: *
Crawl-delay: 20
User-agent: 008
Disallow: /
That just basically says, "Don't crawl my site at all" (The "Disallow: /" means, I'm not allowing anything to be crawled by any search engine that pays attention to robots.txt at all)
So...I'm guessing that's not what you want?
(Bah... ignore that; I'd missed the second "User-agent" line. I'm a fool.)
Actually, this seems to have solved your issue... make sure you explicitly tell all other user-agents that they are allowed:
User-agent: *
Disallow:
Crawl-delay: 20
User-agent: 008
Disallow: /
The extra "Disallow:" under User-agent: * says "I'm not going to disallow anything to most user-agents." Then the Disallow under user-agent 008 seems to only apply to them.