Wordpress error
-
On our Google Webmaster Tools I'm getting a Severe Health Warning regarding our Robot.txt file reading:
User-agent: *
Crawl-delay: 20User-agent: 008
Disallow: /I'm wondering how I can fix this and stop it happening again.
The site was hacked about 4 months ago but I thought we'd managed to clear things up.
Colin
-
This will be my first post on SEOmoz so bear with me
The way I understand it is that robots read the robots.txt file from top to bottom, and once they find a rule that applies to them they stop reading and begin crawling. So basically the robots.txt written as:
User-agent:*
Disallow:
Crawl-delay: 20
User-agent: 008
Disallow: /
would not have the desired result as user-agent 008 would first read the top guideline:
User-agent: *
Disallow:
Crawl-delay: 20
and then begin crawling your site, as it is first being told that All user-agents are disallowed to crawl no pages or directories.
The corrected way to write this would be:
User-agent: 008
Disallow: /
User-agent: *
Disallow:
Crawl-delay: 20
-
Hi Peter,
I've tested the robot.txt file in Webmaster Tools and it now seems to be working as it should and it seems Google is seeing the same file as I have on the server.
I'm afraid this side of things isn't' my area of expertise so it's been a bit of a minefield.
I've taken a subscription with sucuri.net and taken various other steps that hopefully will hel;p with security. But who knows?
Thanks,
Colin
-
Google is seeing the same Robots.txt content (in GWT) that you show in the physical file, right? I just want to make sure that, when the site was hacked, no changes were made that are showing different versions of files to Google. It sounds like that's not the case here, but it definitely can happen.
-
Blog isn't' showing now and my hosts say that the index.php file is missing from the directory but I can see it.
Strange.
Have contacted them again to see what the problem can be.
Bit of a wasted Saturday!
-
Thanks Keith. Just contacting out hosts.
Nightmare!
-
Looks like a 403 permissions problem, that's a server side error... Make sure you have the correct permissions set on the blog folder in IIS Personally I always host on Linux...
-
Mind you the whole blog is now showing an error message and cant' be viewed so looks like an afternoon of trial and error!
-
Thanks very much Keith. I've just edited the file as suggested.
I see the error but as I am the web guy I cant' figure out how to get rid of it.
I think it might be a plugin that's causing it so I'm going to disable the and re-able them one as a time.
I've just PM'd you by the way.
Thanks for your help Keith.
Colin
-
Use this:
**User-agent: * Disallow: /blog/wp-admin/ Disallow: /blog/wp-includes/ Sitemap: http://nile-cruises-4u.co.uk/sitemap.xml**
Any FYI, you have the following error on your blog:
Warning: is_readable() [function.is-readable]: open_basedir restriction in effect. File(D:\home\nile-cruises-4u.co.uk\wwwroot\blog/wp-content/plugins/D:\home\nile-cruises-4u.co.uk\wwwroot\blog\wp-content\plugins\websitedefender-wordpress-security/languages/WSDWP_SECURITY-en_US.mo) is not within the allowed path(s): (D:\home\nile-cruises-4u.co.uk\wwwroot) in D:\home\nile-cruises-4u.co.uk\wwwroot\blog\wp-includes\l10n.php on line **339 **
Get your web guy to look at that, it appears at the top of every blog page for me...
Hope that helps,
Keith
-
Thanks Keith.
Only part of our site is WP based. Would that be a problem using the example you kindly suggested?
-
I gave you an example of a basic robots.txt file that I use on one of my Wordpress sites above, I would suggest using that for now.
I would not bother messing around with crawl delay in robots.txt as Peter said above there are better ways to achieve this... Plus I doubt you need it any way.
Google caches the robots.txt info for about 24hrs normally in my experience... So it's possible the old cached version is still being used by Google.
-
Hi Guys,
Thanks so much for your help. As you say Troy, that's defintely not what I want.
I assumed when we were hacked (twice in 8 months) that it might have been a competitor as we are in a very competitive niche. Might be very wrong there but we have certainly lost our top ranking on Google.co.uk for our main key phrases and our now at about position 7 for the same key phrases after about 3 years at number 1.
So when I saw on Google Webmaster Tools yesterday that we had a severe health warning and that the Googlebot was being prevented crawling our site I thought it might be the aftereffects of the hack.
Today even though I changed the robot.txt file yesterday GWT is showing 1000 pages with errors, 285 Access Denied and 719 Not Found and this message: Googlebot is blocked from http://nile-cruises-4u.co.uk/
I've just tested the robot.txt via GWT and now get this message:
AllowedDetected as a directory; specific files may have different restrictionsSo maybe the pages will be able to access by Googlebot shortly and the Access Denied message will disappear.I've chaged the robot.txt file to
User-agent: *
Crawl-delay: 20But should I change it to a better version? Sorry guys, I'm an online travel agent and not great on coding and really techie stuff. Although I'm learning pretty quickly about the bad stuff!I seem to have a few problems getting this sorted and wonder if this is a part of why our page position is dropping? -
I would simplify your robots.txt to read something like:
**User-agent: * Disallow: /wp-admin/ Disallow: /wp-includes/ Sitemap: http://www.your-domain.com/sitemap.xml**
-
That's odd: "008" appears to be the user agent for "80legs", a custom crawler platform. I'm seeing it in other Robots.txt files.
-
I'm not 100% sure what he's seeing, but when I plug his robots.txt into the robots analysis tool, I get this back:
Googlebot blocked by line 5: Disallow: /
Detected as a directory; specific files may have different restrictions
However, when I gave the top "**User-agent: ***" the "Disallow: " it seemed to fix the problem. Like, it didn't understand that the **Disallow: / **was meant only for the 008 user-agent?
-
Not honestly sure what User-agent "008" is, but that seems harmless. Why the crawl delay? There are better ways to handle that than Robots.txt, if a crawler is giving you trouble.
Was there a specific message/error in GWT?
-
I think, if you have a robots.txt reading what you show above:
User-agent: * Crawl-delay: 20
User-agent: 008 Disallow: /
That just basically says, "Don't crawl my site at all" (The "Disallow: /" means, I'm not allowing anything to be crawled by any search engine that pays attention to robots.txt at all)
So...I'm guessing that's not what you want?
(Bah..ignore. "User-agent". I'm a fool)
Actually, this seems to have solved your issue...make sure you explicitly tell all other User-agents that they are allowed:
User-agent: * Disallow: Crawl-delay: 20
User-agent: 008 Disallow: /
The extra "Disallow:" under User-agent: * says "I'm not going to disallow anything to most user-agents." Then the Disallow under user-agent 008 seems to only apply to them.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Does changing template for a wordpress site affect SEO
Hi I work for an Inventory Management Software company and we already have a WordPress site but I am currently working on re-designing of our WordPress site and in this process, we are looking for moving to a new template. I want to know what will be the impact on SEO performance while taking a shift to a new template.
Intermediate & Advanced SEO | | Cin7_Marketing0 -
How to add dropdown menus in the Wordpress Cutline theme?
I've got a fan website for a cult US TV show (that also sells merchandise as an affiliate). The site's design is about five years old, and I should probably revamp it at some point: http://www.btvsonline.com/ Anyway, the site's Cutline theme (http://cutline.tubetorial.com/) does not support native headers, and the header menu shows only a text list of my top-level pages. There are many subpages below these pages, but it's hard for users to find them without going to different pages to try to find things. What I'm looking for: if someone hovers above "Store" in the menu, a dropdown menu showing all the product pages would appear. I've tried to find Wordpress plugins or PHP hacks to add dropdown menus and I have gone through Cutline's support site, but I have had no luck. Anyone in the Moz community have any advice? Perhaps I just need to revamp the site with a new and modern Wordpress theme? Thanks in advance for any thoughts!
Intermediate & Advanced SEO | | SamuelScott0 -
Do Structured Data Errors Hurt Your Rankings
Hi, I noticed we had a 161 structured data erros. Almost all of them said "Update missing." I included screenshorts. Are these hurting me? If so, what can I do to fix them? Thanks, Ruben UfzeAFi YjLhimj
Intermediate & Advanced SEO | | KempRugeLawGroup0 -
Using AggregateReview markup to generate star sinppets for wordpress blog posts? OK or Spam?
Do you believe using a plugin that allows visitors to rate your blog posts and then generating AggregateReview markup code for each blog post is in accordance with google guidelines? I saw there are a couple of wordpress plugins that offer this functionality. Anybody who had any negative experience? Star icons will certainly increase visibility in SERP.
Intermediate & Advanced SEO | | lcourse0 -
Duplicate Content on Wordpress b/c of Pagination
On my recent crawl, there were a great many duplicate content penalties. The site is http://dailyfantasybaseball.org. The issue is: There's only one post per page. Therefore, because of wordpress's (or genesis's) pagination, a page gets created for every post, thereby leaving basically every piece of content i write as a duplicate. I feel like the engines should be smart enough to figure out what's going on, but if not, I will get hammered. What should I do moving forward? Thanks!
Intermediate & Advanced SEO | | Byron_W0 -
Few questions regarding wordpress and indexing/no follow.
I'm using Yoast's Wordpress SEO plugin on my wordpress site which allows you to quickly set up nofollow / no index on specific taxonomies. I wanted to see what you guys thought was the best practice in setting up my various taxonomies. Would you noidex, but follow all of these, none of these, or just some of these: Categories, tags, media, author archives ( (My blog is mainly a single author blog (me) but my wife does sometimes write posts. So I didn't know how this effected everything. Also I could simply make the blog a single user blog and just have her posts be guest posts, but I'd rather leave her as a user.), and date archives. The example I read on line only no-index's the date archives. Just curious what you guys thought. Thanks.
Intermediate & Advanced SEO | | NoahsDad0 -
Moving wordpress to main website - errors galore
Hi there A while back I asked whether I should move my established blog on wordpress over to my main website http://www.gardenbeet.com. The overwhelming response was to move it. I still have not seen any benefit other than problems (2mths later). Maybe something is Not Quite Right? Here is a list of issues? Any insights would be welcome. I have installed Yeost SEO plugins 1. A blog category (garden art) ended up outranking my main website for the term Garden Art - so I did a 301 on the garden art category - main website has now regained its ranking in the top 3 position (was number 2 before the move). 2. I have removed the categories from the blog as a 301 on the Garden Art catgegory would not make sense to a user. I decided to use tags as a navigational tool instead. I figured that because I have over 500 tags I could not have had the tags out rank my main website for any key term. So far correct. But now i have wanings from SEO moz 500 missing meta tags,(for the tags) - do i really have to write over 500 meta? over 160 long urls and titles - when i commenced the blog I had no idea that Urls and headings were linked in wordpress ( my developer does not think i should rename the urls and use a 301 as I already have a tonne due to a site rebuild) - is it OK to leave the long urls and fix the title only? On wordpress I had bewteen 400-900 users on my blog a day (using wordpress analytics) - now only 200 (using GA) - Yes I now have increased links to my website but have seen no imrpovement on my SERPS - will i see an improvement in my rankings? my wordpress site use to have page rank of 3 -
Intermediate & Advanced SEO | | GardenBeet0