Wordpress Blog Blocked by Metarobots
-
Upon receiving my first crawl report from my new SEOmoz Pro account (yaay!), I found that the WordPress blog plugged into my site hasn't been getting crawled because it's blocked by meta robots.
I'm not a developer and have very little tech expertise, but a search dug up that the issue usually stems from the "Ask search engines not to index this site" option being selected under Settings > Privacy in WordPress.
On checking the blog, "Allow search engines to index this site" was already selected, so I'm unsure what else to check. My level of expertise means I'm not confident going into the back end of the site, and I don't have a tech guy on site to speak to.
Has anyone else had this problem? Is it common and will I need to consult a developer to get this fixed?
Many thanks in advance for your help!
-
I didn't think there were any issues with the blog being crawled. I'm not seeing any errors in webmaster tools, and I'm def not doing anything tricky on the server side.
I don't even go near that stuff for fear of breaking summat.
Really appreciate your help Barry.
All the best,
Pete
-
There shouldn't be a robots.txt file in the /blog section anyway; it should always be in the root. It was just something to have a look at.
I'm having a look just now and also don't see any problems.
You've nothing in the robots.txt file and nothing in meta-robots for the header.
There are 42 pages showing for the site: command and a similar number in your sitemap.xml, so I presume that's right. 6 pages in site:/blog, which again looks right.
I've tried using SEOmoz's tools on your site though, and it just tells me that your site doesn't resolve. Edit: managed to get it to resolve on the 3rd try for a crawl, but the on-page report card checker is still giving me problems.
Your site is definitely returning a 200 status when I check using any other tool though, so I'd get in touch with SEOmoz directly and see what's wrong with their tool - help@seomoz.org
Just to confirm you're not doing anything tricky server side to prevent scraping are you?
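For anyone wanting to repeat the sitemap comparison above, here is a minimal Python sketch that counts the URLs listed in a sitemap.xml. It assumes the file follows the standard sitemaps.org format; the function name and the sample sitemap contents are made up for illustration and are not taken from the site in question.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemaps.org sitemap files.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_urls(xml_text):
    """Return every URL found in a <loc> element of the sitemap text."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]


# Hypothetical sitemap contents, for illustration only.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://yoursite.com/</loc></url>
  <url><loc>http://yoursite.com/blog/</loc></url>
</urlset>"""

urls = sitemap_urls(sample)
print(len(urls), "URLs listed in the sitemap")
print([u for u in urls if "/blog" in u])
```

The count can then be compared by hand with what the site: command shows; a large gap in either direction is worth a closer look.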
-
Hi Barry,
Thanks for the reply, I'm checking out your recommendations now..
I checked http://debtmadesimple.co.uk/robots.txt and there is no Disallow for the blog.
I tried http://debtmadesimple.co/uk/wp-install/robots.txt but I can't access the file you speak of.
I will try to download the plugin you mentioned; it would be good to get access to the robots file nonetheless.
Thanks again!
Pete
-
Hi Zach,
First I'd like to thank you for the speedy reply, I really appreciate your help.
The URL of the blog is http://www.debtmadesimple.co.uk/blog/.
Thanks again!
Pete
-
If you're not taking Zach up on his offer, have a look at http://yoursite.com/robots.txt and see if it has
User-agent: *
Disallow: (your blog url in here)
If it does, you'll need to edit your robots.txt file so that nothing you want crawled is listed in the Disallow section. You can do this via FTP.
If it's in WP itself there may be another robots.txt file at http://yoursite.com/wp-install/robots.txt which, in theory, could also be preventing crawling if it has anything disallowed in there.
Again, editable via ftp or maybe this plugin - http://wordpress.org/extend/plugins/wp-robots-txt/
As the privacy setting already says the site should be public, it's probably not WP, but worth a look anyway.
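If reading the file by eye feels uncertain, the same check can be sketched in Python with the standard library's robots.txt parser. This is only an illustration: the `is_blocked` helper and the two sample robots.txt texts are hypothetical, and you would paste in the real contents of your own file.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(robots_txt, url, user_agent="*"):
    """Return True if the robots.txt text disallows `url` for `user_agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Hypothetical robots.txt contents, for illustration only.
blocking = "User-agent: *\nDisallow: /blog/\n"
open_rules = "User-agent: *\nDisallow:\n"

print(is_blocked(blocking, "http://yoursite.com/blog/"))
print(is_blocked(open_rules, "http://yoursite.com/blog/"))
```

An empty `Disallow:` line blocks nothing, so only the first sample would keep crawlers out of the blog.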
-
I'm a WP developer and an SEO, and I'd be more than willing to do some troubleshooting here on the forums for you. If Settings > Privacy is set to allow search engines to crawl, then I doubt it's a WordPress issue in itself, though a plugin could do this.
What is the URL of your site? You may have a robots.txt that is blocking search engine crawlers. I've also seen a thing where all URLs on the site are noindexed and nofollowed.
Let me know and I'll take a quick look for you.
Zach
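To check a page for the noindex/nofollow case mentioned above, a rough Python sketch is to scan the page source for a robots meta tag. The class and function names here are made up for illustration, and the sample HTML is hypothetical; in practice you would paste in the source of one of your own blog pages.

```python
from html.parser import HTMLParser


class MetaRobotsParser(HTMLParser):
    """Collects the directives from any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives += [d.strip().lower() for d in content.split(",") if d.strip()]


def robots_directives(html):
    """Return the list of meta robots directives found in the HTML text."""
    parser = MetaRobotsParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page source, for illustration only.
sample = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
print(robots_directives(sample))
```

An empty list means the page carries no robots meta tag; a list containing `noindex` means search engines are being asked not to index that page.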