Correct robots.txt for WordPress
-
Hi. So I recently launched a website on WordPress (1 main page and 5 internal pages). The main page got indexed right off the bat, while the other pages seem to be blocked by robots.txt. Would you please look at my robots.txt file and tell me what's wrong?
I wanted to block the contact page, plugin elements, users' comments (I have a discussion space on every page of my website) and the site search section (to prevent duplicate pages from appearing in Google search results). It looks like one of the lines is blocking every page under "/" from indexing, even though everything seems right.
Thank you so much.
-
Me too. Can you upload or screenshot the actual file that you are using?
-
I have edited it down to:
User-Agent: *
Allow: /wp-content/uploads/
Disallow: /wp-content/plugins/
Disallow: /wp-admin/
Disallow: /contact/
Disallow: /refer/
It didn't help. I get a "Blocked by robots.txt" message after submitting the URL for indexing in Google Webmaster Tools. I'm really puzzled.
-
Hi, in addition to the answer that effectdigital gave, here is another option, optimised for WordPress:
User-Agent: *
Allow: /wp-content/uploads/
Disallow: /wp-content/plugins/
Disallow: /wp-admin/
Disallow: /readme.html
Disallow: /refer/
Sitemap: http://www.example.com/post-sitemap.xml
Sitemap: http://www.example.com/page-sitemap.xml
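Whichever version you go with, it's worth confirming what is actually being served at /robots.txt, since caching or SEO plugins can override the physical file. As a rough sketch (www.example.com is just a placeholder for your own domain), a couple of lines of Python will print the file exactly as a crawler would fetch it:
import urllib.request
# Print the live robots.txt as a crawler would see it (swap in your own domain)
print(urllib.request.urlopen("http://www.example.com/robots.txt").read().decode("utf-8"))
-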
It just seems overly complex, and there's way more in there than there needs to be.
I'd go with something that 'just' does what you have stated you want to achieve, and nothing else:
User-Agent: *
Disallow: /wp-content/plugins/
Disallow: /comments
Disallow: /*?s=
Disallow: /*&s=
Disallow: /search
See if that helps
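As a quick sanity check, the rules you posted earlier can also be run through Python's urllib.robotparser (a rough sketch below, with example.com standing in for your domain). They are plain prefix rules, so the parser understands them; if a normal internal page comes back as allowed, then the file Google is actually fetching probably differs from what you pasted, or Google is still working from a cached copy:
from urllib import robotparser
# The rules exactly as posted in the question (simple prefix rules)
rules = """\
User-Agent: *
Allow: /wp-content/uploads/
Disallow: /wp-content/plugins/
Disallow: /wp-admin/
Disallow: /contact/
Disallow: /refer/
"""
rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())
# A normal internal page should be crawlable, the contact page should not
print(rp.can_fetch("*", "http://www.example.com/some-internal-page/"))  # expected: True
print(rp.can_fetch("*", "http://www.example.com/contact/"))             # expected: False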
Related Questions
-
Will adding 1M (legitimate/correct) internal backlinks to an orphan page trip algo penalty?
We have a massive long-tail, user-generated gamification strategy that has worked really well. Because of that success we haven't been paying enough attention to SEO, and in looking around we caught some glaring issues. The long-tail section of our site goes from overview page > first classification > sub-classification > specific long-tail term page. It looks like we were relying on Google to crawl/use forms to go from our overview page to the first classification, BUT those resulting pages were orphaned, so www.mysite.com/product/category_1 defaulted back to the search page, creating duplicate issues. www.mysite.com/product/category_1, www.mysite.com/product/category_2 and www.mysite.com/product/category_3 all had duplicate content because they all reverted to the overview page. It's clear we need to build an actual breadcrumb trail and proper site taxonomy/linkage. I want to do this on just this one area first, but it's a big section with over 3M indexed "specific long-tail term pages". I want to add a simple breadcrumb trail in a sub-navigation menu, but doing so will literally create millions of new internal backlinks from specific term pages to their sub- and parent-category pages. Although we're missing the intermediary category breadcrumbs, we did have a breadcrumb coming back to the main overview page, and that was tagged nofollow. So now I'm contemplating adding millions of (proper) internal backlinks and removing a nofollow tag from another million internal backlinks. All of this seems in line with "best practices", but what I have not been able to determine is whether there is a proper/better way to roll these changes out so as to not trigger an algorithmic penalty. I am also hesitant about making too many changes too quickly, but these are SEO 101 basics that need to be rectified. Is it a mistake to make good improvements too quickly? Thanks!
-
Moz showing 384 description duplicates on my ecommerce store... when I download the CSV, most pages are coming from my WordPress blog. Why?
Hi, I am trying to investigate why I am getting 384 description duplicates on my ecommerce store (www.doggie-diva.com). When I download the CSV file from Moz, the majority of the pages it refers to are pages from my WordPress blog, which is hosted on a different server (blog.doggie-diva.com). I do have a link from my website to my WordPress blog and vice versa. Can you please explain why this is happening when I don't have duplicate content? Example of a page flagged from www.doggie-diva.com with duplicate content: http://blog.doggie-diva.com/tag/dog-gymnastics. Thanks, Rachel
-
Need suggestion: should user profile links be disallowed in robots.txt?
I maintain a myBB-based forum. The user profile links look something like this: http://www.learnqtp.com/forums/User-Ankur. Now in my GWT I can see many 404 errors for user profile links. This is primarily because we keep tight control over spam and auto-generated bot profiles. Either our moderators or our spam control software deletes such spammy member profiles on a periodic basis, but by then Google has already indexed those profiles. I am wondering, would it be a good idea to disallow user profile links using robots.txt? Something like: Disallow: /forums/User-*
-
WordPress Crawl Errors
I recently added WordPress to my site and get the following errors:
Duplicate Page Content: http://agrimapper.com/wordpress/ and http://agrimapper.com/wordpress/index.php. How do I define the canonical page on a .php page?
4XX (Client Error): http://agrimapper.com/wordpress/index.phpindex.php. Any ideas where the 4XX error comes from?
Thanks.
-
New CMS system - 100,000 old URLs - use robots.txt to block?
Hello. My website has recently switched to a new CMS system. Over the last 10 years or so, we've used 3 different CMS systems on our current domain. As expected, this has resulted in lots of URLs. Up until this most recent iteration, we were unable to 301 redirect or use any page-level indexation techniques like rel="canonical". Using SEOmoz's tools and GWT, I've been able to locate and redirect all pertinent, PageRank-bearing, "older" URLs to their new counterparts. However, according to Google Webmaster Tools' 'Not Found' report, there are literally over 100,000 additional URLs out there it's trying to find. My question is, is there an advantage to using robots.txt to stop search engines from looking for some of these older directories? Currently we allow everything, only using page-level robots tags to disallow where necessary. Thanks!
-
Right way to block Google robots from PPC landing pages
What is the right way to completely block SEO robots from my AdWords landing pages? Robots.txt does not work very well for that, as far as I know. Adding the meta tags noindex and nofollow, on the other hand, will block the AdWords robot as well, right? Thank you very much, Serge
-
URL structure for a new WordPress site
Hi, I'm building a new "next big thing" website from scratch (for a translation agency) and I have encountered an issue with the URL structure. I need to choose the URLs for important targeted keyword pages, and I have a conflict between two tools I'm using. Here is the situation: the domain is mashtranslation.com and the target keyword is "french translation services". Which URL do you think is better from an SEO point of view (and possibly for users): mashtranslation.com/services/french/ OR mashtranslation.com/french-translation-services/? I'm asking because one WordPress plugin (WordPress SEO by Yoast) says the URL structure is not optimised, while another tool (Market Samurai) says the URL is optimised.