Robots.txt is blocking Wordpress Pages from Googlebot?
-
I have a robots.txt file on my server, which I did not develop, it was done by the web designer at the company before me. Then there is a word press plugin that generates a robots.txt file. How Do I unblock all the wordpress pages from googlebot?
-
Delete everything under the following directives and you should be good.
User-agent: Googlebot
Disallow: /*/trackback
Disallow: /*/feed
Disallow: /*/comments
Disallow: /?
Disallow: /*?
Disallow: /page/
As a rule of thumb, it's not a good idea to use wild cards in your robots.txt file - you may be excluding an entire folder inadvertently.
-
Here is my robots.txt from google webmaster tools. These are the pages that are being blocked and I am not sure which of these to get rid of in order to unblock blog posts from being searched.
http://ensoplastics.com/theblog/?cat=743
http://ensoplastics.com/theblog/?p=240
These category pages and blog posts are blocked so do I delete the /? ...I am new to SEO and web development so I am not sure why the developer of this robots.txt file would block pages and posts in wordpress. It seems to me like that is the reason why someone has a blog so it can be searched and get more exposure for SEO purposes.
Sitemap: http://www.ensobottles.com/blog/sitemap.xml
User-agent: Googlebot
Disallow: /*/trackback
Disallow: /*/feed
Disallow: /*/comments
Disallow: /?
Disallow: /*?
Disallow: /page/
User-agent: *Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/plugins/
Disallow: /wp-content/themes/
Disallow: /trackback
Disallow: /commentsDisallow: /feed
-
I'm not sure to what extent your website is being blocked with the robots.txt file but it's pretty easy to diagnose. You'll first need to identify and confirm that googlebot is being blocked by typing in your web browser ~> www.mywebsite.com/robots.txt
If you see an entry such as "User-agent: *" or "User-agent: googlebot" being used in conjunction with "Disallow" then you know your website is being blocked with the robots.txt file. Given your situation you'll need to go through a two step process.
First, go into your wordpress plugin page and deactivate the plugin which generates your robots.txt file. Second, login to the root folder of your server and look for the robots.txt file. Lastly, change "Disallow" to "Allow" and that should work but you'll need to confirm by typing in the robots URL again.
Given the limited information in your question I hope that helps. If you run into any more issues don't hesitate to post them here.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Specific page does not index
Hi, First question: Working on the indexation of all pages for a specific client, there's one page that refuses to index. Google Search console says there's a robots.txt file, but I can't seem to find any tracks of that in the backend, nor in the code itself. Could someone reach out to me and tell me why this is happening? The page: https://www.brody.be/nl/assistentiewoningen/ Second question: Google is showing another meta description than the one our client gave in in Yoast Premium snippet. Could it be there's another plugin overwriting this description? Or do we have to wait for it to change after a specific period of time? Hope you guys can help
Intermediate & Advanced SEO | | conversal0 -
Google is indexing wrong page for search terms not on that page
I’m having a problem … the wrong page is indexing with Google, for search phrases “not on that page”. Explained … On a website I developed, I have four products. For example sake, we’ll say these four products are: Sneakers (search phrase: sneakers) Boots (search phrase: boots) Sandals (search phrase: sandals) High heels (search phrase: high heels) Error: What is going “wrong” is … When the search phrase “high heels” is indexed by Google, my “Sneakers” page is being indexed instead (and ranking very well, like #2). The page that SHOULD be indexing, is the “High heels” page (not the sneakers page – this is the wrong search phrase, and it’s not even on that product page – not in URL, not in H1 tags, not in title, not in page text – nowhere, except for in the top navigation link). Clue #1 … this same error is ALSO happening for my other search phrases, in exactly the same manner. i.e. … the search phrase “sandals” is ALSO resulting in my “Sneakers” page being indexed, by Google. Clue #2 … this error is NOT happening with Bing (the proper pages are correctly indexing with the proper search phrases, in Bing). Note 1: MOZ has given all my product pages an “A” ranking, for optimization. Note 2: This is a WordPress website. Note 3: I had recently migrated (3 months ago) most of this new website’s page content (but not the “Sneakers” page – this page is new) from an old, existing website (not mine), which had been indexing OK for these search phrases. Note 4: 301 redirects were used, for all of the OLD website pages, to the new website. I have tried everything I can think of to fix this, over a period of more than 30 days. Nothing has worked. I think the “clues” (it indexes properly in Bing) are useful, but I need help. Thoughts?
Intermediate & Advanced SEO | | MG_Lomb_SEO0 -
Large robots.txt file
We're looking at potentially creating a robots.txt with 1450 lines in it. This will remove 100k+ pages from the crawl that are all old pages (I know, the ideal would be to delete/noindex but not viable unfortunately) Now the issue i'm thinking is that a large robots.txt will either stop the robots.txt from being followed or will slow our crawl rate down. Does anybody have any experience with a robots.txt of that size?
Intermediate & Advanced SEO | | ThomasHarvey0 -
Will disallowing URL's in the robots.txt file stop those URL's being indexed by Google
I found a lot of duplicate title tags showing in Google Webmaster Tools. When I visited the URL's that these duplicates belonged to, I found that they were just images from a gallery that we didn't particularly want Google to index. There is no benefit to the end user in these image pages being indexed in Google. Our developer has told us that these urls are created by a module and are not "real" pages in the CMS. They would like to add the following to our robots.txt file Disallow: /catalog/product/gallery/ QUESTION: If the these pages are already indexed by Google, will this adjustment to the robots.txt file help to remove the pages from the index? We don't want these pages to be found.
Intermediate & Advanced SEO | | andyheath0 -
Backlinks to internal pages
Hi, Our website of 3K+ pages currently has more links coming to internal pages (2nd & 3rd Level), compared to links to homepage. Just wanted to know if this is bad for rankings ? Please share your thoughts. Thanks.
Intermediate & Advanced SEO | | Umesh-Chandra0 -
What is the proper way to execute 'page to page redirection'
I need to redirection every page of my website to a new url of another site I've made. I intend to add:"Redirect 301 /oldpage.html http://www.example.com/newpage.html"I will use the 301 per page to redirect every page of my site, but I'm confused that if I add:"Redirect 301 / http://mt-example.com/" it will redirect all of my pages to the homepage and ignore the URLs i have separately mentioned for redirection.Please guide me.
Intermediate & Advanced SEO | | NABSID0 -
Page Indexed but not Cached
A section of pages on my site are indexed (I know because they appear in SERPs if I copy and paste a sentence from the content), however according to the text-only cached version of the page they are not being read by Google.Why are they indexed event hough it seems like Google is not reading them..... or is Google in fact reading this text even though it seems like they should not be?Thanks for your assistance.
Intermediate & Advanced SEO | | theLotter0 -
Manipulate Googlebot
**Problem: I have found something wierd on the server log as below. the googlebot visit the folders and files which do not exist at all. there is no photo folder on the server, but googlebot visit the files inside the photo folder and return 404 error. ** I wonder if it is SEO hacking attempts, and how can someone manage to Manipulate Googlebot. ================================================== **66.249.71.200 - - [22/Aug/2012:02:31:53 -0400] "GET /robots.txt HTTP/1.0" 200 2255 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" ** **66.249.71.25 - - [22/Aug/2012:02:36:55 -0400] "GET /photo/pic24.html HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" 66.249.71.26 - - [22/Aug/2012:02:37:03 -0400] "GET /photo/pic20.html HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" 66.249.71.200 - - [22/Aug/2012:02:37:11 -0400] "GET /photo/pic22.html HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" 66.249.71.200 - - [22/Aug/2012:02:37:28 -0400] "GET /photo/pic19.html HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" 66.249.71.26 - - [22/Aug/2012:02:37:36 -0400] "GET /photo/pic17.html HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" 66.249.71.200 - - [22/Aug/2012:02:37:44 -0400] "GET /photo/pic21.html HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" **
Intermediate & Advanced SEO | | semer1