Site not being indexed as fast anymore. Is something wrong with this robots.txt?
-
My WordPress site's robots.txt used to be this:
User-agent: *
Disallow:
Sitemap: http://www.domainame.com/sitemap.xml.gz
I also have All in One SEO installed, and besides posts, tags are also set to index,follow on my site.
My new posts used to appear on Google within seconds of publishing. I changed the robots.txt to the following, and now post indexing takes hours.
Is there something wrong with this robots.txt?
User-agent: *
Disallow: /cgi-bin
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-content/plugins
Disallow: /wp-content/cache
Disallow: /wp-content/themes
Disallow: /wp-login.php
Disallow: /wp-login.php
Disallow: /trackback
Disallow: /feed
Disallow: /comments
Disallow: /author
Disallow: /category
Disallow: */trackback
Disallow: */feed
Disallow: */comments
Disallow: /login/
Disallow: /wget/
Disallow: /httpd/
Disallow: /*.php$
Disallow: /?
Disallow: /*.js$
Disallow: /*.inc$
Disallow: /*.css$
Disallow: /*.gz$
Disallow: /*.wmv$
Disallow: /*.cgi$
Disallow: /*.xhtml$
Disallow: /?
Disallow: /*?
Allow: /wp-content/uploads
User-agent: TechnoratiBot/8.1
Disallow:
# ia_archiver
User-agent: ia_archiver
Disallow: /
# disable duggmirror
User-agent: duggmirror
Disallow: /
# allow google image bot to search all images
User-agent: Googlebot-Image
Disallow: /wp-includes/
Allow: /*
# allow adsense bot on entire site
User-agent: Mediapartners-Google*
Disallow:
Allow: /*
-
I am not sure why you are disallowing file types. Google would not index .wmv or .js files anyway, as it cannot parse that type of file for data. If you want to coax Google into indexing your site, submit a sitemap in Webmaster Tools. You could also set nofollow on the anchors for the pages you want to exclude, and keep robots.txt cleaner by including only top-level subdirectories such as admin. There seem to be a lot of directories in there that do not relate to actual pages, and Google is only concerned with renderable pages.
-
Hello,
Robots.txt allows or disallows access to certain files or folders. It cannot delay or slow down access. I do not think the problem is the robots.txt.
Radu
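Whether any rule in the new file actually blocks a post URL can be checked directly. The following is a minimal sketch, not a definitive implementation: it assumes Google-style pattern matching (a pattern is matched from the start of the path, `*` matches any run of characters, a trailing `$` anchors the end), it ignores Allow precedence, and the helper names `rule_matches` and `blocking_rules` as well as the sample paths are hypothetical.

```python
import re

def rule_matches(pattern, path):
    """Return True if a robots.txt path pattern matches the URL path.

    Assumed semantics: match from the start of the path, '*' matches any
    run of characters, and a trailing '$' anchors the end of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with '.*' for every '*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    return re.match(regex, path) is not None

# A few Disallow patterns copied from the robots.txt in the question.
DISALLOW = ["/wp-admin", "/feed", "/category", "*/feed", "/*.php$", "/?", "/*?"]

def blocking_rules(path):
    """List the Disallow patterns above that match the given path."""
    return [p for p in DISALLOW if rule_matches(p, path)]

# Hypothetical paths: a pretty permalink, its feed, and a default ?p= permalink.
print(blocking_rules("/2012/05/my-new-post/"))      # []
print(blocking_rules("/2012/05/my-new-post/feed"))  # ['*/feed']
print(blocking_rules("/?p=123"))                    # ['/?', '/*?']
```

Under these assumptions a pretty permalink is not blocked by any of the sampled rules, while a default `?p=` style permalink would be caught by the query-string rules, so the result depends on the site's permalink structure.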
-
Why don't you revert back to the original robots.txt and determine for certain that the problem is with this file?
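Before reverting on the live site, the two files can also be compared offline. This is a rough sketch using Python's standard `urllib.robotparser`, which does plain prefix matching and does not interpret the `*` and `$` wildcards, so only a prefix-only subset of the new rules is included here. The sample URLs and the `load` and `compare` helpers are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

# The original file from the question: an empty Disallow allows everything.
OLD_RULES = """\
User-agent: *
Disallow:
"""

# Prefix-only subset of the new file (wildcard rules are left out because
# urllib.robotparser treats '*' and '$' as literal characters).
NEW_RULES = """\
User-agent: *
Disallow: /wp-admin
Disallow: /feed
Disallow: /comments
Disallow: /author
Disallow: /category
"""

def load(text):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser

def compare(urls, agent="Googlebot"):
    """Map each URL to (allowed under old rules, allowed under new rules)."""
    old, new = load(OLD_RULES), load(NEW_RULES)
    return {u: (old.can_fetch(agent, u), new.can_fetch(agent, u)) for u in urls}

# Hypothetical URLs on the domain used in the question.
SAMPLE_URLS = [
    "http://www.domainame.com/2012/05/my-new-post/",
    "http://www.domainame.com/category/news/",
    "http://www.domainame.com/feed/",
]

for url, (was_allowed, now_allowed) in compare(SAMPLE_URLS).items():
    print(url, "old:", was_allowed, "new:", now_allowed)
```

Any URL that flips from allowed to blocked is a candidate to examine first; if no post URL flips, the slowdown is less likely to come from the file itself.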