Best way to create robots.txt for my website
-
How can I create a robots.txt file for my website, guitarcontrol.com?
It has a login area and guitar lessons.
-
Hi,
First you need to understand what your website needs: decide which parts of it should not be crawled or indexed by search engine bots. For example, your website provides a user login and user areas; if you provide a private dashboard for your users, it should be blocked in robots.txt. Alternatively, you can use a robots meta tag on a particular page to keep that page out of the index. Note that bots can only see the meta tag on pages they are allowed to crawl. You can learn more about robots.txt here: https://moz.com/learn/seo/robotstxt
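As a rough illustration of blocking private areas, here is a minimal sketch that defines a robots.txt and checks it with Python's standard urllib.robotparser. The paths /login/ and /dashboard/, the lesson URL, and the example.com host are assumptions made up for the example; they are not taken from guitarcontrol.com.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: these paths are placeholders, not the site's real URLs.
ROBOTS_TXT = """\
User-agent: *
Disallow: /login/
Disallow: /dashboard/
"""


def is_allowed(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if `agent` may crawl `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Print the verdict for a few made-up paths.
    for path in ("/login/", "/dashboard/settings", "/lessons/beginner"):
        url = "https://www.example.com" + path
        verdict = "allowed" if is_allowed(ROBOTS_TXT, url) else "blocked"
        print(path, verdict)
```

Checking the rules this way before uploading the file can catch a Disallow line that blocks more (or less) than intended.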
Hope it helps
-
I see that you're on WordPress.
This CMS creates a "virtual" robots.txt. You can read about it here:
https://codex.wordpress.org/Search_Engine_Optimization_for_WordPress#Robots.txt_Optimization
But on your website the robots.txt returns an error, so you should check the web server log files (access and error) to see why this is happening. You may also need to look at .htaccess, because something is preventing this text file from being accessed.
There is an alternative way to use robots.txt in WordPress. All you need to do is create a new, blank robots.txt in the same folder and put this in it:
User-agent: *
Disallow:
Then save the file and that's all. Now the bad news: WP can't control indexing and crawling through its virtual robots.txt anymore, because the physical file is served instead.
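To see what an empty Disallow line means in practice, here is a small sketch using Python's standard urllib.robotparser. It compares the two-line file above with a "Disallow: /" variant; the example.com URL is a made-up placeholder.

```python
from urllib.robotparser import RobotFileParser

# The two-line file from the answer above: an empty Disallow blocks nothing.
ALLOW_ALL = "User-agent: *\nDisallow:\n"
# For contrast: a single slash blocks the whole site.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"


def can_crawl(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if `agent` may crawl `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    url = "https://www.example.com/any/page"  # hypothetical URL
    print("empty Disallow:", can_crawl(ALLOW_ALL, url))
    print("Disallow: /   :", can_crawl(BLOCK_ALL, url))
```

The difference between the two files is a single character, so it is worth double-checking which one ends up on the server.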
Related Questions
-
No description on Google/Yahoo/Bing, updated robots.txt - what is the turnaround time or next step for visible results?
Hello, New to the Moz community and thrilled to be learning alongside all of you! One of our clients' sites is currently showing a 'blocked' meta description due to an old robots.txt file (e.g. "A description for this result is not available because of this site's robots.txt"). We have updated the site's robots.txt to allow all bots. The meta tag has also been updated in WordPress (via the SEO Yoast plugin). See image here of Google listing and site URL: http://imgur.com/46wajJw I have also ensured that the most recent robots.txt has been submitted via Google Webmaster Tools. When can we expect these results to update? Is there a step I may have overlooked? Thank you, Adam
Technical SEO | adamhdrb
-
Which one is the best
Dear SEO experts, 1.5 months ago I started an informative website on a blank, newly registered domain name. Now, one month later, I've filled the website with content and done a lot of link building. Yesterday I bought a domain name out of quarantine; it is around 6 years old and already has a bunch of backlinks. What should I do next? The first one has good content and good, recent link building. The second is a better domain name, is old, has old backlinks, and has higher PA and DA than the first one. Should I go with the first one and 301 redirect the old domain name to the new one? Or should I do the opposite: 301 redirect the new website to the old domain name, move all content to the old domain name, and try to move all link building to the older domain? Hopefully someone can give me a great answer. Thank you so much! Kind regards, Menno
Technical SEO | MennoO
-
Google (GWT) says my homepage and posts are blocked by Robots.txt
Hi guys, I have a very annoying issue. My WordPress blog over at www.Trovatten.com has some indexation problems. Google Webmaster Tools (GWT) says the following: "Sitemap contains urls which are blocked by robots.txt." and shows me my homepage and my blog posts. This is my robots.txt (http://www.trovatten.com/robots.txt):
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
Do you have any idea why it says that the URLs are being blocked by robots.txt when that looks how it should?
I've read in a couple of places that it can be caused by a WordPress plugin that creates a virtual robots.txt, but I can't validate it.
1. I have set WP-Privacy to crawl my site.
2. I have deactivated all WP plugins and I still get the same GWT warnings.
Looking forward to hearing if you have an idea that might work!
Technical SEO | FrederikTrovatten22
-
Website hacked
Hi, I've been asked to help a colleague with his website. It seems to be hacked. He recently received an e-mail from Google saying his AdWords account was suspended 'due to high probability his site may be hosting or distributing malicious software'. I just checked his source and there seems to be loads of weird code on his pages, which would not have been put there by any of the website owners. Please see the attached image of what happens when we try to access his website via Google search. I just contacted the hosting provider. Does anyone have experience with this, and with how to prevent such hacking in the future? The site is built using HTML with no CMS.
Technical SEO | Socialdude
-
Robots.txt pattern matching
Hola fellow SEO peoples! Site: http://www.sierratradingpost.com robot: http://www.sierratradingpost.com/robots.txt Please see the following line: Disallow: /keycodebypid~* We are trying to block URLs like this: http://www.sierratradingpost.com/keycodebypid~8855/for-the-home~d~3/kitchen~d~24/ but we still find them in the Google index. 1. we are not sure if we need to specify the robot to use pattern matching. 2. we are not sure if the format is correct. Should we use Disallow: /keycodebypid*/ or /*keycodebypid/ or even /*keycodebypid~/? What is even more confusing is that the meta robot command line says "noindex" - yet they still show up. <meta name="robots" content="noindex, follow, noarchive" /> Thank you!
Technical SEO | STPseo
-
Robots.txt
Should I add anything else besides User-Agent: * to my robots.txt file? http://melo4.melotec.com:4010/
Technical SEO | Romancing
-
Robots.txt
Hi there, My question relates to the robots.txt file. This statement: /*/trackback Would this block domain.com/trackback and domain.com/fred/trackback ? Peter
Technical SEO | PeterM22
-
Trying to reduce pages crawled to within 10K limit via robots.txt
Our site has far too many pages for our 10K page PRO account which are not SEO worthy. In fact, only about 2000 pages qualify for SEO value. Limitations of the store software only permit me to use robots.txt to sculpt the rogerbot site crawl. However, I am having trouble getting this to work. Our biggest problem is the 35K individual product pages and the related shopping cart links (at least another 35K); these aren't needed as they duplicate the SEO-worthy content in the product category pages. The signature of a product page is that it is contained within a folder ending in -p. So I made the following addition to robots.txt:
User-agent: rogerbot
Disallow: /-p/
However, the latest crawl results show the 10K limit is still being exceeded. I went to Crawl Diagnostics and clicked on Export Latest Crawl to CSV. To my dismay I saw the report was overflowing with product page links, e.g. www.aspenfasteners.com/3-Star-tm-Bulbing-Type-Blind-Rivets-Anodized-p/rv006-316x039354-coan.htm The value for the column "Search Engine blocked by robots.txt" = FALSE; does this mean blocked for all search engines? Then it's correct. If it means "blocked for rogerbot", then it shouldn't even be in the report, as the report seems to only contain 10K pages. Any thoughts or hints on trying to attain my goal would REALLY be appreciated, I've been trying for weeks now. Honestly - virtual beers for everyone! Carlo
Technical SEO | AspenFasteners