Robots.txt query
-
Quick question, if this appears in a clients robots.txt file, what does it mean?
Disallow: /*/_/
Does it mean no pages can be indexed? I have checked and there are no pages in the index but it's a new site too so not sure if this is the problem.
Thanks
Karen
-
Thank you so much, that is a great help!
-
That blocks all spiders from viewing those pages. I am not sure what and who did the /* /_/, but unless there is something there they don't want indexed then it is not necessary to keep it.
One thing you mind want to keep in mind as well, just because you block it on robots txt, doesn't mean a spider can't still go there.
Sometimes they don't listen to the robots txt(looking at you baidu)
-
User-agent: *
Thanks for your response.
-
What is the user agent?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Should I add my html sitemap to Robots?
I have already added the .xml to Robots. But should I also add the html version?
Technical SEO | | Trazo0 -
Should a login page for a payroll / timekeeping comp[any be no follow for robots.txt?
I am managing a Timekeeping/Payroll company. My question is about the customer login page. Would this typically be nofollow for robots?
Technical SEO | | donsilvernail0 -
How can I make it so that robots.txt is not ignored due to a URL re-direct?
Recently a site moved from blog.site.com to site.com/blog with an instruction like this one: /etc/httpd/conf.d/site_com.conf:94: ProxyPass /blog http://blog.site.com
Technical SEO | | rodelmo4
/etc/httpd/conf.d/site_com.conf:95: ProxyPassReverse /blog http://blog.site.com It's a Wordpress.org blog that was set as a subdomain, and now is being redirected to look like a directory. That said, the robots.txt file seems to be ignored by Google bot. There is a Disallow: /tag/ on that file to avoid "duplicate content" on the site. I have tried this before with other Wordpress subdomains and works like a charm, except for this time, in which the blog is rendered as a subdirectory. Any ideas why? Thanks!0 -
Robots User-agent Query
Am I correct in saying that the allow/disallow is only applied to msnbot_mobile? mobile robots file User-agent: Googlebot-Mobile User-agent: YahooSeeker/M1A1-R2D2 User-agent: MSNBOT_Mobile Allow: / Disallow: /1 Disallow: /2/ Disallow: /3 Disallow: /4/
Technical SEO | | ThomasHarvey1 -
Subdomain Removal in Robots.txt with Conditional Logic??
I would like to see if there is a way to add conditional logic to the robots.txt file so that when we push from DEV to PRODUCTION and the robots.txt file is pushed, we don't have to remember to NOT push the robots.txt file OR edit it when it goes live. My specific situation is this: I have www.website.com, dev.website.com and new.website.com and somehow google has indexed the DEV.website.com and NEW.website.com and I'd like these to be removed from google's index as they are causing duplicate content. Should I: a) add 2 new GWT entries for DEV.website.com and NEW.website.com and VERIFY ownership - if I do this, then when the files are pushed to LIVE won't the files contain the VERIFY META CODE for the DEV version even though it's now LIVE? (hope that makes sense) b) write a robots.txt file that specifies "DISALLOW: DEV.website.com/" is that possible? I have only seen examples of DISALLOW with a "/" in the beginning... Hope this makes sense, can really use the help! I'm on a Windows Server 2008 box running ColdFusion websites.
Technical SEO | | ErnieB0 -
Does RogerBot read URL wildcards in robots.txt
I believe that the Google and Bing crawlbots understand wildcards for the "disallow" URL's in robots.txt - does Roger?
Technical SEO | | AspenFasteners0 -
Query String Redirection
In PHP, I'm wanting to store a session variable based upon a link that's clicked. I'm wanting to avoid query strings on pages that have content. My current workaround is to have a link with query strings to a php file that does nothing but snags the variables via $_GET, stores them into $_SESSION, and then redirects. For example, consider this script, that I have set up to force to a mobile version. Accessed via something like a href="forcemobile.php?url=(the current filename)" session_start(); //Location of vertstudios file on your localhost. Include trailing slash $loc = "http://localhost/web/vertstudios/"; //If GET variable not defined, this page is being accessed directly. //In that case, force to 404 page. Same case for if mobile session variable //not defined. if(!(isset($_GET["url"]) && isset($_SESSION["mobile"]))){ header("Location: http://www.vertstudios.com/404.php"); exit(); } //Snag the URL $url = $_GET["url"]; //Set the mobile session to true, and redirect to specified URL $_SESSION["mobile"] = true;header("Location: " . $loc . $url); ?> Will this circumvent the issue caused by using query strings?
Technical SEO | | JoeQuery0