Robots.txt Showing in SERP Results
-
I'm currently doing a technical audit for a website, and when I search "site:website.com -www" the only result is website.com/robots.txt.
I was wondering if anyone else has come across this before, or what this may mean from a technical audit standpoint.
Thank you!
-
Nonsense. Search for https://www.google.com.au/search?q=inurl%3Arobots.txt&pws=0
Some of the first results with visible robots.txt I see are:
I refuse to believe that "something is seriously wrong" with any of these sites.
-
It's quite common (and also rather odd) for Google to index robots.txt files. Check out all of these robots.txt files:
https://www.google.com/search?q=inurl%3Arobots.txt&pws=0&gl=us
So it's nothing to be alarmed by. Your particular query, "site:website.com -www", only shows indexed pages without the "www", so this result just says that all of the other indexed pages most likely begin with www. The exception, of course, is the robots.txt file.
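If you have a list of the site's indexed URLs (from a crawl export, for example), here is a rough sketch of the same split the query is doing. The URLs below are made-up placeholders and `split_by_www` is just a helper name I picked; it only looks at the hostname and is not how Google evaluates the `-www` operator.

```python
from urllib.parse import urlsplit


def split_by_www(urls):
    """Group URLs by whether their hostname starts with 'www.'."""
    www, bare = [], []
    for url in urls:
        host = (urlsplit(url).hostname or "").lower()
        (www if host.startswith("www.") else bare).append(url)
    return www, bare


# Hypothetical sample of indexed URLs for the audited site.
indexed = [
    "https://www.website.com/",
    "https://www.website.com/products",
    "https://website.com/robots.txt",
]

www_urls, bare_urls = split_by_www(indexed)
print(bare_urls)
```

If the bare list only contains robots.txt, that matches what the search is showing: everything else is indexed on the www host.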
The bigger question for me is, why does Google cache robots.txt files? Oh well.
-
The robots.txt file should not show up. Sounds like there is something seriously wrong.
-
Did you also search for site:www.website.com? Are they blocking the site? What's in the actual robots file?
Also, does this happen when you search for the site in Bing?
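To answer the "are they blocking the site" part without eyeballing the file, something like this minimal sketch can help. It uses Python's standard `urllib.robotparser`; the two sample robots.txt bodies, the `blocks_everything` helper name, and the website.com URL are placeholders, so paste in the real file contents. Note that Python's parser is only an approximation of how Googlebot reads the file.

```python
from urllib.robotparser import RobotFileParser


def blocks_everything(robots_txt, user_agent="Googlebot",
                      url="https://website.com/"):
    """Return True if the given robots.txt text disallows `url` for `user_agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Placeholder file contents: one blocks the whole site, one blocks nothing.
sample_blocking = "User-agent: *\nDisallow: /\n"
sample_open = "User-agent: *\nDisallow:\n"

print(blocks_everything(sample_blocking))
print(blocks_everything(sample_open))
```

A site-wide "Disallow: /" would be worth flagging in the audit; an empty Disallow means nothing is blocked.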