Have a Robots.txt Issue
-
I have a robots.txt file error that is causing me loads of headaches and is making my website fall off the SE grid. on MOZ and other sites its saying that I blocked all websites from finding it. Could it be as simple as I created a new website and forgot to re-create a robots.txt file for the new site or it was trying to find the old one? I just created a new one.
Google's website still shows in the search console that there are severe health issues found in the property and that it is the robots.txt is blocking important pages. Does this take time to refresh? Is there something I'm missing that someone here in the MOZ community could help me with?
-
Hi primemediaconsultants!
Did this get cleared up?
-
You not always have to do this, if you would go to domain.com/robots.txt then it should be removed maybe already. If that's the case you should be starting to see an increase in the number of pages crawled in Google Search Console.
-
This seems very helpful as I did remove it, and fetch as google, but i'm a complete novice. How do you clear server cache?
-
What does your robots.txt file contain? (or share the link)
Try removing it, clearing server cache and fetching as google again.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Panda, rankings and other non-sense issues
Hello everyone I have a problem here. My website has been hit by Panda several times in the past, the first time back in 2011 (first Panda ever) and then another couple of times since then, and, lastly, the last June 2016 (either Panda or Phantom, not clear yet). In other words, it looks like my website is very prone to "quality" updates by big G: http://www.virtualsheetmusic.com/ Still trying to understand how to get rid of Panda related issues once for all after so many years of tweaking and cleaning my website of possible duplicate or thin content (301 redirects, noindexed pages, canonicals, etc), and I have tried everything, believe me. You name it. We recovered several times though, but once in a while, we are still hit by that damn animal. It really looks like we are in the so called "grey" area of Panda, where we are "randomly" hit by it once in a while. Interestingly enough, some of our competitors live joyful lives, at the top of the rankings, without caring at all about Panda and such, and I can't really make a sense of it. Take for example this competitors of ours: http://8notes.com They have a much smaller catalog than ours, worse quality of offered music, thousands of duplicate pages, ads everywhere, and yet... they are able to rank 1st on the 1st page of Google for most of our keywords. And for most, I mean, 99.99% of them. Take for example "violin sheet music", "piano sheet music", "classical sheet music", "free sheet music", etc... they are always first. As I said, they have a much smaller website than ours, with a much smaller offering than ours, their content quality is questionable (not cured by professional musicians, and highly sloppy done content as well as design), and yet they have over 480,000 pages indexed on Google, mostly duplicate pages. They don't care about canonicals to avoid duplicate content, 301s, noindex, robot tags, etc, nor to add text or user reviews to avoid "thin content" penalties... they really don't care about anything of that, and yet, they rank 1st. So... to all the experts out there, my question is: Why's that? What's the sense or the logic beyond that? And please, don't tell me they have a stronger domain authority, linking root domains, etc. because according to the duplicate and thin issues I see on that site, nothing can justify their positions in my opinion and, mostly, I can't find a reason why we instead are so much penalized by Panda and such kind of "quality" updates when they are released, whereas websites like that one (8notes.com) rank 1st making fun of all the mighty Panda all year around. Thoughts???!!!
Intermediate & Advanced SEO | | fablau0 -
Issue with Site Map - how critical would you rank this in terms of needing a fix?
A problem has been introduced onto our sitemap whereby previously excluded URLs are no longer being correctly excluded. These are returning a HTTP 400 Bad Request server response, although do correctly redirect to users. We have around 2300 pages of content, and around 600-800 of these previously excluded URLs, An example would be http://www.naturalworldsafaris.com/destinations/africa-and-the-indian-ocean/botswana/suggested-holidays/botswana-classic-camping-safari/Dates and prices.aspx (the page does correctly redirect to users). The site is currently being rebuilt and only has a life span of a few months. The cost our current developers have given us for resolving this is quite high with this in mind. I was just wondering: How much of a critical issue would you view this? Would it be sufficient (bearing in mind this is an interim measure) to change these pages so that they had a canonical or a redirect - they would however remain on the sitemap. Thanks
Intermediate & Advanced SEO | | KateWaite
Kate0 -
Question about Syntax in Robots.txt
So if I want to block any URL from being indexed that contains a particular parameter what is the best way to put this in the robots.txt file? Currently I have-
Intermediate & Advanced SEO | | DRSearchEngOpt
Disallow: /attachment_id Where "attachment_id" is the parameter. Problem is I still see these URL's indexed and this has been in the robots now for over a month. I am wondering if I should just do Disallow: attachment_id or Disallow: attachment_id= but figured I would ask you guys first. Thanks!0 -
How to Avoid Duplicate Content Issues with Google?
We have 1000s of audio book titles at our Web store. Google's Panda de-valued our site some time ago because, I believe, of duplicate content. We get our descriptions from the publishers which means a good
Intermediate & Advanced SEO | | lbohen
deal of our description pages are the same as the publishers = duplicate content according to Google. Although re-writing each description of the products we offer is a daunting, almost impossible task, I am thinking of re-writing publishers' descriptions using The Best Spinner software which allows me to replace some of the publishers' words with synonyms. I have re-written one audio book title's description resulting in 8% unique content from the original in 520 words. I did a CopyScape Check and it reported "65 duplicates." CopyScape appears to be reporting duplicates of words and phrases within sentences and paragraphs. I see very little duplicate content of full sentences
or paragraphs. Does anyone know whether Google's duplicate content algorithm is the same or similar to CopyScape's? How much of an audio book's description would I have to change to stay away from CopyScape's duplicate content algorithm? How much of an audio book's description would I have to change to stay away from Google's duplicate content algorithm?0 -
About robots.txt for resolve Duplicate content
I have a trouble with Duplicate content and title, i try to many way to resolve them but because of the web code so i am still in problem. I decide to use robots.txt to block contents that are duplicate. The first Question: How do i use command in robots.txt to block all of URL like this: http://vietnamfoodtour.com/foodcourses/Cooking-School/
Intermediate & Advanced SEO | | magician
http://vietnamfoodtour.com/foodcourses/Cooking-Class/ ....... User-agent: * Disallow: /foodcourses ( Is that right? ) And the parameter URL: h
ttp://vietnamfoodtour.com/?mod=vietnamfood&page=2
http://vietnamfoodtour.com/?mod=vietnamfood&page=3
http://vietnamfoodtour.com/?mod=vietnamfood&page=4 User-agent: * Disallow: /?mod=vietnamfood ( Is that right? i have folder contain module, could i use: disallow:/module/*) The 2nd question is: Which is the priority " robots.txt" or " meta robot"? If i use robots.txt to block URL, but in that URL my meta robot is "index, follow"0 -
Do I need to disallow the dynamic pages in robots.txt?
Do I need to disallow the dynamic pages that show when people use our site's search box? Some of these pages are ranking well in SERPs. Thanks! 🙂
Intermediate & Advanced SEO | | esiow20130 -
Effect duration of robots.txt file.
in my web site there is demo site in that also, index in Google but no need it now.so i have created robots file and upload to server yesterday.in the demo folder there are some html files,and i wanna remove all these in demo file from Google.but still in web master tools it showing User-agent: *
Intermediate & Advanced SEO | | innofidelity
Disallow: /demo/ How long this will take to remove from Google ? And are there any alternative way doing that ?0 -
Duplicate Content Issue
Why do URL with .html or index.php at the end are annoying to the search engine? I heard it can create some duplicate content but I have no idea why? Could someone explain me why is that so? Thank you
Intermediate & Advanced SEO | | Ideas-Money-Art0