How do I disallow crawl on a directory when it's a prefix to my site's URL?
-
I am trying to stop robots from crawling our media repository. It is hosted elsewhere but appears as part of our site. It is not a subdirectory of the site, though; it's a prefix (a subdomain).
So I need to disallow: mediabank.mywebsite.org
Not: mywebsite.org/mediabank
What would I need to put in my robots.txt and/or the other host's robots.txt to make this happen?
Thanks!
-
Hey there! Tawny from Moz's Help Team here.
You'll want to add a robots.txt file for that subdomain, and then add a Disallow directive to that file. Crawlers look up robots.txt separately for each hostname, so the rules have to live on the subdomain itself rather than in your main site's robots.txt. Using your example, you'd want a file at mediabank.mywebsite.org/robots.txt with a Disallow directive for any robots you don't want crawling that subdomain.
For all user-agents, that would look something like this:
User-agent: *
Disallow: /
That would stop any user-agents from crawling any pages on that subdomain.
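If you'd like to sanity-check the rules before uploading them, here is a minimal sketch using Python's standard urllib.robotparser module. The ROBOTS_BY_HOST table and the is_allowed helper are hypothetical names made up for illustration, and the main site's rules shown here are only an assumption (an empty Disallow, meaning nothing is blocked). It mimics the per-hostname lookup a crawler does; it is not how any particular crawler is implemented.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, keyed by hostname. Each host serves its
# own file, so the subdomain's rules do not affect the main site.
ROBOTS_BY_HOST = {
    # The rules from the answer above: block everything on the subdomain.
    "mediabank.mywebsite.org": "User-agent: *\nDisallow: /\n",
    # Assumed for illustration: the main site blocks nothing.
    "mywebsite.org": "User-agent: *\nDisallow:\n",
}


def is_allowed(url: str, user_agent: str = "*") -> bool:
    """Return True if user_agent may fetch url under its host's rules."""
    host = urlsplit(url).hostname
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(ROBOTS_BY_HOST[host].splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://mediabank.mywebsite.org/images/logo.png",
        "https://mywebsite.org/about",
    ):
        print(url, "allowed" if is_allowed(url) else "blocked")
```

The example paths (/images/logo.png and /about) are placeholders. Keep in mind that robots.txt is only honored by crawlers that choose to respect it.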
I hope this helps! If you've still got questions, feel free to send us a note at help@moz.com and we'll do our best to sort things out for you.
-
Hi,
Please check this old thread on the same topic @ https://moz.com/community/q/block-an-entire-subdomain-with-robots-txt
Thanks