Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
What is the best method to block a sub-domain, e.g. staging.domain.com/ from getting indexed?
-
Now that Google considers subdomains as part of the TLD I'm a little leery of testing robots.txt with something like:
staging.domain.com
User-agent: *
Disallow: /in fear it might get the www.domain.com blocked as well. Has anyone had any success using robots.txt to block sub-domains? I know I could add a meta robots tag to the staging.domain.com pages but that would require a lot more work.
-
Just make sure that when/if you copy over the staging site to the live domain that you don't copy over the robots.txt, htaccess, or whatever means you use to block that site from being indexed and thus have your shiny new site be blocked.
-
I agree. The name of your subdomain being "staging" didn't register at all with me until Matt brought it up. I was offering a generic response to the subdomain question whereas I believe Matt focused on how to handle a staging site. Interesting viewpoint.
-
Matt/Ryan-
Great discussion, thanks for the input. The staging.domain.com is just one of the domains we don't want indexed. Some of them still need to be accessed by the public, some like staging could be restricted to specific IPs.
I realize after your discussion I probably should have used a different example of a sub-domain. On the other hand it might not have sparked the discussion so maybe it was a good example
-
.htaccess files can be placed at any directory level of a site so you can do it for just the subdomain or even just a directory of a domain.
-
Staging URL's are typically only used for testing so rather than do a deny I would recommend using a specific ALLOW for only the IP addresses that should be allowed access.
I would imagine you don't want it indexed because you don't want the rest of the world knowing about it.
You can also use HTACCESS to use username/passwords. It is simple but you can give that to clients if that is a concern/need.
-
Correct.
-
Toren, I would not recommend that solution. There is nothing to prevent Googlebot from crawling your site via almost any IP. If you found 100 IPs used by the crawler and blocked them all, there is nothing to stop the crawler from using IP #101 next month. Once the subdomain's content is located and indexed, it will be a headache fixing the issue.
The best solution is always going to be a noindex meta tag on the pages you do not wish to be indexed. If that method is too much work or otherwise undesirable, you can use the robots.txt solution. There is no circumstance I can imagine where you would modify your htaccess file to block googlebot.
-
Hi Matt.
Perhaps I misunderstood the question but I believe Toren only wishes to prevent the subdomain from being indexed. If you restrict subdomain access by IP it would prevent visitors from accessing the content which I don't believe is the goal.
-
Interesting, hadn't thought of using htaccess to block Googlebot.Thanks for the suggestion.
-
Thanks Ryan. So you don't see any issues with de-indexing the main site if I created a second robots.txt file, e.g.
http://staging.domin.com/robots.txt
User-agent: *
Disallow: /That was my initial thought but when Google announced they consider sub-domains part of the TLD I was afraid it might affect the htp://www.domain.com versions of the pages. So you're saying the subdomain is basically treated like a folder you block on the primary domain?
-
Use an .htaccess file to only allow from certain ip addresses or ranges.
Here is an article describing how: http://www.kirupa.com/html5/htaccess_tricks.htm
-
What is the best method to block a sub-domain, e.g. staging.domain.com/ from getting indexed?
Place a robots.txt file in the root of the subdomain.
User-agent: *
Disallow: /This method will block the subdomain while leaving your primary domain unaffected.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Google is still indexing the old domain a year after 301 redirects are put in place
Hi there, You might have experienced this before but for me this is the first. A client of mine moved from domain A (www.domainA.com) to domain B (www.domainB.com). 301 redirects are all in place for over a year. But the old domain is still showing in Google when you search for "site:domainA.com" The HTTP Header check shows this result for the URL https://www.domainA.com/company/cookie-policy.aspx HTTP/1.1 301 Moved Permanently =>
Technical SEO | | iQi
Cache-Control => private
Content-Length => 174
Content-Type => text/html; charset=utf-8
Location => https://www.domain_B_.com/legal/cookie-policy
Server => Microsoft-IIS/10.0
X-AspNetMvc-Version => 5.2
X-AspNet-Version => 4.0.30319
X-Powered-By => ASP.NET
Date => Fri, 15 Mar 2019 12:01:33 GMT
Connection => close Does the redirect look wrong? The change of address request was made on Google Console when the website was moved over a year ago. Edit: Checked the domainA.com on bing and it seems that its not indexed, and replaced with domainB.com, which is the right. Just Google is indexing the old domain! Please let me know your thoughts on why this is happening. Best,0 -
How can I get a photo album indexed by Google?
We have a lot of photos on our website. Unfortunately most of them don't seem to be indexed by Google. We run a party website. One of the things we do, is take pictures at events and put them on the site. An event page with a photo album, can have anywhere between 100 and 750 photo's. For each foto's there is a thumbnail on the page. The thumbnails are lazy loaded by showing a placeholder and loading the picture right before it comes onscreen. There is no pagination of infinite scrolling. Thumbnails don't have an alt text. Each thumbnail links to a picture page. This page only shows the base HTML structure (menu, etc), the image and a close button. The image has a src attribute with full size image, a srcset with several sizes for responsive design and an alt text. There is no real textual content on an image page. (Note that when a user clicks on the thumbnail, the large image is loaded using JavaScript and we mimic the page change. I think it doesn't matter, but am unsure.) I'd like that full size images should be indexed by Google and found with Google image search. Thumbnails should not be indexed (or ignored). Unfortunately most pictures aren't found or their thumbnail is shown. Moz is giving telling me that all the picture pages are duplicate content (19,521 issues), as they are all the same with the exception of the image. The page title isn't the same but similar for all images of an album. Example: On the "A day at the park" event page, we have 136 pictures. A site search on "a day at the park" foto, only reveals two photo's of the albums. 3QolbbI.png QTQVxqY.jpg mwEG90S.jpg
Technical SEO | | jasny0 -
Is it better to use XXX.com or XXX.com/index.html as canonical page
Is it better to use 301 redirects or canonical page? I suspect canonical is easier. The question is, which is the best canonical page, YYY.com or YYY.com/indexhtml? I assume YYY.com, since there will be many other pages such as YYY.com/info.html, YYY.com/services.html, etc.
Technical SEO | | Nanook10 -
Getting a video displaying a lightbox indexed
We have created a video for a category page with the goal of building links to the page and improving the conversion rate of visitors to the page. This category is Christmas oriented so we want to get the video dropped in ASAP. Unfortunately there was a mixup with our developer and he created a lightbox pop-up to display the video on the category page. I'm concerned this will hurt our ability to get the video indexed in Google. Here was his response. Is what he says here true? "With the video originally being in lightbox the iFrame Embed was enough since the video can't be on the page, it would have to be hidden on the page which is ignored by Google. The SEO would be derived from modifying the video sitemap to define the category page as the HTML page for the Wistia video and Google will make the association. The sitemap did all the heavy lifting, the schema markup did not come till later so it had no additional affect on Google other then to re-enforce the sitemap." Thanks for your help!
Technical SEO | | GManSEO0 -
Domains
My questions is what to do with old domains we own from a past business. Is it advantages to direct them to the new domain/company or is that going to cause a problem for the new company. They are not in the same industry.
Technical SEO | | KeylimeSocial0 -
I can buy a domain from a competitor. Whats the best way to make good use of these links for my existing website
I can buy a domain from a competitor. Whats the best way to make good use of these links for my existing website
Technical SEO | | Archers0 -
.com or .co.uk in UK index? but the .com has higher domain authority...
Hi there i have a .com and a .co.uk for a site that has been around a while. However not much seo has been done on it, i was wonderign do i continue to optimise for the .com or switch to the .co.uk to rank in Google UK index for various search terms. .COM = 40 domain authority .CO.UK - 10 domain authority. Let the debate start 🙂
Technical SEO | | pauledwards0 -
How do I redirect index.html to the root / ?
The site I've inherited had operated on index.html at one point, and now uses index.php for the home page, which goes to the / page. The index.html was lost in migrating server hosts. How do I redirect the index.html to the / page? I've tried different options that keep giving ending up with the same 404 error. I tried a redirect from index.html to index.php which ended in an infinite loop. Because the index.html no longer exists in the root, should I created it and then add a redirect to it? Can I avoid this by editing the .htaccess? Any help is appreciated, thanks in advance!
Technical SEO | | NetPicks0