Site with 2 domains - 1 domain SEO opimised & 1 is not. How best to handle crawlers?
-
Situation:
I have a dual domain site:
Domain 1 - www.domain.com is SEO optimised with product pages and should of course be indexed.
Domain 2 - secure.domain.com is not SEO optimised and simply has checkout and payment gateway pages.I've discovered that Moz automatically crawls Domain 2 - the secure.domain.com site and consequently picks up hundreds of errors.
I have put an end to this by adding a robots.txt to stop rogerbot and dotbot (mozs crawlers) from crawling domain 2. This fixes my errors in Moz reports however after doing more research into 'Crawler Control' I figure this might be the best option.
My Question:
Instead of using robots.txt to stop moz from crawing all of Domain 2 should I use on each page of domain 2?
I believe this would then allow moz and google to crawl Domain 2 but also tell them both not to index it.
My understanding is that this would be best, and might even help my overall SEO by telling google not to give any SEO value to the Domain 2 pages? -
Hello!
I can answer this from a Google / SEO perspective (a non-moz tool perspective).
First you want to be sure the secure subdomain content is not indexed.
-
If the secure subdomain is NOT indexed, leave the robotos.txt crawl blocking in place. You don't want and don't need Google crawling secure pages and payment pages. Just be sure they truly all are private pages. If they are NOT indxed, the crawl block is best - this will prevent google from crawling, and if they can't crawl they can't index.
-
If the secure pages ARE indexed
-
remove the robots.txt crawl block.
-
Add meta noindex on all the pages
-
Wait for them to be noindexed (removed from google)
-
Then, block them from being crawled with robots.txt - which will prevent them from being crawled, and thus prevent them from being indexed as well.
-
-
Hey, Dave here from the Help Team!
Jumping in to answer the technical question, you can definitely use the meta robots tag instead of a disallow directive in your robots.txt file. I would like to point out that Meta Noindex is something we report in Site Crawl so you would see an influx in that issue category but you can mark them as "ignored" as you see fit.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Unsolved Crawler was not able to access the robots.txt
I'm trying to setup a campaign for jessicamoraninteriors.com and I keep getting messages that Moz can't crawl the site because it can't access the robots.txt. Not sure why, other crawlers don't seem to have a problem and I can access the robots.txt file from my browser. For some additional info, it's a SquareSpace site and my DNS is handled through Cloudflare. Here's the contents of my robots.txt file: # Squarespace Robots Txt User-agent: GPTBot User-agent: ChatGPT-User User-agent: CCBot User-agent: anthropic-ai User-agent: Google-Extended User-agent: FacebookBot User-agent: Claude-Web User-agent: cohere-ai User-agent: PerplexityBot User-agent: Applebot-Extended User-agent: AdsBot-Google User-agent: AdsBot-Google-Mobile User-agent: AdsBot-Google-Mobile-Apps User-agent: * Disallow: /config Disallow: /search Disallow: /account$ Disallow: /account/ Disallow: /commerce/digital-download/ Disallow: /api/ Allow: /api/ui-extensions/ Disallow: /static/ Disallow:/*?author=* Disallow:/*&author=* Disallow:/*?tag=* Disallow:/*&tag=* Disallow:/*?month=* Disallow:/*&month=* Disallow:/*?view=* Disallow:/*&view=* Disallow:/*?format=json Disallow:/*&format=json Disallow:/*?format=page-context Disallow:/*&format=page-context Disallow:/*?format=main-content Disallow:/*&format=main-content Disallow:/*?format=json-pretty Disallow:/*&format=json-pretty Disallow:/*?format=ical Disallow:/*&format=ical Disallow:/*?reversePaginate=* Disallow:/*&reversePaginate=* Any ideas?
Getting Started | | andrewrench0 -
Best use of Moz Pro
Hello, I want to ask what are the benefits of Moz Pro other than finding low competitive keywords? I want to know the best use of Moz Pro for betterment of my site (chicken grill recipe). I want the best possible result. Guide me. Thanks.
Getting Started | | maxcharles0 -
What is a Good Domain Authority Score?
I am wondering what is a Good Domain Authority Score? This is my website https://gopaintsprayer.com currently DA and PA are quite high, but the traffic is decreasing, I don't understand the reason, you can help. Thank you !
Getting Started | | gogoanimetp0 -
Moz not able to crawl our site - any advice?
When I try and crawl our site through Moz it gives this message: Moz was unable to crawl your site on Aug 7, 2019. Our crawler was banned by a page on your site, either through your robots.txt, the X-Robots-Tag HTTP header, or the meta robots tag. Update these tags to allow your page and the rest of your site to be crawled. If this error is found on any page on your site, it prevents our crawler (and some search engines) from crawling the rest of your site. Typically errors like this should be investigated and fixed by the site webmaster. I have been through all the help and doesn't seem to be any issues. You can check the site and robots.txt here: https://myfamilyclub.co.uk/robots.txt. Anyone got any advice on where I could go to get this sorted?
Getting Started | | MyFamilClubLtd1 -
My website does not allow all crawler to crawl, Now my question is that whether i need to give permission to moz crawler if yes then whaat is moz bot name?
My website does not permit all crawler to crawl website. Whether ii need to give permission to moz bot to crawl website or not? If yes what is the moz bot name?
Getting Started | | irteam0 -
How to authenticate Moz crawler so that others don't use Rogerbot useragent to scrape data from our site?
Is there any way to authenticate genuine Moz crawler. Because, our website keeps getting scrapping attacks and if there is no way to authenticate Moz crawler, then, any scraper can just set user agent as Rogerbot and scrape all our pages. Is there a fixed IP that can be used or any other customization that will help us authenticate and allow only Moz crawler to crawl our site. Looking forward to a solution to this problem. We haven't been able to use Moz crawler due to this issue.
Getting Started | | longclimber0 -
I am managing an existing ecommerce website and just subscribed to the MOZ tools - what is the best rout to learning how bets leverage all the tools to optimize my site?
I am managing an existing ecommerce website and just subscribed to the MOZ tools - what is the best rout to learning how bets leverage all the tools to optimize my site?
Getting Started | | DiveNSail0 -
How to get moz to crawl a staging domain that is blocked by robots.txt
Is it possible to get Moz to do a crawl report on a domain blocked by robots.txt and actually display all errors instead of only one saying the domain was blocket in robots.txt? Anything i can add to robots.txt to make moz able to do the crawl report but still hinder google from crawling a staging domain?
Getting Started | | classifiedtech0