Rogerbot directives in robots.txt
-
I feel like I spend a lot of time marking false positives in my reports to be ignored.
Can I prevent Rogerbot from crawling pages I don't care about with robots.txt directives? For example, I have some page types with meta noindex, and Rogerbot still reports these to me. In theory, I could block Rogerbot from those pages with a robots.txt directive and not have to deal with the false positives.
-
Yes, you can definitely use the robots.txt file to prevent Rogerbot from crawling pages that you don’t want to include in your reports. This approach can help you manage and minimize false positives effectively.
To block specific pages or directories from being crawled, you would add directives to your robots.txt file. For example, if you have certain page types that you’ve already set with meta noindex, you can specify rules like this:
User-agent: Rogerbot
Disallow: /path-to-unwanted-page/
Disallow: /another-unwanted-directory/
This tells Rogerbot not to crawl the specified paths, which should reduce the number of irrelevant entries in your reports.
However, keep in mind that while robots.txt directives can prevent crawling, they do not guarantee that these pages won't show up in search results if they are linked from other sites or indexed by different bots.
Additionally, using meta noindex tags is still a good practice for pages that may occasionally be crawled but shouldn’t appear in search results. Combining both methods—robots.txt for crawling and noindex for indexing—provides a robust solution to manage your web presence more effectively.
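If you want a quick local check of which URLs a given set of rules actually blocks, Python's standard-library robots.txt parser can serve as a rough test harness. This is a minimal sketch with a placeholder domain and paths, not Moz tooling; note that RobotFileParser implements the original robots exclusion standard, so wildcard rules may not be evaluated exactly the way Rogerbot evaluates them.

```python
# Minimal sketch: fetch a live robots.txt and ask which URLs are blocked
# for the "rogerbot" user agent. The domain and paths are placeholders.
# Caveat: urllib.robotparser follows the original robots exclusion standard,
# so "*" wildcards in Disallow rules may not match exactly as Rogerbot does.
from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.set_url("https://www.yourwebsite.com/robots.txt")
rp.read()  # downloads and parses the file

test_urls = [
    "https://www.yourwebsite.com/path-to-unwanted-page/",
    "https://www.yourwebsite.com/important-page/",
]

for url in test_urls:
    status = "allowed" if rp.can_fetch("rogerbot", url) else "blocked"
    print(f"{url} -> {status} for rogerbot")
```

Running a check like this before and after editing robots.txt makes it easy to confirm that only the intended paths changed status.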
-
Never mind, I found this. https://moz.com/help/moz-procedures/crawlers/rogerbot
-
@awilliams_kingston
Yes, you can use robots.txt directives to prevent Rogerbot from crawling certain pages or sections of your site, which can help reduce the number of false positives in your reports. By doing so, you can focus Rogerbot's attention on the parts of your site that matter most to you and avoid having issues reported on pages you don't care about. Here's a basic outline of how to use robots.txt to block Rogerbot:
1. Locate or Create Your robots.txt File: This file should be placed in the root directory of your website (e.g., https://www.yourwebsite.com/robots.txt).
2. Add Directives to Block Rogerbot: You'll need to specify the user-agent for Rogerbot and define which pages or directories to block. The User-agent directive specifies which web crawlers the rules apply to, and Disallow directives specify the URLs or directories to block.
Here’s an example of what your robots.txt file might look like if you want to block Rogerbot from crawling certain pages:
User-agent: Rogerbot
Disallow: /path-to-block/
Disallow: /another-path/
If you want to block Rogerbot from accessing pages with certain parameters or patterns, you can use wildcards:
User-agent: Rogerbot
Disallow: /path-to-block/*
Disallow: /another-path/?parameter=
3. Verify the Changes: After updating the robots.txt file, you can use tools like Google Search Console or other site analysis tools to check that the directives are being applied as expected (a quick command-line sanity check is sketched after this list).
4. Monitor and Adjust: Keep an eye on your reports and site performance to ensure that blocking these pages is achieving the desired effect without inadvertently blocking important pages.
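As a quick sanity check before relying on third-party tools, you can also fetch the live file directly and confirm the directives you just added are actually being served. Here is a minimal sketch; the URL is a placeholder for your own domain.

```python
# Minimal sketch: fetch the deployed robots.txt and print the rule lines,
# so a missing upload or a typo in a directive stands out immediately.
# The URL below is a placeholder for your own site.
import urllib.request

with urllib.request.urlopen("https://www.yourwebsite.com/robots.txt") as resp:
    print("HTTP status:", resp.status)
    body = resp.read().decode("utf-8", errors="replace")

for line in body.splitlines():
    stripped = line.strip().lower()
    if stripped.startswith(("user-agent", "disallow", "allow", "sitemap")):
        print(line)
```

If the file returns anything other than a 200 status, or the Rogerbot group is missing from the output, the new rules are not in effect yet.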
By doing this, you should be able to reduce the number of irrelevant or false positive issues reported by Rogerbot and make your reporting more focused and useful.