Regular Expressions for Filtering BOT Traffic?
-
I've set up a filter to remove bot traffic from Analytics. I relied on a regular expression posted in an article, and it eliminates what appears to be most of them.
However, there are other bots I would like to filter but I'm having a hard time determining the regular expressions for them.
How do I determine what the regular expression is for additional bots so I can apply them to the filter?
I read an Analytics "how to" but it's over my head and I'm hoping for some "dumbed down" guidance.
-
No problem, feel free to reach out if you have any other RegEx related questions.
Regards,
Chris
-
I will definitely do that for Rackspace bots, Chris.
Thank you for taking the time to walk me through this and tweak my filter.
I'll give the site you posted a visit.
-
If you copy and paste my RegEx, it will filter out the rackspace bots. If you want to learn more about Regular Expressions, here is a site that explains them very well, though it may not be quite kindergarten speak.
-
Crap.
Well, I guess the vernacular is what I need to know.
Knowing what to put where is the trick, isn't it? Is there a dummies guide somewhere that spells this out in kindergarten speak?
I could really see myself botching this filtering business.
-
Not unless there's a . after the word servers in the name. The \ is escaping the . at the end of stumbleupon inc.
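To see what the backslash changes, here is a small sketch. It uses Python's `re` module as a stand-in for the Analytics filter engine, which is an assumption, and the second sample string is a made-up near miss:

```python
import re

# Unescaped: the dot is a wildcard and matches any single character.
unescaped = re.compile(r"^stumbleupon inc.$")
# Escaped: the backslash makes the dot match only a literal period.
escaped = re.compile(r"^stumbleupon inc\.$")

# The first name is from the thread; the second is a made-up near miss.
samples = ["stumbleupon inc.", "stumbleupon incx"]

unescaped_hits = [s for s in samples if unescaped.search(s)]
escaped_hits = [s for s in samples if escaped.search(s)]

print(unescaped_hits)  # both samples match
print(escaped_hits)    # only the name that really ends in a period matches
```

A name with no period in it, like rackspace cloud servers, has nothing to escape.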
-
Does it need the \. before the )?
-
Ok, try this:
^(microsoft corp|inktomi corporation|yahoo! inc\.|google inc\.|stumbleupon inc\.|rackspace cloud servers)$|gomez
Just added rackspace as another match, it should work if the name is exactly right.
Hope this helps,
Chris
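If you want to check the pattern before saving the filter, a rough test along these lines can help. It is only a sketch: Python's `re` module stands in for the Analytics engine, the periods are escaped so they match literally, case-insensitive matching is an assumption, and every provider name except rackspace cloud servers is made up.

```python
import re

# Pattern from the post, with the periods escaped so they match literally.
BOT_ISPS = re.compile(
    r"^(microsoft corp|inktomi corporation|yahoo! inc\.|google inc\.|"
    r"stumbleupon inc\.|rackspace cloud servers)$|gomez",
    re.IGNORECASE,  # assumption: the filter ignores case
)

def is_bot_isp(service_provider: str) -> bool:
    """Return True when the service provider name matches the filter pattern."""
    return BOT_ISPS.search(service_provider) is not None

# "rackspace cloud servers" comes from the thread; the other names are made up.
checks = {
    name: is_bot_isp(name)
    for name in (
        "rackspace cloud servers",
        "rackspace cloud servers ltd",  # fails: ^...$ requires the exact name
        "gomez networks",               # matches: gomez sits outside the ^...$ group
        "example broadband",
    )
}
for name, matched in checks.items():
    print(f"{name!r}: {matched}")
```

The second check shows why the name has to be exactly right: anything extra before or after it breaks the match.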
-
Agreed! That's why I suggest using it in combination with the variables you mentioned above.
-
rackspace cloud servers
Maybe my problem is I'm not looking in the right place.
I'm in audience>technology>network and the column shows "service provider."
-
How is it titled in the ISP report exactly?
-
For example,
Since I implemented the filter four days ago, rackspace cloud servers have visited my site 848 times, visited 1 page each time, spent 0 seconds on the page and bounced 100% of the time.
What is the reg expression for rackspace?
-
Time on page can be a tricky one because sometimes actual visits can record 00:00:00 due to the way it is measured. I'd recommend using other factors like the ones I mentioned above.
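One way to picture "other factors" is to require several signals at once instead of trusting time on page alone. The sketch below is illustrative only: the field names, the `min_visits` threshold, and the two "example" rows are assumptions; only the rackspace figures come from this thread.

```python
from dataclasses import dataclass

@dataclass
class ProviderRow:
    service_provider: str
    visits: int
    pages_per_visit: float
    avg_seconds_on_page: float
    bounce_rate: float  # 0.0 to 1.0

def looks_like_bot(row: ProviderRow, min_visits: int = 100) -> bool:
    """Flag a provider only when every signal points the same way."""
    return (
        row.visits >= min_visits          # hypothetical volume threshold
        and row.pages_per_visit <= 1.0
        and row.bounce_rate >= 1.0
        and row.avg_seconds_on_page == 0
    )

rows = [
    ProviderRow("rackspace cloud servers", 848, 1.0, 0.0, 1.0),  # figures quoted in the thread
    ProviderRow("example broadband", 3, 1.0, 0.0, 1.0),          # made up: too few visits to judge
    ProviderRow("example cable co", 500, 2.4, 65.0, 0.4),        # made up: ordinary visitors
]

suspects = [r.service_provider for r in rows if looks_like_bot(r)]
print(suspects)
```

A real visitor can record 00:00:00 on a single-page visit, which is why the sketch also asks for high volume before flagging a provider.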
-
"...a combination of operating system, location, and some other factors can do the trick."
Yep, combined with those, look for "Avg. Time on Page = 00:00:00"
-
Ok, can you provide some information on the bots that are getting through this that you want to sort out? If they can be filtered by ISP organization like the ones in your current RegEx, you can simply add them to the list: (microsoft corp| ... ... |stumbleupon inc\.|ispnamefromyourbots|ispname2|etc.)$|gomez
Otherwise, you might need to get creative and find another way to isolate them (a combination of operating system, location, and some other factors can do the trick). When adding to the list, make sure to escape special characters like . or / by using a \ before them, or else your RegEx will fail.
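If hand-escaping feels error-prone, the list can be assembled by a few lines of code. This is a minimal sketch, not anything Analytics provides: `escape_name` and `build_filter` are hypothetical helpers, and Python's `re` module is used only to sanity-check the result.

```python
import re

# Characters that get a backslash in front of them (includes . and /).
SPECIAL = set(r".^$*+?()[]{}|\/")

def escape_name(name: str) -> str:
    """Put a backslash before each special character in an ISP name."""
    return "".join("\\" + ch if ch in SPECIAL else ch for ch in name)

def build_filter(exact_names, partial_names=()):
    """Join exact-match names inside ^( ... )$ and append partial matches after it."""
    exact = "|".join(escape_name(n) for n in exact_names)
    parts = [f"^({exact})$"] + [escape_name(n) for n in partial_names]
    return "|".join(parts)

pattern = build_filter(
    ["microsoft corp", "yahoo! inc.", "rackspace cloud servers"],
    partial_names=["gomez"],
)
print(pattern)

# Sanity check: the escaped period matches a real period and nothing else.
matches_real_name = re.search(pattern, "yahoo! inc.") is not None
matches_near_miss = re.search(pattern, "yahoo! incx") is not None
print(matches_real_name, matches_near_miss)
```

The printed pattern can then be pasted into the filter field; new ISP names only need to be added to the Python list.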
-
Sure. Here's the post for filtering the bots.
Here's the regex posted: ^(microsoft corp|inktomi corporation|yahoo! inc\.|google inc\.|stumbleupon inc\.)$|gomez
-
If you give me an idea of how you are isolating the bots I might be able to help come up with a RegEx for you. What is the RegEx you have in place to sort out the other bots?
Regards,
Chris