Regular Expressions for Filtering BOT Traffic?
-
I've set up a filter to remove bot traffic from Analytics. I relied on regular expressions posted in an article that eliminates what appears to be most of them.
However, there are other bots I would like to filter but I'm having a hard time determining the regular expressions for them.
How do I determine what the regular expression is for additional bots so I can apply them to the filter?
I read an Analytics "how to" but its over my head and I'm hoping for some "dumbed down" guidance.
-
No problem, feel free to reach out if you have any other RegEx related questions.
Regards,
Chris
-
I will definitely do that for Rackspace bots, Chris.
Thank you for taking the time to walk me through this and tweak my filter.
I'll give the site you posted a visit.
-
If you copy and paste my RegEx, it will filter out the rackspace bots. If you want to learn more about Regular Expressions, here is a site that explains them very well, though it may not be quite kindergarten speak.
-
Crap.
Well, I guess the vernacular is what I need to know.
Knowing what to put where is the trick isn't it? Is there a dummies guide somewhere that spells this out in kindergarten speak?
I could really see myself botching this filtering business.
-
Not unless there's a . after the word servers in the name. The . is escaping the . at the end of stumbleupon inc.
-
Does it need the . before the )
-
Ok, try this:
^(microsoft corp|inktomi corporation|yahoo! inc.|google inc.|stumbleupon inc.|rackspace cloud servers)$|gomez
Just added rackspace as another match, it should work if the name is exactly right.
Hope this helps,
Chris
-
Agreed! That's why I suggest using it in combination with the variables you mentioned above.
-
rackspace cloud servers
Maybe my problem is I'm not looking in the right place.
I'm in audience>technology>network and the column shows "service provider."
-
How is it titled in the ISP report exactly?
-
For example,
Since I implemented the filter four days ago, rackspace cloud servers have visited my site 848 times, , visited 1 page each time, spent 0 seconds on the page and bounced 100% of the time.
What is the reg expression for rackspace?
-
Time on page can be a tricky one because sometimes actual visits can record 00:00:00 due to the way it is measured. I'd recommend using other factors like the ones I mentioned above.
-
"...a combination of operating system, location, and some other factors can do the trick."
Yep, combined with those, look for "Avg. Time on Page = 00:00:00"
-
Ok, can you provide some information on the bots that are getting through this that you want to sort out? If they are able to be filtered through the ISP organization as the ones in your current RegEx, you can simply add them to the list: (microsoft corp| ... ... |stumbleupon inc.|ispnamefromyourbots|ispname2|etc.)$|gomez
Otherwise, you might need to get creative and find another way to isolate them (a combination of operating system, location, and some other factors can do the trick). When adding to the list, make sure to escape special characters like . or / by using a \ before them, or else your RegEx will fail.
-
Sure. Here's the post for filtering the bots.
Here's the reg x posted: ^(microsoft corp|inktomi corporation|yahoo! inc.|google inc.|stumbleupon inc.)$|gomez
-
If you give me an idea of how you are isolating the bots I might be able to help come up with a RegEx for you. What is the RegEx you have in place to sort out the other bots?
Regards,
Chris
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Big problems with site traffic
Hello! I have big problems with website promotion. It's been 7 months and the attendance on the site is 1-5 people a day. I do not understand the reason. Can you tell me what I'm doing wrong? Site: www.azartlist.com Many thanks.
Intermediate & Advanced SEO | | Bobic1 -
Onpage Reviews, SEO & Traffic Uplift
Hi I wondered if anyone knew of any case studies to reinforce the importance of on page reviews for SEO & increasing traffic. I'd like to push it in my company, however it would be great to show them some results from a case study. Thank you!
Intermediate & Advanced SEO | | BeckyKey1 -
Making Filtered Search Results Pages Crawlable on an eCommerce Site
Hi Moz Community! Most of the category & sub-category pages on one of our client's ecommerce site are actually filtered internal search results pages. They can configure their CMS for these filtered cat/sub-cat pages to have unique meta titles & meta descriptions, but currently they can't apply custom H1s, URLs or breadcrumbs to filtered pages. We're debating whether 2 out of 5 areas for keyword optimization is enough for Google to crawl these pages and rank them for the keywords they are being optimized for, or if we really need three or more areas covered on these pages as well to make them truly crawlable (i.e. custom H1s, URLs and/or breadcrumbs)…what do you think? Thank you for your time & support, community!
Intermediate & Advanced SEO | | accpar0 -
Normal that Home Page Generating Less than 4% Of Organic Traffic?
Greetings MOZ Community: My firm operates www.nyc-officespace-leader.com, a commercial real estate brokerage in New York City. Prior to the first Penguin update in April 2012, our home page used to receive about 10% or 600 of total organic visits. After the first Penguin was launched by Google organic traffic to the home dropped to maybe 5% or 200 visits per month. Since May of this year, it appears we have been penalized by Penguin 4.0 and are attempting to recover. Now our home page only generates about 140 organic visits per month, or less than 4% of organic traffic. Our home enjoyed good conversion rate, so this drop in traffic is a real loss. Does this very low level of traffic to the home page indicate something abnormal? Dropping from 10% to less than 4% is a major decline. Should we take specific steps regarding the home page like enhancing the content? Thanks, Alan
Intermediate & Advanced SEO | | Kingalan10 -
90% Traffic Drop...
Hey Moz Community Our team has been racking our collective brains about a 90% drop in traffic a niche client of ours has seen. The traffic drop occurred September 20th. The site averages 300-400 unique visits a month (very targeted & niche) & dropped to 30-40 uvm. The domain is on an exact match but other then that I can't see anything that would be deserving of such a significant drop in traffic. Our campaign had been performing very well & we were seeing steady gains in traffic & ranking over the long term until September 20th. I'm curious if the exact match update would have such a big impact months after it was released (July/July if I remember correctly). I also didn't think it could be such a big issue because that domain has been used by us for a couple years. Is the Exact Domain Match such a big deal? eJGe3Vw
Intermediate & Advanced SEO | | hendersondavidp0 -
I need help with a local tax lawyer website that just doesn't get traffic
We've been doing a little bit of linkbuilding and content development for this site on and off for the last year or so: http://www.olsonirstaxattorney.com/ We're trying to rank her for "Denver tax attorney," but in all honesty we just don't have the budget to hit the first page for that term, so it doesn't surprise me that we're invisible. However, my problem is that the site gets almost NO traffic. There are days when Google doesn't send more than 2-3 visitors (yikes). Every site in our portfolio gets at least a few hundred visits a month, so I'm thinking that I'm missing something really obvious on this site. I would expect that we'd get some type of traffic considering the amount of content the site has, (about 100 pages of unique content, give or take) and some of the basic linkbuilding work we've done (we just got an infographic published to a few decent quality sites, including a nice placement on the lawyer.com blog). However, we're still getting almost no organic traffic from Google or Bing. Any ideas as to why? GWMT doesn't show a penalty, doesn't identify any site health issues, etc. Other notes: Unbeknownst to me, the client had cut and pasted IRS newsletters as blog posts. I found out about all this duplicate content last November, and we added "noindex" tags to all of those duplicated pages. The site has never been carefully maintained by the client. She's very busy, so adding content has never been a priority, and we don't have a lot of budget to justify blogging on a regular basis AND doing some of the linkbuilding work we've done (guest posts and infographic).
Intermediate & Advanced SEO | | JasonLancaster0 -
Issues with Google-Bot crawl vs. Roger-Bot
Greetings from a first time poster and SEO noob... I hope that this question makes sense... I have a small e-commerce site, I have had Roger-bot crawl the site and I have fixed all errors and warnings that Volusion will allow me to fix. Then I checked Webmaster Tools, HTML improvements section and the Google-bot sees different dupe. title tag issues that Roger-bot did not. so A few weeks back I changed the title tag for a product, and GWT says that I have duplicate title tags but there is only one live page for the product. GWT lists the dupe. title tags, but when I click on each they all lead to the same live page. I'm confused, what pages are these other title tags referring to? Does Google have more than one page for that product indexed due to me changing the title tag when the page had a different URL? Does this question make sense? 2) Is this issue a problem? 3) What can I do to fix it? Any help would be greatly appreciated Jeff
Intermediate & Advanced SEO | | IOSC0 -
Linking Strategies 2013 - Regular Routines/ Tips
So, what is the latest linking strategies and the best practices for the new year? I'm looking to start a clean, fresh website and would love to implement a great new strategy or tactics that really work with the right amount of effort. Is there a guide available? Preferably from website on-page optimization all the way to a regular routine of what to do. I know it won't be easy, but that's why I love the ever changing world of SEO!
Intermediate & Advanced SEO | | Paul_Tovey0