Regular Expressions for Filtering BOT Traffic?
-
I've set up a filter to remove bot traffic from Analytics. I relied on a regular expression posted in an article, and it appears to eliminate most of them.
However, there are other bots I would like to filter but I'm having a hard time determining the regular expressions for them.
How do I determine what the regular expression is for additional bots so I can apply them to the filter?
I read an Analytics "how to" but it's over my head and I'm hoping for some "dumbed down" guidance.
-
No problem, feel free to reach out if you have any other RegEx related questions.
Regards,
Chris
-
I will definitely do that for Rackspace bots, Chris.
Thank you for taking the time to walk me through this and tweak my filter.
I'll give the site you posted a visit.
-
If you copy and paste my RegEx, it will filter out the rackspace bots. If you want to learn more about Regular Expressions, here is a site that explains them very well, though it may not be quite kindergarten speak.
-
Crap.
Well, I guess the vernacular is what I need to know.
Knowing what to put where is the trick, isn't it? Is there a dummies guide somewhere that spells this out in kindergarten speak?
I could really see myself botching this filtering business.
-
Not unless there's a . after the word servers in the name. The \ is escaping the . at the end of stumbleupon inc.
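To make the escaping point concrete, here is a minimal sketch in Python. It assumes Python's re module behaves like the Analytics filter for this simple case, and the name "stumbleupon incx" is made up purely to show the difference.

```python
import re

# Unescaped: "." is a wildcard that matches any single character.
loose = re.compile(r"^stumbleupon inc.$")
# Escaped: "\." matches only a literal period.
strict = re.compile(r"^stumbleupon inc\.$")

# Both patterns accept the real name, which ends in a period.
print(bool(loose.search("stumbleupon inc.")))
print(bool(strict.search("stumbleupon inc.")))

# Only the unescaped pattern accepts the made-up name ending in "x".
print(bool(loose.search("stumbleupon incx")))
print(bool(strict.search("stumbleupon incx")))
```

A name with no period, such as rackspace cloud servers, has nothing to escape, so it can go into the list as plain text.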
-
Does it need the \. before the )?
-
Ok, try this:
^(microsoft corp|inktomi corporation|yahoo! inc\.|google inc\.|stumbleupon inc\.|rackspace cloud servers)$|gomez
I just added rackspace as another match. It should work if the name is exactly right.
Hope this helps,
Chris
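
For anyone who wants to check a pattern like this before saving the filter, here is a rough sketch in Python. It assumes the periods are escaped, that matching is case-insensitive, and that a partial match counts as a hit; those are assumptions about the filter settings, not confirmed behavior. The names is_filtered and ISP_FILTER and the sample provider names other than rackspace cloud servers are made up for illustration.

```python
import re

# The ISP filter pattern, split over two lines for readability.
# Everything inside ^( ... )$ must match the whole name; "gomez" may
# appear anywhere in the name.
ISP_FILTER = re.compile(
    r"^(microsoft corp|inktomi corporation|yahoo! inc\.|google inc\.|"
    r"stumbleupon inc\.|rackspace cloud servers)$|gomez",
    re.IGNORECASE,  # assumption: the filter is not case sensitive
)


def is_filtered(service_provider):
    """Return True if the service provider name would be caught."""
    return ISP_FILTER.search(service_provider) is not None


# Exact name from the report: caught.
print(is_filtered("rackspace cloud servers"))
# Hypothetical name with extra words: not caught, because of ^ and $.
print(is_filtered("rackspace cloud servers llc"))
# Hypothetical name containing "gomez": caught by the unanchored part.
print(is_filtered("gomez example network"))
```

This is why the name has to be exactly right: the ^ and $ only let a listed name through when it is the entire service provider value.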
-
Agreed! That's why I suggest using it in combination with the variables you mentioned above.
-
rackspace cloud servers
Maybe my problem is I'm not looking in the right place.
I'm in Audience > Technology > Network and the column shows "service provider."
-
How is it titled in the ISP report exactly?
-
For example,
Since I implemented the filter four days ago, rackspace cloud servers have visited my site 848 times, visited 1 page each time, spent 0 seconds on the page, and bounced 100% of the time.
What is the regular expression for rackspace?
-
Time on page can be a tricky one because sometimes actual visits can record 00:00:00 due to the way it is measured. I'd recommend using other factors like the ones I mentioned above.
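
As a rough sketch of what combining factors could look like, the Python below flags a service provider only when several signals line up, rather than on zero time on page alone. The function name looks_like_bot and the min_visits threshold of 100 are hypothetical choices for illustration, not Analytics settings. The first sample uses the rackspace figures quoted in this thread; the second is invented.

```python
def looks_like_bot(visits, pages_per_visit, avg_time_on_page_seconds,
                   bounce_rate, min_visits=100):
    """Flag a provider only when every signal points the same way.

    bounce_rate is a fraction from 0.0 to 1.0.
    min_visits is a hypothetical cutoff so that a handful of real
    visits recorded as 00:00:00 are not flagged.
    """
    signals = [
        pages_per_visit <= 1.0,
        avg_time_on_page_seconds == 0,
        bounce_rate >= 0.99,
    ]
    return visits >= min_visits and all(signals)


# Figures reported for rackspace cloud servers in this thread.
print(looks_like_bot(848, 1.0, 0, 1.0))
# Invented example: a few real visits that happened to record 0 seconds.
print(looks_like_bot(3, 1.0, 0, 1.0))
```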
-
"...a combination of operating system, location, and some other factors can do the trick."
Yep, combined with those, look for "Avg. Time on Page = 00:00:00"
-
Ok, can you provide some information on the bots that are getting through and that you want to filter out? If they can be filtered by ISP organization, like the ones in your current RegEx, you can simply add them to the list: (microsoft corp| ... ... |stumbleupon inc\.|ispnamefromyourbots|ispname2|etc.)$|gomez
Otherwise, you might need to get creative and find another way to isolate them (a combination of operating system, location, and some other factors can do the trick). When adding to the list, make sure to escape special characters like . or / by using a \ before them, or else your RegEx may not match what you intend (an unescaped . matches any single character).
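
If adding names by hand feels error-prone, one option is to let a small script do the escaping and joining. The Python sketch below is only an illustration: escape_name and build_isp_pattern are hypothetical helpers, and the list of characters treated as special is an assumption that covers the common regex metacharacters.

```python
import re

# Characters with special meaning in most regex dialects (assumed list).
_META = re.compile(r"([.^$*+?()\[\]{}|\\])")


def escape_name(name):
    """Put a backslash in front of each special character in a name."""
    return _META.sub(r"\\\1", name)


def build_isp_pattern(exact_names, partial_names=()):
    """Build a filter pattern from plain ISP names.

    exact_names must match the whole service provider value;
    partial_names may appear anywhere inside it.
    """
    exact = "|".join(escape_name(n) for n in exact_names)
    parts = ["^(" + exact + ")$"]
    parts.extend(escape_name(n) for n in partial_names)
    return "|".join(parts)


pattern = build_isp_pattern(
    ["microsoft corp", "stumbleupon inc.", "rackspace cloud servers"],
    ["gomez"],
)
print(pattern)
```

The printed pattern can then be pasted into the filter, and adding another bot is just a matter of adding its ISP name to the first list exactly as it appears in the report.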
-
Sure. Here's the post for filtering the bots.
Here's the regex posted: ^(microsoft corp|inktomi corporation|yahoo! inc\.|google inc\.|stumbleupon inc\.)$|gomez
-
If you give me an idea of how you are isolating the bots I might be able to help come up with a RegEx for you. What is the RegEx you have in place to sort out the other bots?
Regards,
Chris