Regular Expressions for Filtering BOT Traffic?
-
I've set up a filter to remove bot traffic from Analytics. I relied on regular expressions posted in an article that eliminates what appears to be most of them.
However, there are other bots I would like to filter but I'm having a hard time determining the regular expressions for them.
How do I determine what the regular expression is for additional bots so I can apply them to the filter?
I read an Analytics "how to" but it's over my head and I'm hoping for some "dumbed down" guidance.
-
No problem, feel free to reach out if you have any other RegEx related questions.
Regards,
Chris
-
I will definitely do that for Rackspace bots, Chris.
Thank you for taking the time to walk me through this and tweak my filter.
I'll give the site you posted a visit.
-
If you copy and paste my RegEx, it will filter out the Rackspace bots. If you want to learn more about regular expressions, here is a site that explains them very well, though it may not be quite kindergarten speak.
-
Crap.
Well, I guess the vernacular is what I need to know.
Knowing what to put where is the trick, isn't it? Is there a dummies guide somewhere that spells this out in kindergarten speak?
I could really see myself botching this filtering business.
-
Not unless there's a . after the word servers in the name. The \ is escaping the . at the end of stumbleupon inc.
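To make the escaping point concrete, here is a small Python sketch. It is only an illustration run outside of Analytics, and the names `loose`, `strict`, and `matches` are made up for the example. It shows that an unescaped . matches any single character, while \. matches only a literal period.

```python
import re

# Unescaped: "." matches ANY single character, so "inc." also matches "incx".
loose = re.compile(r"^stumbleupon inc.$")

# Escaped: "\." matches only a literal period.
strict = re.compile(r"^stumbleupon inc\.$")


def matches(pattern, name):
    """Return True if the compiled pattern matches somewhere in name."""
    return pattern.search(name) is not None


print(matches(loose, "stumbleupon inc."))   # True
print(matches(loose, "stumbleupon incx"))   # True (probably not intended)
print(matches(strict, "stumbleupon inc."))  # True
print(matches(strict, "stumbleupon incx"))  # False
```

Since "rackspace cloud servers" contains no period, there is nothing in it to escape.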
-
Does it need the \. before the )?
-
Ok, try this:
^(microsoft corp|inktomi corporation|yahoo! inc\.|google inc\.|stumbleupon inc\.|rackspace cloud servers)$|gomez
I just added rackspace as another match; it should work if the name is exactly right.
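If you want to check the expression before saving the filter, a rough way is to run it against a few service provider names in Python. This is a sketch under assumptions: the periods are written escaped as \., matching is case-insensitive and partial (which I believe is how Analytics filters behave by default, but please verify), and the sample names other than "rackspace cloud servers" are hypothetical.

```python
import re

# The filter pattern, with literal periods escaped as "\.".
PATTERN = (
    r"^(microsoft corp|inktomi corporation|yahoo! inc\.|google inc\.|"
    r"stumbleupon inc\.|rackspace cloud servers)$|gomez"
)

# Assumption: case-insensitive, partial matching (re.search, not re.fullmatch).
bot_filter = re.compile(PATTERN, re.IGNORECASE)


def is_filtered(service_provider):
    """Return True if the service provider name would be caught by the pattern."""
    return bot_filter.search(service_provider) is not None


# Hypothetical sample values from the "service provider" column.
samples = [
    "rackspace cloud servers",      # exact name, inside ^( ... )$
    "rackspace cloud servers inc",  # extra text, so ^...$ does not match
    "gomez networks",               # "gomez" is unanchored, any position matches
    "comcast cable communications",
]

for name in samples:
    print(f"{name!r}: {is_filtered(name)}")
```

The ^ and $ around the parentheses are why the name has to be exactly right: anything before or after the listed name makes that part of the pattern fail, while the unanchored gomez alternative matches anywhere in the name.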
Hope this helps,
Chris
-
Agreed! That's why I suggest using it in combination with the variables you mentioned above.
-
rackspace cloud servers
Maybe my problem is I'm not looking in the right place.
I'm in audience>technology>network and the column shows "service provider."
-
How is it titled in the ISP report exactly?
-
For example,
Since I implemented the filter four days ago, rackspace cloud servers have visited my site 848 times, visited 1 page each time, spent 0 seconds on the page, and bounced 100% of the time.
What is the regular expression for rackspace?
-
Time on page can be a tricky one because sometimes actual visits can record 00:00:00 due to the way it is measured. I'd recommend using other factors like the ones I mentioned above.
-
"...a combination of operating system, location, and some other factors can do the trick."
Yep, combined with those, look for "Avg. Time on Page = 00:00:00"
-
Ok, can you provide some information on the bots that are getting through and that you want to filter out? If they can be filtered by ISP organization, like the ones in your current RegEx, you can simply add them to the list: (microsoft corp| ... ... |stumbleupon inc\.|ispnamefromyourbots|ispname2|etc.)$|gomez
Otherwise, you might need to get creative and find another way to isolate them (a combination of operating system, location, and some other factors can do the trick). When adding to the list, make sure to escape special characters like . or / by using a \ before them, or else your RegEx will fail.
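As a rough illustration of adding names to the list and escaping special characters, here is a Python sketch. The helper names `escape_name` and `build_isp_pattern` are made up for this example and are not part of Analytics. It escapes each ISP name and joins them into the same ^( ... )$|substring shape as the filter.

```python
import re

# Characters given a backslash here. "/" is included to follow the advice
# above; escaping it is harmless in Python's regex syntax.
SPECIAL_CHARS = set(".^$*+?()[]{}|\\/")


def escape_name(name):
    """Put a backslash before every special character in an ISP name."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in name)


def build_isp_pattern(exact_names, substrings=()):
    """Build ^(name1|name2|...)$ plus optional unanchored substring matches."""
    exact = "|".join(escape_name(n) for n in exact_names)
    parts = [f"^({exact})$"]
    parts.extend(escape_name(s) for s in substrings)
    return "|".join(parts)


pattern = build_isp_pattern(
    ["google inc.", "rackspace cloud servers"],
    substrings=["gomez"],
)
print(pattern)  # ^(google inc\.|rackspace cloud servers)$|gomez

# Quick sanity check that the generated pattern compiles and matches.
print(re.search(pattern, "rackspace cloud servers") is not None)  # True
```

The printed pattern can then be pasted into the filter field. Listing names exactly as they appear in the ISP report matters, because the ^ and $ anchors require the whole name to match.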
-
Sure. Here's the post for filtering the bots.
Here's the RegEx posted: ^(microsoft corp|inktomi corporation|yahoo! inc\.|google inc\.|stumbleupon inc\.)$|gomez
-
If you give me an idea of how you are isolating the bots I might be able to help come up with a RegEx for you. What is the RegEx you have in place to sort out the other bots?
Regards,
Chris