Regular Expressions for Filtering BOT Traffic?
-
I've set up a filter to remove bot traffic from Analytics. I relied on regular expressions posted in an article that eliminates what appears to be most of them.
However, there are other bots I would like to filter but I'm having a hard time determining the regular expressions for them.
How do I determine what the regular expression is for additional bots so I can apply them to the filter?
I read an Analytics "how to" but it's over my head and I'm hoping for some "dumbed down" guidance.
-
No problem, feel free to reach out if you have any other RegEx related questions.
Regards,
Chris
-
I will definitely do that for Rackspace bots, Chris.
Thank you for taking the time to walk me through this and tweak my filter.
I'll give the site you posted a visit.
-
If you copy and paste my RegEx, it will filter out the rackspace bots. If you want to learn more about Regular Expressions, here is a site that explains them very well, though it may not be quite kindergarten speak.
-
Crap.
Well, I guess the vernacular is what I need to know.
Knowing what to put where is the trick, isn't it? Is there a dummies guide somewhere that spells this out in kindergarten speak?
I could really see myself botching this filtering business.
-
Not unless there's a . after the word "servers" in the name. The \ is escaping the . at the end of stumbleupon inc.
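To make the escaping point concrete, here is a minimal sketch in Python. Python's `re` module is only a stand-in here and Analytics may differ in details; the `matches` helper and the sample names are made up for illustration. An unescaped `.` matches any single character, while `\.` matches only a literal period:

```python
import re

# Unescaped: "." stands for any single character.
loose = re.compile(r"^stumbleupon inc.$")
# Escaped: "\." matches only a literal period.
strict = re.compile(r"^stumbleupon inc\.$")

def matches(pattern, name):
    """Return True if the compiled pattern is found in the name."""
    return pattern.search(name) is not None

print(matches(loose, "stumbleupon inc."))   # True
print(matches(loose, "stumbleupon incX"))   # True  (the "." matched "X")
print(matches(strict, "stumbleupon inc."))  # True
print(matches(strict, "stumbleupon incX"))  # False
```

So a name with no period in it, like rackspace cloud servers, needs no backslash at all.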
-
Does it need the \. before the )?
-
Ok, try this:
^(microsoft corp|inktomi corporation|yahoo! inc\.|google inc\.|stumbleupon inc\.|rackspace cloud servers)$|gomez
Just added rackspace as another match, it should work if the name is exactly right.
Hope this helps,
Chris
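
If you want to check the pattern before saving the filter, something like the following rough sketch could help. It uses Python's `re` module as a stand-in for the Analytics matcher, assumes the service provider names appear in lowercase as in the report, and the `is_bot_isp` helper and sample names are hypothetical:

```python
import re

# Pattern from the post above, with the periods escaped as "\.".
BOT_ISP = re.compile(
    r"^(microsoft corp|inktomi corporation|yahoo! inc\.|google inc\."
    r"|stumbleupon inc\.|rackspace cloud servers)$|gomez"
)

def is_bot_isp(service_provider):
    """True if the service provider name would be caught by the pattern."""
    return BOT_ISP.search(service_provider) is not None

# Hypothetical sample names, for illustration only.
print(is_bot_isp("rackspace cloud servers"))      # True  (exact match)
print(is_bot_isp("rackspace cloud servers inc"))  # False (^...$ requires the exact name)
print(is_bot_isp("gomez networks"))               # True  ("gomez" matches anywhere)
print(is_bot_isp("comcast cable"))                # False
```

The names inside `^( ... )$` only match when the whole name is exactly right, which is why the exact title in the report matters. The `gomez` part sits outside the anchors, so it matches any name containing it.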
-
Agreed! That's why I suggest using it in combination with the variables you mentioned above.
-
rackspace cloud servers
Maybe my problem is I'm not looking in the right place.
I'm in audience>technology>network and the column shows "service provider."
-
How is it titled in the ISP report exactly?
-
For example,
Since I implemented the filter four days ago, rackspace cloud servers have visited my site 848 times, visited 1 page each time, spent 0 seconds on the page and bounced 100% of the time.
What is the reg expression for rackspace?
-
Time on page can be a tricky one because sometimes actual visits can record 00:00:00 due to the way it is measured. I'd recommend using other factors like the ones I mentioned above.
-
"...a combination of operating system, location, and some other factors can do the trick."
Yep, combined with those, look for "Avg. Time on Page = 00:00:00"
-
Ok, can you provide some information on the bots that are getting through this that you want to sort out? If they are able to be filtered through the ISP organization as the ones in your current RegEx, you can simply add them to the list: (microsoft corp| ... ... |stumbleupon inc.|ispnamefromyourbots|ispname2|etc.)$|gomez
Otherwise, you might need to get creative and find another way to isolate them (a combination of operating system, location, and some other factors can do the trick). When adding to the list, make sure to escape special characters like . or / by putting a \ before them; otherwise your RegEx can fail or match names you didn't intend.
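As a rough illustration of adding names to the list with the escaping done for you, here is a small Python sketch. The helper names `escape_name` and `build_filter` are hypothetical, and the set of characters treated as special is an assumption rather than the official Analytics list:

```python
import re

# Characters escaped with a backslash (an assumed set, for illustration).
SPECIAL = set(".\\+*?[]^$(){}|/")

def escape_name(name):
    """Put a backslash before each special character in an ISP name."""
    return "".join("\\" + ch if ch in SPECIAL else ch for ch in name)

def build_filter(exact_names, substrings=("gomez",)):
    """Anchor the exact names with ^(...)$ and leave the substrings unanchored."""
    pattern = "^(" + "|".join(escape_name(n) for n in exact_names) + ")$"
    if substrings:
        pattern += "|" + "|".join(escape_name(s) for s in substrings)
    return pattern

pattern = build_filter(["google inc.", "stumbleupon inc.", "rackspace cloud servers"])
print(pattern)
# ^(google inc\.|stumbleupon inc\.|rackspace cloud servers)$|gomez

print(re.search(pattern, "google inc.") is not None)  # True
print(re.search(pattern, "google incx") is not None)  # False
```

The printed pattern is what you would paste into the filter. Each new ISP name goes in the first list exactly as it is titled in the report.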
-
Sure. Here's the post for filtering the bots.
Here's the RegEx posted: ^(microsoft corp|inktomi corporation|yahoo! inc\.|google inc\.|stumbleupon inc\.)$|gomez
-
If you give me an idea of how you are isolating the bots I might be able to help come up with a RegEx for you. What is the RegEx you have in place to sort out the other bots?
Regards,
Chris