Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
Help Blocking Crawlers. Huge Spike in "Direct Visits" with 96% Bounce Rate & Low Pages/Visit.
-
Hello,
I'm hoping one of you search geniuses can help me.
We have a successful client who started seeing a HUGE spike in direct visits as reported by Google Analytics. This traffic now represents approximately 70% of all website traffic. These "direct visits" have a bounce rate of 96%+ and only 1-2 pages/visit. This is skewing our analytics in a big way and rendering them pretty much useless. I suspect this is some sort of crawler activity but we have no access to the server log files to verify this or identify the culprit. The client's site is on a GoDaddy Managed WordPress hosting account.
The way I see it, there are a couple of possibilities.
1.) Our client's competitors are scraping the site on a regular basis to stay on top of site modifications, keyword emphasis, etc. It seems like whenever we make meaningful changes to the site, one of their competitors does a knock-off a few days later. Hmmm.
2.) Our client's competitors have this crawler hitting the site thousands of times a day to raise bounce rates and decrease the average time on site, which could likely have a negative impact on SEO. Correct me if I'm wrong, but I don't believe Google is going to reward sites with 90% bounce rates, 1-2 pages/visit and an 18-second average time on site.
The bottom line is that we need to identify these bogus "direct visits" and find a way to block them. I've seen several WordPress plugins that claim to help with this but I certainly don't want to block valid crawlers, especially Google, from accessing the site.
If someone out there could please weigh in on this and help us resolve the issue, I'd really appreciate it. Heck, I'll even name my third-born after you.
Thanks for your help.
Eric
-
Hi SirMax,
Thanks for your input. I appreciate it. We'll add Wordfence to our WordPress toolbox and see if that addresses the issue.
In response to previous posts, thanks to everyone for your input. We were able to apply some filters to remove the bogus bot traffic from the analytics and normalize the data. However, this did not actually resolve the issue and, in my eyes, is more of a Band-Aid fix. The evil crawlers are still there; we just can't see them.
Thanks again for all of your input.
Eric
-
Hostname filtering does not work anymore. Unfortunately, most of the spammers have adapted and are using your own website as the hostname.
For WordPress I use the Wordfence plugin (the paid version; I'm not affiliated with them in any shape or form beyond paying for their services). In the advanced blocking settings you can set limits on how fast and how many pages crawlers can request. You can also block by country or IP range. It can also show you live traffic with a lot of detail (a lot more than Google Analytics, more like a server log). It might not be a complete remedy, but it can help.
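To make the idea of limiting "how fast and how many pages crawlers can request" concrete, here is a minimal sliding-window rate limiter in Python. It is only a sketch of the general technique, not how Wordfence implements it; the class name, the default limits, and the injectable `clock` parameter are all assumptions made for illustration.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` per client within `window_seconds`.

    Hypothetical illustration only; the limits are placeholders.
    """

    def __init__(self, max_requests=60, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behaviour can be tested
        self._hits = defaultdict(deque)  # client IP -> timestamps of recent requests

    def allow(self, client_ip):
        """Return True if this request is within the limit, False if it should be blocked."""
        now = self.clock()
        hits = self._hits[client_ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

A crawler that requests hundreds of pages a minute from one address would start getting `False` here, while a normal visitor would stay well under the limit. Known good crawlers such as Googlebot would need to be exempted separately.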
-
I wish I had an answer for how to stop the bots from hitting your site at all - I don't think a good one exists, as any solutions that wouldn't also block real human traffic to your site are going to be easy for spam bots to get around. I think your best bet is just to do everything you can to keep your data as clean as possible.
-
Hi Ruth,
Thanks a bunch for taking the time to respond to my post. Great advice. This is reassuring on a number of levels. However, it doesn't address the underlying issue of how to stop these spam bots in the first place.
We've already started the process of filtering out some of this bogus data. We'll also be integrating some WordPress plugins to see if that helps. That said, if the spam bots are hitting Analytics directly, as opposed to the actual website, WP plugins won't do anything.
Anyway, I appreciate your input and advice. Thanks so much.
Eric
-
Hi Eric,
A few things to reassure you off the bat:
- For what it's worth, there is a huge, HUGE amount of crawler spam happening on the web today. Every site I work on is being hit hard with false referrals and direct visits. I know Google Analytics is working on a solution to better filter these visits out. So I wouldn't be too concerned that it is something a competitor is doing to your site, specifically - it's more likely that it's been caught up in the general wave of spam crawlers.
- It's important to note that when we talk about Google looking at bounce rate and dwell time as part of ranking your site, those numbers are specifically from clicks through from search - that's data that Google can get without using your private web analytics data as a ranking factor, which they've said repeatedly that they don't and won't do. So a bunch of direct visits with high bounce rates will NOT affect your rankings.
So, it's not dangerous, just annoying. On to how to get that data out of your reports:
- Make sure you're not filtering out spam referrers at a View level - this can cause those visits to incorrectly appear as direct traffic.
- You could set up an Advanced Segment in Google Analytics to filter out direct visits with visit times of, say, under 5 seconds. Some real traffic may get caught in that, but it will get the noise levels down.
- The best way to filter out spam bot traffic, in my opinion, is to set up hostname filtering. Here's a post on Megalytic on how to do that: https://megalytic.com/blog/how-to-filter-out-fake-referrals-and-other-google-analytics-spam. Make sure you've also got an "Unfiltered Data" View so you'll still have historic raw data if you need it.
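To illustrate the last two suggestions in the list above, here is a minimal Python sketch of the same logic applied to exported session rows. It is not Google Analytics configuration; the field names (`hostname`, `source`, `duration_seconds`) and the `example.com` hostname are hypothetical placeholders, and the 5-second cutoff is just the figure mentioned above.

```python
import re

# Hypothetical: replace with the hostnames that legitimately serve your pages.
VALID_HOSTNAME = re.compile(r"(www\.)?example\.com", re.IGNORECASE)


def has_valid_hostname(hit):
    """True when the hit was recorded against one of your own hostnames."""
    return bool(VALID_HOSTNAME.fullmatch(hit.get("hostname", "")))


def is_short_direct_visit(hit, min_seconds=5):
    """True for direct visits shorter than `min_seconds` (likely noise)."""
    return hit.get("source") == "(direct)" and hit.get("duration_seconds", 0) < min_seconds


def clean(hits):
    """Keep hits on a valid hostname that are not very short direct visits."""
    return [h for h in hits if has_valid_hostname(h) and not is_short_direct_visit(h)]
```

As noted, the duration rule will also drop some real visitors, so it is best applied to a separate view or copy of the data rather than to the raw data.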
Hope that helps! Good luck.
-
Check the web server log files, or log visits yourself (IP address, user agent, the __utma and __utmz cookies, possibly a browser fingerprint, etc.).
By analyzing those, you can easily find out whether the traffic comes from a scraping bot or from humans.
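As a rough sketch of that kind of analysis, the Python below counts requests per IP address and user agent. It assumes the log lines are in the common "combined" access log format; the function names and the threshold are hypothetical, and a host that logs in a different format would need a different pattern.

```python
import re
from collections import Counter

# Assumes the "combined" access log format:
# ip ident user [time] "request" status bytes "referer" "user agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def count_clients(lines):
    """Count requests per (IP address, user agent) pair; unparseable lines are skipped."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            counts[(match.group("ip"), match.group("agent"))] += 1
    return counts


def heavy_hitters(lines, threshold=1000):
    """Return (client, request count) pairs at or above `threshold`, busiest first."""
    return [(client, n) for client, n in count_clients(lines).most_common() if n >= threshold]
```

A single IP and user agent pair making thousands of requests a day, each fetching one or two pages, would point to a bot rather than human visitors.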