Googlebot and other spiders are requesting odd links on our website. Trying to understand why, and what to do about it.
-
I recently began work on an existing WordPress website that was revamped about 3 months ago: https://thedoctorwithin.com. I'm a bit new to WordPress, so I thought I should reach out to some of the experts in the community. Checking 'Not found' Crawl Errors in Google Search Console, I notice many irrelevant links that are not present in the website, nor in the database, as near as I can tell. When I check the reported source of these irrelevant links, they are all listed as linked from various pages on the site, as well as from pages that are allegedly on the site but have never existed.
For instance:
- https://thedoctorwithin.com/category/seminars/newsletters/page/7/newsletters/page/3/feedback-and-testimonials/ allegedly linked from:
- https://thedoctorwithin.com/category/seminars/newsletters/page/7/newsletters/page/3/ (doesn’t exist)
In other cases, these goofy URLs are even reported as linked from the sitemap, although all the URLs in the sitemap are valid URLs.
Currently, the site has a flat structure. Nearly all the content is merely URL/content/ without further breakdown (or subdirectories). Previous site versions had a more varied page organization, but what I'm seeing doesn't seem to reflect the current page organization, nor the previous page organization.
Had a similar issue, due to use of Divi's search feature. Ended up with some pretty deep non-existent links branching off of /search/, such as:
- https://thedoctorwithin.com/search/newsletters/page/2/feedback-and-testimonials/feedback-and-testimonials/online-continuing-education/consultations/ allegedly linked from:
- https://thedoctorwithin.com/search/newsletters/page/2/feedback-and-testimonials/feedback-and-testimonials/online-continuing-education/ (doesn't exist).
I blocked the /search/ branches via robots.txt. No real loss, since neither /search/ nor any of its subdirectories are valid.
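A quick way to sanity-check a rule like this is Python's standard `urllib.robotparser`. This is only a sketch: the robots.txt content below is an assumption (the actual file isn't shown in this thread), and the second test URL is a guess at a valid page based on the slugs seen in the errors.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content; the real file is not shown in this thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search/
"""


def is_allowed(url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the assumed robots.txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # One of the spurious /search/ URLs from Search Console.
    print(is_allowed(
        "https://thedoctorwithin.com/search/newsletters/page/2/"
        "feedback-and-testimonials/"
    ))
    # A flat URL outside /search/ (assumed to be a real page).
    print(is_allowed("https://thedoctorwithin.com/feedback-and-testimonials/"))
```

Note that robots.txt only asks well-behaved spiders not to crawl those paths; it doesn't remove URLs that are already indexed.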
There are numerous pre-existing categories and tags on the site. The categories and tags aren't used as pages. I suspect Google (and other engines) might be creating arbitrary paths from these. Looking through the site's 404 errors, I'm seeing the same behavior from Bing, Moz and other spiders as well.
I suppose I could use Search Console to remove URL/category/ and URL/tag/. I suppose I could do the same with regard to other legitimate spiders / search engines. Perhaps it would be better to use mod_rewrite to lead spiders to pages that actually do exist.
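To make the redirect idea concrete, here is a rough Python sketch of the mapping logic that a rewrite rule or redirect plugin would need to encode. It assumes the flat structure described above, so a nested junk path is redirected to its last segment only when that segment is a known page. The `VALID_SLUGS` set and the `redirect_target` function are hypothetical names for illustration; the slugs are guessed from the URLs in the crawl errors.

```python
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical set of slugs for pages that really exist on the flat site.
VALID_SLUGS = {
    "feedback-and-testimonials",
    "online-continuing-education",
    "consultations",
}


def redirect_target(url: str) -> Optional[str]:
    """Return the flat path to 301 to, or None to leave the request alone.

    Only nested paths are considered, and only when their last segment is
    a known page. Caution: this would also redirect a genuine nested URL
    (such as a category archive) whose last segment matches a page slug.
    """
    segments = [s for s in urlsplit(url).path.split("/") if s]
    if len(segments) < 2:
        return None  # root or already-flat URL: nothing to do
    last = segments[-1]
    if last in VALID_SLUGS:
        return f"/{last}/"
    return None  # unknown tail: let it return a normal 404


if __name__ == "__main__":
    print(redirect_target(
        "https://thedoctorwithin.com/category/seminars/newsletters/page/7/"
        "newsletters/page/3/feedback-and-testimonials/"
    ))
```

Whether redirecting is preferable to simply letting these URLs return 404 is a judgment call; the sketch only shows that the mapping can be kept narrow.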
- Looking forward to suggestions about best way to deal with these errant searches.
- Also curious to learn about why these are occurring.
Thank you.
-
Thanks, Kevin.
Glad I'm not the only one.
Disabling tags and categories isn't an option in my case. Guess I need to look at more of the potential upside. Seems tags and categories, if handled correctly, could provide a new way to engage visitors and search engines.
I've heard people refer to 'spidering budgets', or whatnot. Guess it's an entirely new topic of discussion: whether limiting the spurious crawling (from good spiders) means that said spiders will spend more time on the conventional pathways of a site.
-
Thanks, Vijay.
Did a lot of work fixing links in the database.
The issue was occurring even before implementation of WP Super Cache, and before the link fixing.
Being new-ish to WP, it seems strange that it's so willing to provide access via directories that don't really exist: categories, tags, even search, if using a theme-provided site search.
I'm getting better at .htaccess, so I'm able to handle a lot of the old incoming links fairly well. In the case of these weird 'in the mind of the spiders' links, I will try to address these as well.
Thanks for your advice about 404 and 301 plugins. Time to look around and see what other useful tools are out there.
-
-
I have the same issue. I have stopped using tags because of all the irrelevant links they cause. Looking forward to reading the comments on this thread.
KJr
-
Hi There,
Your website is built on WordPress, and it looks like there might be spurious entries in the DB, which also might not be getting deleted because of the WP Super Cache plugin. You could try emptying your cache and installing 'all 404 redirect' and 301 management plugins.
I hope this helps.
Regards,
Vijay