Googlebot and other spiders are searching for odd links in our website trying to understand why, and what to do about it.
-
I recently began work on an existing Wordpress website that was revamped about 3 months ago. https://thedoctorwithin.com. I'm a bit new to Wordpress, so I thought I should reach out to some of the experts in the community.Checking ‘Not found’ Crawl Errors in Google Search Console, I notice many irrelevant links that are not present in the website, nor the database, as near as I can tell. When checking the source of these irrelevant links, I notice they’re all generated from various pages in the site, as well as non-existing pages, allegedly in the site, even though these pages have never existed.
For instance:
- https://thedoctorwithin.com/category/seminars/newsletters/page/7/newsletters/page/3/feedback-and-testimonials/ allegedly linked from:
- https://thedoctorwithin.com/category/seminars/newsletters/page/7/newsletters/page/3/ (doesn’t exist)
In other cases, these goofy URLs are even linked from the sitemap. BTW - all the URLs in the sitemap are valid URLs.
Currently, the site has a flat structure. Nearly all the content is merely URL/content/ without further breakdown (or subdirectories). Previous site versions had a more varied page organization, but what I'm seeing doesn't seem to reflect the current page organization, nor the previous page organization.
Had a similar issue, due to use of Divi's search feature. Ended up with some pretty deep non-existent links branching off of /search/, such as:
- https://thedoctorwithin.com/search/newsletters/page/2/feedback-and-testimonials/feedback-and-testimonials/online-continuing-education/consultations/ allegedly linked from:
- https://thedoctorwithin.com/search/newsletters/page/2/feedback-and-testimonials/feedback-and-testimonials/online-continuing-education/ (doesn't exist).
I blocked the /search/ branches via robots.txt. No real loss, since neither /search/ nor any of its subdirectories are valid.
There are numerous pre-existing categories and tags on the site. The categories and tags aren't used as pages. I suspect Google, (and other engines,) might be creating arbitrary paths from these. Looking through the site’s 404 errors, I’m seeing the same behavior from Bing, Moz and other spiders, as well.
I suppose I could use Search Console to remove URL/category/ and URL/tag/. I suppose I could do the same, in regards to other legitimate spiders / search engines. Perhaps it would be better to use Mod Rewrite to lead spiders to pages that actually do exist.
- Looking forward to suggestions about best way to deal with these errant searches.
- Also curious to learn about why these are occurring.
Thank you.
-
Thanks, Kevin.
Glad I'm not the only one.
Disabling tags and categories aren't an option, in my case. Guess I need to look at more of the potential upside. Seems tags and categories, if handled correctly, could provide a new way to engage visitors and search engines.
I've heard people refer to 'spidering budgets, or whatnot'. Guess it's an entirely new topic of discussion... if limiting the spurious spider searching, (from good spiders,) means that said spiders will spend more time on the conventional pathways of a site.
-
Thanks, Vjay.
Did a lot of work fixing links in the database.
The issue was occurring even before implementation of WP super cache, and before the link fixing.
Being new-ish to WP, it seems strange that it's so willing to:
-
provide access via directories that don't really exist:
-
categories, tags, even search, if using a theme-provided site search.
I'm getting better at .htaccess, so I'm able to handle a lot of the old incoming links fairly well. In the case of these weird 'in the mind of the spiders' links, will be try to address these as well.
Thanks for your advice about 404 and 301 plugins. Time to look around and see what other useful tools are out there.
-
-
I have the same issue, I have stopped using tags because of all the irrelevant links they cause. Looking forward to reading the comments on this thread.
KJr
-
Hi There,
Your website is built on WordPress and it looks like that there might be spurious entries in the DB, which might also not be getting deleted due to the WP super cache plugin. You may try to empty your cache and install 'all 404 redirect' and 301 management plugins.
I hope this helps.
Regards,
Vijay
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
I want to move some pages of my website to a folder and nav menu in those pages should only show inner page links, will it hurt SEO?
Hi, My website has a few SaaS products, to make my website simple i want to move my website some pages to its specific folder structure , so eg website.com/product1/features
Technical SEO | | webbeemoz
website.com/product1/pricing
website.com/product1/information and same for product2 and so on, the website.com/product1/.. menu will only show the links of product1 and only one link to homepage (possibly in footer). Please share your opinion will it be a good idea, from UI perspective it will be simple , but i am not sure about SEO perspective, please help thanks0 -
Measuring the size of a competitors website?
I think our website is too big, far too many indexed pages. I'd like to do some research on how big our competitors' websites are (how many indexed pages). Is there a way to do this? Cheers, Rhys
Technical SEO | | SwanseaMedicine0 -
How should we handle re-directory links? Should we remove these links?
We are currently cleaning up bad links that were purchased by a previous SEO agency. We have found links on anonym.to pages that redirect traffic to our site automatically. How should this be handled? Should we remove these links?
Technical SEO | | Lorne_Marr0 -
Solutions for too many on-page links?
We have just begun using SEO Moz a few months ago and have been busy cleaning up some of our warnings and errors. One of the errors that has been an issue is ... too many on-page links. I am trying to correct this issue and I am wondering how seo moz counts these links. For instance... we have links to many of our product categories in a drop down from our main menu, those same links are listed in our footer. Does this get counted as two or only one link. If two, should we make one of the link no follow or how would you best suggest correcting this. Our website is www.unikeyhealth.com Since the menu and the footer appear on virtually every page on our site correcting this issue will quickly sort out this problem. Thanks for any advice.
Technical SEO | | unikey0 -
Bad link profile?
Hi Mozzers! We have recently been handed this client due to the former SEO company building up a bad link profile, which resulted in the site dropping off the search results all together. Forcing them to get a new domain. This happened in July last year and we are unsure whether it would be wise to submit a reconsideration request and then 301 their old sites pages to the new domain. Basically I'm asking whether you can spot any spammy links being built in their profile. Here is the old domain: http://www.claimssolicitors.co.uk/ It would be great if you could help me out! 🙂 Thanks
Technical SEO | | Webrevolve0 -
Nofollow links if you have more than one link on a page to the same destination.
Hi, I am wondering if someone can confirm that its best practice to have nofollow on secondary links on a page. For instance the contact page may have a link in the navigation and in the the blurb down the page have another link to the contact page saying contact us here etc.. So in this instance i would put a nofollow on the secondary link in the blurb would this be the best way to impliment this. Many thanks Chris
Technical SEO | | InteractiveRed670 -
How Does Link Juice Pass?
Say there is a link on an authoritative site to my site, and the link points to www.mysite.com. However, I have set all URL variations (https://mysite.com, www.mysite.com, mysite.com, etc.) to redirect to http://mysite.com automatically. Does the link juice from this authoritative site pass through the www.mysite.com URL to http://mysite.com automatically due to the automatic redirect? I guess my question is does the link juice automatically pass on to the destination URL, even though it is not the original URL the authoritative site pointed to?
Technical SEO | | NiallTom0