Googlebot crawling partial URLs
-
Hi guys,
I've checked my email this morning and I've got a number of 404 errors over the weekend where Google has tried to crawl some of my existing pages but not found the full URL.
Instead of hitting 'domain.com/folder/complete-pagename.php' it's hit 'domain.com/folder/comp'.
This is definitely Googlebot/2.1; http://www.google.com/bot.html (66.249.72.53) but I can't find where it would have found only the partial URL. It certainly wasn't on the domain it's crawling and I can't find any links from external sites pointing to us with the incorrect URL. GoogleBot is doing the same thing across a single domain but in different sub-folders.
Having checked Webmaster Tools there aren't any hard 404s and the soft ones aren't related and haven't occured since August. I'm really confused as to how this is happening..
Thanks!
-
This is why I love this forum. We recently started seeing these urls in our GWT report. We have hundreds of truncated urls that end in "..." that go nowhere. We can't figure out where these are coming from. We thought it could be G's relatively new privacy policy w/ not passing along the data, but we're not sure. Anyone have any thoughts on that?
Thanks!
-
@vitalscom - it's at least good to know someone else has experienced this!
Due to the volume I don't consider doing 301s a permanent solution. Fortunately there is a noindex on our 404 page so Google et al shouldn't take these errors into consideration.
-
I'm seeing it too - It looks like it's coming from Superpages but the truncated URLs are not actually hyperlinks, so why is Google following them is a good question.
http://swbd-out.superpages.com/webresults.htm?qkw=Find+A+Physician&qcat=web
I'm fixing this on my end with a modrewrite in HTACCESS, all of my sites truncated URL problems either end in ".." or "..." so any URL that ends in those two instances will get 301 redirected to the homepage.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Improving Crawl Efficieny
Hi I'm reading about crawl efficiency & have looked in WMT at the current crawl rate - letting Google optimise this as recommended. What it's set to is 0.5 requests every 2 seconds, which is 15 URLs every minute. To me this doesn't sound very good, especially for a site with over 20,000 pages at least? I'm reading about improving this but if anyone has advice that would be great
Intermediate & Advanced SEO | | BeckyKey1 -
Crawl Depth improvements
Hi I'm checking the crawl depth report in SEM rush, and looking at pages which are 4+ clicks away. I have a lot of product pages which fall into this category. Does anyone know the impact of this? Will they never be found by Google? If there is anything in there I want to rank, I'm guessing the course of action is to move the page so it takes less clicks to get there? How important is the crawl budget and depth for SEO? I'm just starting to look into this subject Thank you
Intermediate & Advanced SEO | | BeckyKey0 -
Duplicate page url crawl report
Details: Hello. Looking at the duplicate page url report that comes out of Moz, is the best tactic to a) use 301 redirects, and b) should the url that's flagged for duplicate page content be pointed to the referring url? Not sure where the 301 redirect should be applied... should this url, for example: <colgroup><col width="452"></colgroup>
Intermediate & Advanced SEO | | compassseo
| http://newgreenair.com/website/blog/ | which is listed in the first column of the Duplicate Page Content crawl, be pointed to referring url in the same spreadsheet? Or, what's the best way to apply the 301 redirect? thanks!0 -
Duplicate URL Parameters for Blog Articles
Hi there, I'm working on a site which is using parameter URLs for category pages that list blog articles. The content on these pages constantly change as new posts are frequently added, the category maybe for 'Heath Articles' and list 10 blog posts (snippets from the blog). The URL could appear like so with filtering: www.domain.com/blog/articles/?taxonomy=health-articles&taxon=general www.domain.com/blog/articles/?taxonomy=health-articles&taxon=general&year=2016 www.domain.com/blog/articles/?taxonomy=health-articles&taxon=general&year=2016&page=1 All pages currently have the same Meta title and descriptions due to limitations with the CMS, they are also not in our xml sitemap I don't believe we should be focusing on ranking for these pages as the content on here are from blog posts (which we do want to rank for on the individual post) but there are 3000 duplicates and they need to be fixed. Below are the options we have so far: Canonical URLs Have all parameter pages within the category canonicalize to www.domain.com/blog/articles/?taxonomy=health-articles&taxon=general and generate dynamic page titles (I know its a good idea to use parameter pages in canonical URLs). WMT Parameter tool Tell Google all extra parameter tags belong to the main pages (e.g. www.domain.com/blog/articles/?taxonomy=health-articles&taxon=general&year=2016&page=3 belongs to www.domain.com/blog/articles/?taxonomy=health-articles&taxon=general). Noindex Remove all the blog category pages, I don't know how Google would react if we were to remove 3000 pages from our index (we have roughly 1700 unique pages) We are very limited with what we can do to these pages, if anyone has any feedback suggestions it would be much appreciated. Thanks!
Intermediate & Advanced SEO | | Xtend-Life0 -
Cleaning up backlinks and changing URLs
Currently we are performing very poorly in organic clicks. We are a e-commerce site with over 2000 products. Issues we thought plagued us: Copied Images from competitors Site wide duplicate content duplicate content from competitor site Number of internal links on a page (300+) Bad backlinks (2.3k from 22 domains and ips) being linked to from sites like m.biz URLs URLs are abbreviated, over 50% lack our keywords Lack of meta descriptions, or too long meta descriptions Current State of fixing these issues: 50% images are now our own Site wide duplicate content near 100% completed Internal links have been dealt with Rewrote content for every product 90% of meta descriptions are fixed From all of these changes we have yet to see increase in traffic...10% increase at best in organic clicks. We think we have penalties on certain URLs. My question for the MOZ community is what is the best way to attack the lack of organic clicks. Our main competition is getting 900% more clicks than us. Any more information you need on the topic let me know and will get back to you.
Intermediate & Advanced SEO | | TITOJAX0 -
Google Disavow wrong Sample URLs
Sometime last year we were penalized for unnatural links by Google. We followed a couple of good posts on MOZ on how to go about correcting all of this (Excel Showing work, Google Docs, Ahrefs, semrush, etc...) and submitted a reconsideration request as well as a list of urls using Disavow Tool. So today, about 20 days later, we get a reply from Google stating that we still have some links that are outside their quality guidelines with 3 examples. Problem is, none of the 3 examples they listed have our website on them. I even did a View Source and checked all our inbound link reports for these websites. All of the 3 examples are Lawyer / Legal Advise websites and ours has nothing to do with this. Any ideas on how to reply and ask them to double check? Maybe they mixed up with another account they were working on at the same time.
Intermediate & Advanced SEO | | Apex-Lighting0 -
Numbers (2432423) in URL
Hello All Mozers, Quick question on URL. I know URL is important and should include keywords and all that but my question is does including numbers (not date or page numbers but numbers for internal use) in the URL affect SEO? For example, www.domain.com/screw-driver,12,1,23345.htm Is that any better or worse than www.domain.com/screw-driver.htm? I understand that this is not user friendly but in SEO stand point does it hurt ranking? What's your opinion on this? Thank you!
Intermediate & Advanced SEO | | TommyTan0 -
Canonical url question
i just search seomoz tooll it say duplicate content for www.mysite.com and www.mysite.com/index.php should i use canonical url for this ? is yes then is this right ?
Intermediate & Advanced SEO | | constructionhelpline0