Googlebot crawling partial URLs
-
Hi guys,
I checked my email this morning and found a number of 404 errors from over the weekend where Googlebot tried to crawl some of my existing pages but requested only part of the URL.
Instead of hitting 'domain.com/folder/complete-pagename.php' it's hit 'domain.com/folder/comp'.
This is definitely Googlebot/2.1; http://www.google.com/bot.html (66.249.72.53), but I can't work out where it would have found only the partial URL. It certainly wasn't on the domain it's crawling, and I can't find any links from external sites pointing to us with the incorrect URL. Googlebot is doing the same thing across a single domain, but in different sub-folders.
Having checked Webmaster Tools, there aren't any hard 404s, and the soft ones aren't related and haven't occurred since August. I'm really confused as to how this is happening.
Thanks!
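One way to confirm that the 404s really are truncated versions of existing pages is to compare each failing path against the list of known URLs. The snippet below is only a rough sketch: the function name `find_truncated_hits` and the sample paths are made up for illustration, and it assumes you have already pulled the 404 request paths out of your own logs.

```python
def find_truncated_hits(requested_404_paths, known_paths):
    """Map each 404 path to the known paths it is a proper prefix of."""
    matches = {}
    for hit in requested_404_paths:
        # A truncated request is a strict prefix of a real page's path.
        candidates = [p for p in known_paths if p.startswith(hit) and p != hit]
        if candidates:
            matches[hit] = candidates
    return matches


# Hypothetical sample data modelled on the example in the post above.
known = ["/folder/complete-pagename.php", "/other/another-page.php"]
hits = ["/folder/comp", "/missing.php"]
print(find_truncated_hits(hits, known))
```

Paths that match nothing (like `/missing.php` here) are ordinary 404s and are left out of the result.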
-
This is why I love this forum. We recently started seeing these URLs in our GWT report. We have hundreds of truncated URLs ending in "..." that go nowhere, and we can't figure out where they are coming from. We thought it could be related to Google's relatively new privacy policy of not passing along the data, but we're not sure. Anyone have any thoughts on that?
Thanks!
-
@vitalscom - it's at least good to know someone else has experienced this!
Due to the volume I don't consider doing 301s a permanent solution. Fortunately there is a noindex on our 404 page so Google et al shouldn't take these errors into consideration.
-
I'm seeing it too. It looks like it's coming from Superpages, but the truncated URLs there are not actually hyperlinks, so why Google is following them is a good question.
http://swbd-out.superpages.com/webresults.htm?qkw=Find+A+Physician&qcat=web
I'm fixing this on my end with a mod_rewrite rule in .htaccess. All of the truncated URLs on my sites end in either ".." or "...", so any URL ending in one of those will get 301 redirected to the homepage.
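For anyone who wants to check which paths a rule like that would catch before touching the server config, here is a small Python sketch of the same matching logic. It is not the poster's actual rewrite rule; the names `TRUNCATED` and `redirect_target`, and the choice of `/` as the homepage, are assumptions for illustration.

```python
import re

# Matches paths whose last characters are ".." or "..." (assumed pattern).
TRUNCATED = re.compile(r"\.{2,3}\Z")


def redirect_target(path, home="/"):
    """Return (301, home) for a truncated-looking path, otherwise None."""
    if TRUNCATED.search(path):
        return 301, home
    return None


print(redirect_target("/folder/comp..."))
print(redirect_target("/folder/complete-pagename.php"))
```

Running sample paths from your 404 report through it shows which ones would be redirected and which would still return a normal 404.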