Google tries to index non-existent language URLs. Why?
-
Hi,
I am working for a SaaS client. He runs two language versions on two different subdomains:
de.domain.com/company for German and en.domain.com for English. Many thousands of URLs have been indexed correctly. But according to Google Search Console, Google tries to index URLs that never existed before and still do not exist:
de.domain.com/en/company
en.domain.com/de/company
... and a thousand more with /en/ or /de/ in between. We never use this variant, and requesting these URLs correctly shows a 404 page (but with the wrong response code; we're fixing that). But Google tries to index these kinds of URLs again and again, and I couldn't find any source for these URLs. No website uses them as outgoing links, etc.
We do see in our log files that a Screaming Frog installation and moz.com with Open Site Explorer were trying to access these URLs earlier.
My question: How does Google come up with that? Where did they get these URLs, which (to our knowledge) never existed?
Any ideas? Thanks
-
Hi Hecksler,
Did you ever resolve this?
A quick idea from me is to double-check ALL versions of your website within Google Search Console. You can now register the entire domain property using DNS: https://searchengineland.com/how-to-set-up-google-search-console-domain-verification-for-site-wide-reporting-data-313256
I found that Google was trying to crawl a very old HTTP sitemap from about five years ago for one of my sites, and thus I was able to delete it.
There are mixed feelings within the search community about whether or not Googlebot really "guesses" URLs, so it's more than likely they are getting the links from somewhere: https://stackoverflow.com/questions/20855082/googlebot-guesses-urls-how-to-avoid-handle-this-crawling
Look forward to hearing from you,
Nick
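To follow up on the log file observation in the question, one option is to group the requests for the phantom paths by user agent and referrer. Below is a minimal sketch in Python, assuming the server writes the common "combined" access log format. The regular expressions, the function name `phantom_hits`, and the sample lines are illustrative assumptions, not data from the actual site.

```python
import re
from collections import Counter

# Assumed combined log format:
# ip - - [time] "METHOD path PROTO" status size "referrer" "user-agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Paths that start with a language folder the site never used.
PHANTOM_PATH = re.compile(r'^/(en|de)/')


def phantom_hits(lines):
    """Count phantom-path requests, keyed by (user agent, referrer, status)."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        if PHANTOM_PATH.match(match["path"]):
            key = (match["agent"], match["referrer"], int(match["status"]))
            hits[key] += 1
    return hits


# Made-up sample lines for illustration only.
sample = [
    '66.249.66.1 - - [10/Oct/2019:13:55:36 +0000] "GET /en/company HTTP/1.1" 200 512 "-" "Googlebot/2.1"',
    '66.249.66.1 - - [10/Oct/2019:13:56:01 +0000] "GET /company HTTP/1.1" 200 2048 "-" "Googlebot/2.1"',
    '203.0.113.7 - - [10/Oct/2019:14:02:10 +0000] "GET /en/company HTTP/1.1" 200 512 "https://de.domain.com/company" "Screaming Frog SEO Spider"',
]

for (agent, referrer, status), count in phantom_hits(sample).items():
    print(count, status, agent, referrer)
```

A referrer other than "-" would name the page that carries the link, and a 200 status on these paths would confirm the wrong response code mentioned in the question.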
Related Questions
-
Why did Google remove my landing pages from the index?
I made a new website, meko.lv. I put a lot of work into making the pages SEO friendly: sprites, reduced requests, added SSL, and a Google PageSpeed Insights score of 100/100. But on 2 October all pages disappeared from the index in Google Webmasters. Could you please look at the website and say what's wrong with it? The pages are all still present in Google's search results, but for how long? It is so annoying: you put in so much work but get a high spam score as a result. It is obvious that new pages cannot get good links in one month. https://meko.lv/ Google PageSpeed score: https://developers.google.com/speed/pagespeed/insights/?url=http%3A%2F%2Fmeko.lv%2F&tab=mobile
Technical SEO | | Mekounko0 -
Get List Of All Indexed Google Pages
I know how to run site:domain.com but I am looking for software that will put these results into a list and return server status (200, 404, etc). Anyone have any tips?
Technical SEO | | InfinityTechnologySolutions0 -
How could you make a URL/Breadcrumb structure appear different in Google than when you click into site?
I'm seeing a competitor make their URL/breadcrumb structure appear different in Google than on the site. Google shows a 3-4 category silo for the page, but once clicked the page is off root. How could you do this?
Technical SEO | | TicketCity0 -
Single URL not indexed
Hi everyone! Some days ago, I noticed that one of our URLs (http://www.access.de/karriereplanung/webinare) is no longer in the Google index. We never had any form of penalty, link warning, etc. Our traffic from Google is growing steadily every month. This single page does not have an external link pointing to it, only internal links. The page had been indexed all the time. The HTTP status code is 200, and there is no noindex or anything similar in the code. I submitted the URL in GWMT to let Google send it to the index. It was crawled successfully by Google and sent to the index 5 days ago, but nothing happened; it is still not indexed. Do you have any suggestions why this page is no longer indexed? It is well linked internally and one click away from the home page. It still shows a PR of 5, and I always thought that pages with PR are indexed.
Technical SEO | | accessKellyOCG0 -
How long does it take for Google to index a new site and has anyone experienced serious fluctuations in SERP within 2 weeks after launch?
Hi guys, I have recently launched my ecommerce jewellery site, www.luxuryfinejewellery.com, and noticed some serious swings in the SERPs over the last couple of weeks. From ranking No. 2, 3 and 4 for the keyword 'luxury fine jewellery' on Google.com, the homepage periodically disappears from the top 50 altogether. I thought it was the Sandbox, as I purchased the domain name within the last 6 weeks; however, the fact that it does rank on the first page some of the time is a mystery. Has anyone else experienced this? Could you provide some advice on what to expect until the rankings settle? Thanks in advance, Satbir
Technical SEO | | deluxebydesign0 -
What are the considerations in setting language within the URL of multilingual sites?
Is it good practice to use Language-Agnostic + LOCALE=en + LOCALE=fr (as per the example below)? If not, what is the best way to determine language within a URL, and why? For example, today we use: http://www.canadapost.ca/cpo/mc/default.jsf (goes to the language last used by the user), http://www.canadapost.ca/cpo/mc/default.jsf?LOCALE=fr (forces a French-language page), and http://www.canadapost.ca/cpo/mc/default.jsf?LOCALE=en (forces an English-language page). I think you can tell Google about these parameters through Webmaster Tools to help them properly crawl and understand your content, but if we had the opportunity to change it, what should we do?
Technical SEO | | CanadaPost0 -
I am trying to block robots from indexing parts of my site..
I have a few websites that I mocked up for clients to check out my work and get a feel for the style I produce, but I don't want them indexed as they have lorem ipsum placeholder text and are not really optimized. I am in the process of optimizing them, but for the time being I would like to block them. Most of the warnings and errors on my SEOmoz dashboard are from these sites, and I was going to upload the following to the robots.txt file, but I want to make sure this is correct:
User-agent: *
Disallow: /salondemo/
Disallow: /salondemo3/
Disallow: /cafedemo/
Disallow: /portfolio1/
Disallow: /portfolio2/
Disallow: /portfolio3/
Disallow: /salondemo2/
Is this all I need to do? Thanks, Donny
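The rules in the question can be checked locally before uploading. Here is a minimal sketch using Python's standard `urllib.robotparser`; `example.com`, the helper `is_allowed`, and the tested paths are placeholders chosen for illustration. The parser only models robots.txt path matching and says nothing about how a search engine treats pages it has already indexed.

```python
from urllib.robotparser import RobotFileParser

# The rules from the question, one directive per line.
rules = """\
User-agent: *
Disallow: /salondemo/
Disallow: /salondemo3/
Disallow: /cafedemo/
Disallow: /portfolio1/
Disallow: /portfolio2/
Disallow: /portfolio3/
Disallow: /salondemo2/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())


def is_allowed(path, agent="*"):
    """Return True if the given user agent may fetch the path (hypothetical host)."""
    return parser.can_fetch(agent, "https://example.com" + path)


for path in ["/salondemo/index.html", "/salondemo2/page", "/cafedemo/", "/", "/blog/post"]:
    print(path, "allowed" if is_allowed(path) else "blocked")
```

Note that the file has to be named robots.txt and sit at the root of the site for crawlers to find it.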
Technical SEO | | Smurkcreative0 -
Canonical for a non-existent URL?
Hi, I have a website that uses parameter URLs, for example www.example.com/index.php?page_id=1&no=2. I want search engines to see my page URL as www.example.com/toys/cars. But this URL does not exist on my website, and when I enter it directly it goes to a 404 page. If I add www.example.com/toys/cars as the canonical URL on the page www.example.com/index.php?page_id=1&no=2, what happens? Will the URL in the SERP change to www.example.com/toys/cars?
Technical SEO | | SEMTurkey0