Google Indexing Pages with Made Up URL
-
Hi all,
Google is indexing a URL on my site that doesn't exist, and never existed in the past. The URL is completely made up. Anyone know why this is happening and more importantly how to get rid of it.
Thanks
-
Hi Brian
Dan (Moz Associate) here. Bernadette and Excal pretty much nailed it. Just wanted to add that OSE, Search Console and other links tools may not always display every single link that exists out there on the web (especially OSE - OSE is the most 'filtered' index, showing mostly quality/relevant links and filtering out the most spam etc).
Regardless, the best course of action is indeed to be sure your broken pages return a proper 404 status code, and Google will handle the rest
-
Agree with Bernadette that this is most likely a hacker / spammer taking advantage of a configuration issue with your website. If you're using a CMS (Wordpress/Joomla/Drupal etc.) make sure that it has been properly configured (or have your website developer do it).
I had a similar instance with a website I inherited a few years back where there was a configuration issue on the CMS that enabled individuals to set themselves up as users and a blogging extension, which had an out of the box configuration issue enabling anyone to create blog posts. Whilst the blogging tool was set to require admin approval to make the article live and visible on the site, once the article was created, it was still somehow able to be indexed by Google which created one hell of a mess.
Fixing the issue in the CMS/Blogging extension was quite simple but the cleanup took a long while and over a period of months I had to disavow a continuing stream of junk links and spent a lot of time writing to other webmasters advising them of the issue with their site so they could remove. Nearly 3 years down the line I still get a few of these pop up from time to time, as there are obviously other sites that have not plugged the gap and updated their blogging tool and as such contain this massive list of dodgy links from link spammers.
If you are using a CMS I would recommend that you, or your webmaster, check the list of authorised users and, if there are any that you do not recognise or you did not create then block them; and immediately take a look at your CMS security settings to ensure that all new users require Admins to approve/activate them before they can do anything.
Unfortunately with this stuff, once the exploits are discovered it is quickly disseminated across the internet and every link spammer (and his dog) tend to jump on-board, so the quicker you can plug the leak and commence remediation the better. Good luck
-
Brian, that's definitely an issue. If it's not delivering a 404 error when you go to a non-existent page on your site, that's the problem. I could theoretically go to yourdomain.com/aslksjdltkjlkjalskdj.html, make a link to it, and Google would index the page.
Check with your web developer to see how you can make sure that 404 error pages (page not found) delivers a 404 error in the server header.
There are lots of ways that Google will discover new URLs (even someone browsing with Google Chrome might allow Google to discover a new URL and then crawl it). So, you'll want to make sure that you have this fixed on your site.
-
Hi Bernadette,
Thanks for your response. I checked OSE and Search Console and can't find any links pointing to the URL. I did the server header check and it's delivering a 200 OK response.
-
Brian, when this happens, there is typically one reason: somewhere there is a link with that URL in it. What we've seen before is that oftentimes those links are created by hackers or spammers that then try to create content on your site with that URL. For example, when a site is hacked, they will create a page on your site and then link to it.
Without the URL (or the page name without your domain name), it's tough for me to see what might be causing this. But, there has to be a link somewhere to it in order for Google to want to index it.
What I would do is use a server header check tool (such as http://www.rexswain.com/httpview.html) to see if the page has a "200 OK" server response or a 404 error. Google typically doesn't index pages that deliver 404 errors. It could be that the server is set up to deliver a "page not found" on your site but it comes up with a "200 OK" in the server header, so Google indexes the page.
Check your site to see if there is a link to the page. If the link exists, then fix it. Then, look at Majestic.com or Open Site Explorer to see if they show any links from other sites to the page. If those links exist, see if you can get rid of those links.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Keywords are indexed on the home page
Hello everyone, For one of our websites, we have optimized for many keywords. However, it seems that every keyword is indexed on the home page, and thus not ranked properly. This occurs only on one of our many websites. I am wondering if anyone knows the cause of this issue, and how to solve it. Thank you.
Technical SEO | | Ginovdw1 -
Pages not indexed
Hey everyone Despite doing the necessary checks, we have this problem that only a part of the sitemap is indexed.
Technical SEO | | conversal
We don't understand why this indexation doesn't want to take place. The major problem is that only a part of the sitemap is indexed. For a client we have several projects on the website with several subpages, but only a few of these subpages are indexed. Each project has 5 to 6 subpages. They all should be indexed. Project: https://www.brody.be/nl/nieuwbouwprojecten/nieuwbouw-eeklo/te-koop-eeklo/ Mainly subelements of the page are indexed: https://www.google.be/search?source=hp&ei=gZT1Wv2ANouX6ASC5K-4Bw&q=site%3Abrody.be%2Fnl%2Fnieuwbouwprojecten%2Fnieuwbouw-eeklo%2F&oq=site%3Abrody.be%2Fnl%2Fnieuwbouwprojecten%2Fnieuwbouw-eeklo%2F&gs_l=psy-ab.3...30.11088.0.11726.16.13.1.0.0.0.170.1112.8j3.11.0....0...1c.1.64.psy-ab..4.6.693.0..0j0i131k1.0.p6DjqM3iJY0 Do you have any idea what is going wrong here?
Thanks for your advice! Frederik
Digital marketeer at Conversal0 -
Google indexing .com and .co.uk site
Hi, I am working on a site that is experiencing indexation problems: To give you an idea, the website should be www.example.com however, Google seems to index www.example.co.uk as well. It doesn’t seem to honour the 301 redirect that is on the co.uk site. This is causing quite a few reporting and tracking issues. This happened the first time in November 2016 and there was an issue identified in the DDOS protection which meant we would have to point www.example.co.uk to the same DNS as www.example.com. This was implemented and made no difference. I cleaned up the htaccess file and this made no difference either. In June 2017, Google finally indexed the correct URL, but I can’t be sure what changed it. I have now migrated the site onto https and www.example.co.uk has been reindexed in Google alongside www.example.com I have been advised that the http needs to be removed from DDOS which is in motion I have also redirected http://www.example.co.uk straight to https://www.example.com to prevent chain redirects I can’t block the site via robot.txt unless I take the redirects off which could mean that I lose my rankings. I should also mention that I haven't actually lost any rankings, it's just replaced some URLs with co.uk and others have remained the same. Could you please advise what further steps I should take to ensure the correct URL’s are indexed in Google?
Technical SEO | | Niki_10 -
Google Page speed
I get the following advice from Google page speed: Suggestions for this page The following resources have identical contents, but are served from different URLs. Serve these resources from a consistent URL to save 1 request(s) and 77.1KiB. http://www.irishnews.com/ http://www.irishnews.com/index.aspx I'm not sure how to fix this the default page is http://www.irishnews.com/index.aspx, anybody know what need to be done please advise. thanks
Technical SEO | | Liammcmullen0 -
What is the best URL designed for a product page?
Should a product page URL include the category name and subcategory name in it? Most ecommerce platforms it seems are designed to do have the category and sub-category names included in the URL followed by the product name. If that is the case and the same product is listed in more then 1 category and sub-category then will that product have 2 unique urls and as a result be treated as 2 different product pages by google? And then since it is the same product in two places on the site won't google treat those 2 pages as having duplicate content? SO is it best to not have the category and sub-category names in the URL of a product page? And lastly, is there a preferred character limit for a URL to be less than in size? Thanks!
Technical SEO | | gallreddy0 -
Site being indexed by Google before it has launched
We are currently coming towards the end of a site migration, and are at the final stage of testing redirects etc. However, to our horror we've just discovered Google has started indexing the new site. Any ideas on how this could have happened? I have most recently asked for robots.txt to exclude anything with a certain parameter in URL. Is there a chance this, wrongly implemented, could have caused this?
Technical SEO | | Sayers0 -
Do pages that are in Googles supplemental index pass link juice?
I was just wondering if a page has been booted into the supplemental index for being a duplicate for example (or for any other reason), does this page pass link juice or not?
Technical SEO | | FishEyeSEO0 -
Page not being indexed
Hi all, On our site we have a lot of bookmaker reviews, and we are ranking pretty good for most bookmaker names as keywords, however a single bookmaker seems to have been shunned by Google. For a search "betsafe" in Denmark, this page does not appear among the top 50: http://www.betxpert.com/bookmakere/betsafe All of our other review pages rank in top 10-20 for the bookmaker name as keyword. What to do if Google has "banned" a page? Best regards, Rasmus
Technical SEO | | rasmusbang0