Google Indexing Pages with Made Up URL
-
Hi all,
Google is indexing a URL on my site that doesn't exist and has never existed. The URL is completely made up. Does anyone know why this is happening and, more importantly, how to get rid of it?
Thanks
-
Hi Brian
Dan (Moz Associate) here. Bernadette and Excal pretty much nailed it. I just wanted to add that OSE, Search Console and other link tools may not display every single link that exists out there on the web. OSE in particular is the most 'filtered' index: it shows mostly quality, relevant links and filters out most spam.
Regardless, the best course of action is indeed to make sure your broken pages return a proper 404 status code, and Google will handle the rest.
-
I agree with Bernadette that this is most likely a hacker or spammer taking advantage of a configuration issue with your website. If you're using a CMS (WordPress, Joomla, Drupal etc.), make sure that it has been properly configured (or have your website developer do it).
I had a similar instance with a website I inherited a few years back. A configuration issue in the CMS let individuals set themselves up as users, and a blogging extension had an out-of-the-box configuration issue that let anyone create blog posts. Although the blogging tool was set to require admin approval before an article went live and became visible on the site, once the article was created it could still somehow be indexed by Google, which created one hell of a mess.
Fixing the issue in the CMS and the blogging extension was quite simple, but the cleanup took a long while. Over a period of months I had to disavow a continuing stream of junk links, and I spent a lot of time writing to other webmasters to advise them of the issue with their sites so they could remove the links. Nearly 3 years down the line I still see a few of these pop up from time to time, as there are obviously other sites that have not plugged the gap and updated their blogging tool, and so still contain this massive list of dodgy links from link spammers.
If you are using a CMS, I would recommend that you or your webmaster check the list of authorised users and block any that you do not recognise or did not create. Then immediately review your CMS security settings to ensure that all new users require an admin to approve/activate them before they can do anything.
Unfortunately with this stuff, once an exploit is discovered it is quickly disseminated across the internet and every link spammer (and his dog) tends to jump on board, so the quicker you can plug the leak and commence remediation the better. Good luck.
-
Brian, that's definitely an issue. If it's not delivering a 404 error when you go to a non-existent page on your site, that's the problem. I could theoretically go to yourdomain.com/aslksjdltkjlkjalskdj.html, make a link to it, and Google would index the page.
Check with your web developer to see how you can make sure that your 404 error pages (page not found) deliver a 404 status code in the server header.
There are lots of ways that Google will discover new URLs (even someone browsing with Google Chrome might allow Google to discover a new URL and then crawl it). So, you'll want to make sure that you have this fixed on your site.
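To illustrate the kind of fix described above, here is a minimal sketch using only Python's standard library. It is not how any particular CMS or web server does it; the page table `PAGES`, the `/contact-us` path and the names `resolve`, `SiteHandler` and `serve` are all made up for the example. The point is simply that an unknown URL gets a friendly "page not found" body together with a real 404 status code, rather than a 200.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Tuple

# Hypothetical page table; a real site would ask its CMS or router instead.
PAGES = {
    "/": "<h1>Home</h1>",
    "/contact-us": "<h1>Contact us</h1>",
}
NOT_FOUND_HTML = "<h1>Page not found</h1>"


def resolve(path: str) -> Tuple[int, str]:
    """Map a request path to (status code, HTML body)."""
    clean_path = path.split("?", 1)[0]  # ignore any query string
    if clean_path in PAGES:
        return 200, PAGES[clean_path]
    # Unknown URL: friendly page for people, 404 status for crawlers.
    return 404, NOT_FOUND_HTML


class SiteHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        status, body = resolve(self.path)
        payload = body.encode("utf-8")
        self.send_response(status)  # the code a server header checker reports
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port: int = 8000) -> None:
    """Run the sketch locally until interrupted."""
    HTTPServer(("127.0.0.1", port), SiteHandler).serve_forever()
```

Calling `serve()` and then requesting a made-up path such as `/aslksjdltkjlkjalskdj.html` should show the "Page not found" body with a 404 in the response headers. On a real site the equivalent setting usually lives in the web server or CMS configuration, so treat this only as a picture of the desired behaviour.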
-
Hi Bernadette,
Thanks for your response. I checked OSE and Search Console and can't find any links pointing to the URL. I did the server header check and it's delivering a 200 OK response.
-
Brian, when this happens, there is typically one reason: somewhere there is a link with that URL in it. What we've seen before is that oftentimes those links are created by hackers or spammers who then try to create content on your site at that URL. For example, when a site is hacked, the attackers will create a page on your site and then link to it.
Without the URL (or the page name without your domain name), it's tough for me to see what might be causing this. But, there has to be a link somewhere to it in order for Google to want to index it.
What I would do is use a server header check tool (such as http://www.rexswain.com/httpview.html) to see if the page has a "200 OK" server response or a 404 error. Google typically doesn't index pages that deliver 404 errors. It could be that your server is set up to display a "page not found" message but still sends "200 OK" in the server header, so Google indexes the page.
Check your site to see if there is a link to the page. If the link exists, then fix it. Then, look at Majestic.com or Open Site Explorer to see if they show any links from other sites to the page. If those links exist, see if you can get rid of those links.
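The server header check described above can also be scripted. The sketch below is only an illustration using Python's standard library; the function names `random_probe_url`, `fetch_status` and `diagnose` are made up for this example. It builds a URL that almost certainly does not exist on the site, requests it, and reports whether the server answers with a 404 or with a "200 OK".

```python
import urllib.error
import urllib.request
import uuid


def random_probe_url(base_url: str) -> str:
    """Build a URL that almost certainly does not exist on the site."""
    return f"{base_url.rstrip('/')}/{uuid.uuid4().hex}.html"


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the final HTTP status code for a URL (redirects are followed)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises for 4xx/5xx responses; the code is still the answer.
        return error.code


def diagnose(status: int) -> str:
    """Turn the status code of a made-up URL into a short verdict."""
    if status in (404, 410):
        return "ok: missing pages are reported as missing"
    if 200 <= status < 300:
        return "problem: a made-up URL returns success, so it can be indexed"
    return "unclear: check this status code by hand"
```

For example, `diagnose(fetch_status(random_probe_url("https://www.yourdomain.com")))` runs the whole check against a site. Because redirects are followed, a site that redirects every unknown URL to its home page will also show up as a problem here, which is worth confirming by hand.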