Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Fake Links indexing in google
-
Hello everyone,
I have an interesting situation occurring here, and hoping maybe someone here has seen something of this nature or be able to offer some sort of advice.
So, we recently installed a wordpress to a subdomain for our business and have been blogging through it. We added the google webmaster tools meta tag and I've noticed an increase in 404 links. I brought this up to or server admin, and he verified that there were a lot of ip's pinging our server looking for these links that don't exist. We've combed through our server files and nothing seems to be compromised. Today, we noticed that when you do site:ourdomain.com into google the subdomain with wordpress shows hundreds of these fake links, that when you visit them, return a 404 page.
Just curious if anyone has seen anything like this, what it may be, how we can stop it, could it negatively impact us in anyway? Should we even worry about it? Here's the link to the google results.
https://www.google.com/search?q=site%3Amshowells.com&oq=site%3A&aqs=chrome.0.69i59j69i57j69i58.1905j0j1&sourceid=chrome&es_sm=91&ie=UTF-8 (odd links show up on pages 2-3+)
-
Thank you everyone for your responses! The link you sent of the cached pages LynnP was also helpful. As soon as my co-worker who administers the server gets in I'm going to mention to him that we check the subfolders for anything fishy. I know for a fact he looked for subfolders that were suspicious but I'm not sure he may have thought to check the existing folders for sneaky things. Most passwords have been changed... but I will double check.
Again, thanks everyone for your help, very useful!
-
My 2 cents: This does look like a wp hack - been having a nightmare with a recent Pharma hack like JV mentions and honestly I still cannot figure out how exactly they got into the site but suspect through an outdated plugin.
A couple of things to keep in mind are to check your htaccess file for weird lines and have a look for non standard wp files in various folders (things like cache.php or ms-writer.php if I recall right). These files were not showing recent change dates however so it was not as simple as just ftping in and seeing which files had been recently changed (still no idea how they pulled that off). It can also be that all these pages are being spun out of a handful of php files (or the database!) so not 100% the case that you would actually see the subfolders (although in some cases you might). Also seen dev versions of wp on the same server that have not been kept so up to date be used to get into the main production version (pretty sure they were indexed through links sent via gmail emails, thanks google!).
You can check the google cache for any of these pages to see what they looked like and when they were last cached for example: http://webcache.googleusercontent.com/search?q=cache:Y0U-2Yyk3y4J:news.mshowells.com/CI/Ugg-Hazelwood-1437.shtml+
Most of them show late August cache dates so that should help narrow the timeframe. Interesting to note that all pages have a bunch of links at the bottom, some to your site some to other (probably infected) sites. All of the links are now 404s so maybe the hack got taken down by the originator (no idea why just a thought since its a bit odd that all of the links on the external sites also seem to be 404ing now). Needless to say, change all wpadmin, ftp etc passwords to be safe!
-
Hmm...never seen this exactly before - but a few years back we discovered for a client that their reality tv series show (Deadliest Catch) member site had been severely infected by Canadian Pharma phony sites....
Seems the hacker had 'broken' in via a MS update that was not done on their hosting platform site - and it took the tv company almost 4 months to disavow, rebuild and then index and begin to rank again as I remember....i.e. this was NOT a WP issue but a hosting server hack...
But with 20+ pages of Uggs and Nude Men rolling Christians (love that one, eh!) infections, you need to get that totally fixed asap so I'd start with querying the hosting vendor logs...
How comes to mind...if you can not determine where the hack came from - you could kill the subdomain after saving all your articles - recreate it say as "info.mshowells.com" or "advice.mshowells.com" or "counsel.mshowells.com" and reload in the same artices....have had to do that too for another client....
-
Yeah, only 2 of us, server admin guy. We're talking right now and the site is on a brand new VPS that has never been compromised, no strange folder structure, brand new install of Wordpress.. you can see lots of server errors in the error log on the server but the files NEVER existed, and neither of us removed the files. I, personally, do not even have access to the VPS. Only he does, and he is well aware what he's doing and most definitely would have noticed an odd set of folders and would have remembered deleting them. Almost as soon as we made the wordpress install live is when the 404 crawl errors showed up in google, and on the server. We both have seen many instances of wordpress sites being compromised and know what to look for and how to clean it up. This is why this is baffling. Because we're not exactly sure how or in what way they would benefit from this. My server admin thinks these hackers are somehow tricking google somehow... we just both have never seen this and not sure what to expect... very bizarre!
-
That's pretty strange. There isn't another web person there who might have cleaned things up without telling you? Or maybe your server company?
I don't see how these URLs could be indexed if they never existed, so at some point, someone created those pages and they were around long enough to get indexed. Are there any weird spikes in crawl rates or search queries since the launch of the subdomain?
I've seen this kind of hack before. The hacker just drops some folders full of HTML files into the roots. That's why all those links have a two characters sub directory. That was the folder the HTML files were in before someone likely just saw those folders in the root and deleted them. Maybe they didn't realize what they were doing and thought they were just doing the house cleaning?
Doing a "site:mshowells.com/ci/" or "site:mshowells.com/sp/" can show you what I'm talking about.
-
Well, the interesting thing is the links are only showing up on the subdomain news.mshowells.com - which has only existed on the server for maybe 2 - 3 months? Also, when we first noticed them, we checked the server and wordpress and there were no files and nothing was out of order or anything fishy. Everything was and is just fine. We haven't done any cleanup of any sort. And Wordpress & plugins have been kept up to date.
That's why it's weird because at no point were there hacked files or content or anything... so it's a little confusing...
-
Looks like a hack. A hacker somehow got in at some point, dropped a bunch of Ugg Boot affiliate marketing pages and left. Not sure why they are 404ing unless someone already discovered these when they happened and cleaned them up. That could've happened months and months ago.
The 404s shouldn't effect your SEO, but the hack has potential to if it hasn't been cleaned up properly. Do you see a spike in search queries if you look back over the last year or two? That may indicate when the hack occurred and was cleaned up. It's important to know how the hack was cleaned up, so you can ensure that the vulnerabilities have been resolved. If they haven't been, your site is still open to additional attacks, and spam like that can hurt your SEO.
For Wordpress, it's important to keep not only Wordpress itself up to date, but also your plugins (and only use well established plugins, and do a little research on them to make sure people aren't screaming about hacking issues). Hackers search for vulnerabilities in all sorts of places.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Sudden Indexation of "Index of /wp-content/uploads/"
Hi all, I have suddenly noticed a massive jump in indexed pages. After performing a "site:" search, it was revealed that the sudden jump was due to the indexation of many pages beginning with the serp title "Index of /wp-content/uploads/" for many uploaded pieces of content & plugins. This has appeared approximately one month after switching to https. I have also noticed a decline in Bing rankings. Does anyone know what is causing/how to fix this? To be clear, these pages are **not **normal /wp-content/uploads/ but rather "index of" pages, being included in Google. Thank you.
Technical SEO | | Tom3_150 -
How can I get a photo album indexed by Google?
We have a lot of photos on our website. Unfortunately most of them don't seem to be indexed by Google. We run a party website. One of the things we do, is take pictures at events and put them on the site. An event page with a photo album, can have anywhere between 100 and 750 photo's. For each foto's there is a thumbnail on the page. The thumbnails are lazy loaded by showing a placeholder and loading the picture right before it comes onscreen. There is no pagination of infinite scrolling. Thumbnails don't have an alt text. Each thumbnail links to a picture page. This page only shows the base HTML structure (menu, etc), the image and a close button. The image has a src attribute with full size image, a srcset with several sizes for responsive design and an alt text. There is no real textual content on an image page. (Note that when a user clicks on the thumbnail, the large image is loaded using JavaScript and we mimic the page change. I think it doesn't matter, but am unsure.) I'd like that full size images should be indexed by Google and found with Google image search. Thumbnails should not be indexed (or ignored). Unfortunately most pictures aren't found or their thumbnail is shown. Moz is giving telling me that all the picture pages are duplicate content (19,521 issues), as they are all the same with the exception of the image. The page title isn't the same but similar for all images of an album. Example: On the "A day at the park" event page, we have 136 pictures. A site search on "a day at the park" foto, only reveals two photo's of the albums. 3QolbbI.png QTQVxqY.jpg mwEG90S.jpg
Technical SEO | | jasny0 -
How long does Google takes to re-index title tags?
Hi, We have carried out changes in our website title tags. However, when I search for these pages on Google, I still see the old title tags in the search results. Is there any way to speed this process up? Thanks
Technical SEO | | Kilgray0 -
How to check if an individual page is indexed by Google?
So my understanding is that you can use site: [page url without http] to check if a page is indexed by Google, is this 100% reliable though? Just recently Ive worked on a few pages that have not shown up when Ive checked them using site: but they do show up when using info: and also show their cached versions, also the rest of the site and pages above it (the url I was checking was quite deep) are indexed just fine. What does this mean? thank you p.s I do not have WMT or GA access for these sites
Technical SEO | | linklander0 -
My video sitemap is not being index by Google
Dear friends, I have a videos portal. I created a video sitemap.xml and submit in to GWT but after 20 days it has not been indexed. I have verified in bing webmaster as well. All videos are dynamically being fetched from server. My all static pages have been indexed but not videos. Please help me where am I doing the mistake. There are no separate pages for single videos. All the content is dynamically coming from server. Please help me. your answers will be more appreciated................. Thanks
Technical SEO | | docbeans0 -
Redirecting HTTP to HTTPS - How long does it take Google to re-index the site?
hello Moz We know that this year, Moz changed its domain to moz.com from www.seomoz.org
Technical SEO | | joony
however, when you type "site:seomoz.org" you still can find old urls indexed on Google (on page 7 and above) We also changed our site from http://www.example.com to https://www.example.com
And Google is indexing both sites even though we did proper 301 redirection via htaccess. How long would it take Google to refresh the index? We just don't worry about it? Say we redirected our entire site. What is going to happen to those websites that copied and pasted our content? We have already DMCAed their webpages, but making our site https would mean that their website is now more original than our site? Thus, Google assumes that we have copied their site? (Google is very slow on responding to our DMCA complaint) Thank you in advance for your reply.0 -
How to stop google from indexing specific sections of a page?
I'm currently trying to find a way to stop googlebot from indexing specific areas of a page, long ago Yahoo search created this tag class=”robots-nocontent” and I'm trying to see if there is a similar manner for google or if they have adopted the same tag? Any help would be much appreciated.
Technical SEO | | Iamfaramon0 -
How to stop my webmail pages not to be indexed on Google ??
when i did a search in google for Site:mywebsite.com , for a list of pages indexed. Surprisingly the following come up " Webmail - Login " Although this is associated with the domain , this is a completely different server , this the rackspace email server browser interface I am sure that there is nothing on the website that links or points to this.
Technical SEO | | UIPL
So why is Google indexing it ? & how do I get it out of there. I tried in webmaster tool but I could not , as it seems like a sub-domain. Any ideas ? Thanks Naresh Sadasivan0