Question spam malware causing many indexed pages
-
Hey Mozzers,
I was speaking with a friend today about a site that he has been working on that was infected when he began working on it. Here (https://www.google.ca/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=site:themeliorist.ca) you can see that the site has 4400 indexed pages, but if you scroll down you will see some pages such as /pfizer-viagra-samples/ or /dapoxetine-kentucky/. All of these pages are returning 404 errors, and I ran it through SEO spider just to see if any of these pages would show up, and they don't.
This is not an issue for a client, but I am just curious why these pages are still hanging around in the index. Maybe others have experience this issue too.
Cheers,
-
Hey
You can use the URL removal tool to expedite this and it is one of the few times that Google actually recommends you do so: https://support.google.com/webmasters/answer/1269119?hl=en. Likewise, the URLs will eventually fall away but it can take some time.
The most important thing here is to ensure the site is 100% protected going forwards and does not get reinfected. The pharma hack often has three backdoors in WP itself, a plugin and often the database. These can go for months without being called and suddenly, the site is reinfected again and they are getting better all the time at making this harder and harder to clean up.
This is worth a read:
http://blog.sucuri.net/2010/07/understanding-and-cleaning-the-pharma-hack-on-wordpress.htmlWe often also see a second degree payload with some black hat SEO and outbound links on the site so even when you get rid of the main problem, you may have a few small residual problems. I would suggest an SEO audit, some pro active security and at very least review the outbound links from the site to make sure they are all legit (Screaming Frog is your friend here and it will show external links + linking page).
Hope that helps!
Marcus -
Sometimes search engines do not update as frequently as we would like them to. Since you've already verified that these pages no longer exist on your site, I would also suggest that you actively try to have them removed using your Google Webmaster Tools account.
Google source: https://support.google.com/webmasters/answer/1663419?hl=en
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Home page suddenly dropped from index!!
A client's home page, which has always done very well, has just dropped out of Google's index overnight!
Intermediate & Advanced SEO | | Caro-O
Webmaster tools does not show any problem. The page doesn't even show up if we Google the company name. The Robot.txt contains: Default Flywheel robots file User-agent: * Disallow: /calendar/action:posterboard/
Disallow: /events/action~posterboard/ The only unusual thing I'm aware of is some A/B testing of the page done with 'Optimizely' - it redirects visitors to a test page, but it's not a 'real' redirect in that redirect checker tools still see the page as a 200. Also, other pages that are being tested this way are not having the same problem. Other recent activity over the last few weeks/months includes linking to the page from some of our blog posts using the page topic as anchor text. Any thoughts would be appreciated.
Caro0 -
Google indexing only 1 page out of 2 similar pages made for different cities
We have created two category pages, in which we are showing products which could be delivered in separate cities. Both pages are related to cake delivery in that city. But out of these two category pages only 1 got indexed in google and other has not. Its been around 1 month but still only Bangalore category page got indexed. We have submitted sitemap and google is not giving any crawl error. We have also submitted for indexing from "Fetch as google" option in webmasters. www.winni.in/c/4/cakes (Indexed - Bangalore page - http://www.winni.in/sitemap/sitemap_blr_cakes.xml) 2. http://www.winni.in/hyderabad/cakes/c/4 (Not indexed - Hyderabad page - http://www.winni.in/sitemap/sitemap_hyd_cakes.xml) I tried searching for "hyderabad site:www.winni.in" in google but there also http://www.winni.in/hyderabad/cakes/c/4 this link is not coming, instead of this only www.winni.in/c/4/cakes is coming. Can anyone please let me know what could be the possible issue with this?
Intermediate & Advanced SEO | | abhihan0 -
Is 301 redirecting your index page to the root '/' safe to do or do you end up in an endless loop?
Hi I need to tidy up my home page a little, I have some links to our index.html page but I just want them to go to the root '/' so I thought I could 301 redirect it. However is this safe to do? I'm getting duplicate page notifications in my analytic reportings tools about the home page and need a quick way to fix this issue. Many thanks in advance David
Intermediate & Advanced SEO | | David-E-Carey0 -
My indexed pages count is shrinking in webmaster tools. Is this normal ?
I noticed that our total # of indexed pages dropped recently by a substantial amount (see chart below) Is this normal? http://imgur.com/4GWzkph Also, 3 weeks after this started dropping, we got a message on increased # of crawl errors and found that a site update was causing 300+ new 404s. could this be related ?
Intermediate & Advanced SEO | | znotes0 -
I have removed over 2000+ pages but Google still says i have 3000+ pages indexed
Good Afternoon, I run a office equipment website called top4office.co.uk. My predecessor decided that he would make an exact copy of the content on our existing site top4office.com and place it on the top4office.co.uk domain which included over 2k of thin pages. Since coming in i have hired a copywriter who has rewritten all the important content and I have removed over 2k pages of thin pages. I have set up 301's and blocked the thin pages using robots.txt and then used Google's removal tool to remove the pages from the index which was successfully done. But, although they were removed and can now longer be found in Google, when i use site:top4office.co.uk i still have over 3k of indexed pages (Originally i had 3700). Does anyone have any ideas why this is happening and more importantly how i can fix it? Our ranking on this site is woeful in comparison to what it was in 2011. I have a deadline and was wondering how quickly, in your opinion, do you think all these changes will impact my SERPs rankings? Look forward to your responses!
Intermediate & Advanced SEO | | apogeecorp0 -
Duplicate Content From Indexing of non- File Extension Page
Google somehow has indexed a page of mine without the .html extension. so they indexed www.samplepage.com/page, so I am showing duplicate content because Google also see's www.samplepage.com/page.html How can I force google or bing or whoever to only index and see the page including the .html extension? I know people are saying not to use the file extension on pages, but I want to, so please anybody...HELP!!!
Intermediate & Advanced SEO | | WebbyNabler0 -
"Too many links" - PageRank question
This question seems to come up a lot. 70 flat page site. For ease of navigation, I want to link every page to one-another. Pure CSS Dropdown menu with categories - each expanding to each of the subpage. Made, implemented, remade smartphone friendly. Hurray. I thought this was an SEO principle - ensuring good site navigation and good internal linking. Not forcing your users to hit "back". Not forcing your users to jump through hoops. But unless I've misread http://www.seomoz.org/blog/how-many-links-is-too-many then this is something that's indirectly penalised by Google because a site with 70 links from its homepage only lets each sub-page inherit 1/80th of its PageRank. Good site navigation vs your subpages are invisible on Google.
Intermediate & Advanced SEO | | JamesFx0 -
To index or not to index search pages - (Panda related)
Hi Mozzers I have a WordPress site with Relevanssi the search engine plugin, free version. Questions: Should I let Google index my site's SERPS? I am scared the page quality is to thin, and then Panda bear will get angry. This plugin (or my previous search engine plugin) created many of these "no-results" uris: /?s=no-results%3Ano-results%3Ano-results%3Ano-results%3Ano-results%3Ano-results%3Ano-results%3Akids+wall&cat=no-results&pg=6 I have added a robots.txt rule to disallow these pages and did a GWT URL removal request. But links to these pages are still being displayed in Google's SERPS under "repeat the search with the omitted results included" results. So will this affect me negatively or are these results harmless? What exactly is an omitted result? As I understand it is that Google found a link to a page they but can't display it because I block GoogleBot. Thanx in advance guys.
Intermediate & Advanced SEO | | ClassifiedsKing0