URL Index Removal for Hacked Website - Will this help?
-
My main question is: How do we remove URLs (links) from Google's index and the 1000s of created 404 errors associated with them after a website was hacked (and now fixed)?
The story: A customer came to us for a new website and some SEO. They had an existing website that had been hacked and their previous vendor was non-responsive to address the issue for months. This created THOUSANDS of URLs on their website that were then linked to pornographic and prescription med SPAM sites. Now, Google has 1,205 pages indexed that create 404 errors on the new site. I am confident these links are causing Google to not rank well organically.
Additional information:
- Entirely new website
- Wordpress site
- New host
Should we be using the "Remove URLs" tool from Google to submit all 1205 of these pages? Do you think it will make a difference? This is down from the 22,500 URLs that existed when we started a few months back. Thank you in advance for any tips or suggestions!
-
Yes.
Disavow needed for each site (http/https).
-
Thanks for clearing this out.
If i have spammy links on http version, but my site is now https, i should upload the same disavow list on both http and https? (i saw one answer of yours in other thread saying just that , and i think is important because many of us are missing this detail) -
If they are not your - it's better to disavow them. If they are spammy - disavow them.
Those links may hurt your ranking.
-
Hi Pete, something in your answer got my attention.
Like one month ago , i saw some (as was proven later) spammy links pointing to one specific page of my site. Those links ( from 20+ domains) were coming from some german domain names with the ltd .xyz extension.
Now the links don't actually exists, but those referring pages saying 410 Gone (nginx server).
Is that bad for that spesific page of mine?
I never saw in past this http status. -
If your "bad" link is like http://OURDOMAIN/flibzy/foto-bugil-di-kelas.html then your .htaccess should be:
Redirect 410 /flibzy/foto-bugil-di-kelas.html
that's all.Yes - you should do this for ALL 1205 URLs. Don't do this on legal pages (before hacking), just on hacked pages. I say "gone" with 410 redirect. It's amazing. In your case gone for good. Time for identify that 1205 URLs and paste them into .htaccess is let's say X hours. Time for identify that 1205 URLs and temporary remove them is Y hours. Since "temporary removal" is up to 30 days this make same job each month. In total for one year you have X in first case and 12*Y in second case. You can see difference, right?
Also today Barry Adams release story about hacking:
http://www.stateofdigital.com/website-hacked-manual-penalty-google/
and it's amazing that site was hacked just for 4 hours but Google notice this. You can see there traffic drop and removal from SERP. Ok, i'm not trying to "fear sells", but keeping bad pages with 404 will take long time. In Jan-Feb 2012 i have new temporary site on mine site within /us/ folder and even today Jan 2016 i still receiving bots crawling this folder. That's why i nuke it with 410. This save the day!On your case it's same. Bot is wasting time and resources to crawl 404 pages over and over but crawling less your important pages. That's why it's good to nuke them. ONLY them. This will save bot crawling budget on your website. So bot can focus on your pages.
-
Hi Peter,
Thank you for your response! I saw you answered a similar question about a week ago, so thank you for weighing in on my options. So, to clarify, I must do this for all 1,205 of the URLs?
One SPAM link is pointing here: http://OURDOMAIN/flibzy/foto-bugil-di-kelas.html so in your above example, this would look like:
Redirect 410 /dir/http://OURDOMAIN/flibzy/foto-bugil-di-kelas.html/ (?) and do this for each page that Google has indexed?
I saw your example with the iphone on the other post. How did you get that page to say, GONE - The requested resource...
-
The best is to keep them 404. But fast is to 410 them.
All you need is to place this topmost somewhere of .htaccess:
Redirect 410 /dir/url1/
Redirect 410 /dir/url2/
Redirect 410 /dir1/url3/
Redirect 410 /dir1/url4/But this won't help you if your URLs have parameters somewhere like index.php?spamword1-blah-blah. For this you need extended version like this:
RewriteEngine on
#RewriteBase /
RewriteCond %{QUERY_STRING} spamword
RewriteRule ^(.)$ /404.html? [R=410,L]
RewriteCond %{QUERY_STRING} spamword1
RewriteRule ^(.)$ /404.html? [R=410,L]
RewriteCond %{QUERY_STRING} spamword2
RewriteRule ^(.*)$ /404.html? [R=410,L]So why 410? 410 act much faster than 404 but it's DANGEROUS! If you sent 410 to normal URL this is effective nuking it. I found that with 410 bot visit this url 1-2-3 times, but with 404 bot keep visiting over and over eating your crawling budget. URL removal in SearchConsole is OK, but it's fast but works only for 30 days. And will eat almost same time as building list for 404/410s. Hint: You can speedup crawling if you do "fetch and render" then submit to index.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Google index
Hello, I removed my site from google index From GWT Temporarily remove URLs that you own from search results, Status Removed. site not ranking well in google from last 2 month, Now i have question that what will happen if i reinclude site url after 1 or 2 weeks. Is there any chance to rank well when google re index the site?
Intermediate & Advanced SEO | | Getmp3songspk0 -
How to switch from URL based navigation to Ajax, 1000's of URLs gone
Hi everyone, We have thousands of urls generated by numerous products filters on our ecommerce site, eg./category1/category11/brand/color-red/size-xl+xxl/price-cheap/in-stock/. We are thinking of moving these filters to ajax in order to offer a better user experience and get rid of these useless urls. In your opinion, what is the best way to deal with this huge move ? leave the existing URLs respond as before : as they will disappear from our sitemap (they won't be linked anymore), I imagine robots will someday consider them as obsolete ? redirect permanent (301) to the closest existing url mark them as gone (4xx) I'd vote for option 2. Bots will suddenly see thousands of 301, but this is reflecting what is really happening, right ? Do you think this could result in some penalty ? Thank you very much for your help. Jeremy
Intermediate & Advanced SEO | | JeremyICC0 -
Do I need to re-index the page after editing URL?
Hi, I had to edit some of the URLs. But, google is still showing my old URL in search results for certain keywords, which ofc get 404. By crawling with ScremingFrog it gets me 301 'page not found' and still giving old URLs. Why is that? And do I need to re-index pages with new URLs? Is 'fetch as Google' enough to do that or any other advice? Thanks a lot, hope the topic will help to someone else too. Dusan
Intermediate & Advanced SEO | | Chemometec0 -
Google Webmaster Remove URL Tool
Hi All, To keep this example simple.
Intermediate & Advanced SEO | | Mark_Ch
You have a home page. The home page links to 4 pages (P1, P2, P3, P4). ** Home page**
P1 P2 P3 P4 You now use Google Webmaster removal tool to remove P4 webpage and cache instance. 24 hours later you check and see P4 has completely disappeared. You now remove the link from the home page pointing to P4. My Question
Does Google now see only pages P1, P2 & P3 and therefore allocate link juice at a rate of 33.33% each. Regards Mark0 -
Moving career site to new URL from main site. Will it hurt SEO for main page?
For one of our clients we are building a career site and putting it under a different URL and hosting service (mainly due to security concerns of hosting it under the same host and domain). almost 100% of the incoming traffic to their current career section (which it is in a sub-folder) receives traffic for branded keywords (brand + job/career/employment), that is, there are no job position specific keywords. The client is now worried that after moving the site, the inbound traffic to the main site will be severely affected as well as the SERP results. My questions are, will the non-career related SERPs be affected? I don't see how will they be but I could be wrong If no, how could we reassure her that the SEO to the main site wont be affected? are there any case studies of a similar case (splitting part of the website under a new URL and hosting service?) Thank you for your help. PS: this is my first post so please forgive me if this has been asked before. I could not find a good response.
Intermediate & Advanced SEO | | rflores0 -
Doing SEO for a Classifieds Website and need help
Hey All! I'm going to be doing SEO for a classifieds website (cars only). But I'm struggling to dream up an effective strategy. They have a reasonably popular blog that I could definitely use to publish content for link bait, and some opportunities for guest blogging. But in terms of straight up link building (think craigslist), I'm really at a loss as to where to start. If anyone has any experience working with Classifieds websites or can recommend any strategies that you think will work I'm all ears! Really appreciate any help that anyone has to offer! Cheers 🙂
Intermediate & Advanced SEO | | CD_20160 -
Incorrect cached page indexing in Google while correct page indexes intermittently
Hi, we are a South African insurance company. We have a page http://www.miway.co.za/midrivestyle which has a 301 redirect to http://www.miway.co.za/car-insurance. Problem is that the former page is ranking in the index rather than the latter. The latter page does index occasionally in the same position, but rarely. This is primarily for search phrases like "car insurance" and "car insurance quotes". The ranking was knocked down the index with Penquin 2.0. It was not ranking at all but we have managed to recover to 12/13. This abnormally has only been occurring since the recovery. The correct page does index for other search terms like "insurance for car". Your help would be appreciated, thanks!
Intermediate & Advanced SEO | | miway0 -
Content not indexed
How come i google content that resides on my website and on my homepage and my site doesn't come up? I know the content is unique i wrote that. I have a feeling i have some kind of a crawling issue but cannot determine what it is. I ran the crawling test and other tools and didn't find anything. Google shows me that pages are indexed but yet its weird try googling snippets of content and you'll see my site isnt anywhere. Have you experienced that before? First i thought it was penalized but i submitted the reconsideration request and it came back clear, No manual spam action found. And i did not get any message in my GWMT either. Any thoughts?
Intermediate & Advanced SEO | | CMTM0