URL Index Removal for Hacked Website - Will this help?
-
My main question is: after a hacked website has been fixed, how do we remove the hacked URLs (links) from Google's index, along with the thousands of 404 errors associated with them?
The story: A customer came to us for a new website and some SEO. Their existing website had been hacked, and their previous vendor did not respond to the issue for months. The hack created THOUSANDS of URLs on their website that were then linked to pornographic and prescription-med SPAM sites. Google now has 1,205 of these pages indexed, and they all produce 404 errors on the new site. I am confident these URLs are keeping the site from ranking well organically.
Additional information:
- Entirely new website
- WordPress site
- New host
Should we be using the "Remove URLs" tool from Google to submit all 1205 of these pages? Do you think it will make a difference? This is down from the 22,500 URLs that existed when we started a few months back. Thank you in advance for any tips or suggestions!
-
Yes.
A disavow file is needed for each version of the site (http/https).
-
Thanks for clearing this up.
If I have spammy links pointing to the http version, but my site is now on https, should I upload the same disavow list for both http and https? (I saw one of your answers in another thread saying just that, and I think it is important because many of us miss this detail.)
-
If they are not yours, it's better to disavow them. If they are spammy, disavow them.
Those links may hurt your ranking.
-
Hi Pete, something in your answer got my attention.
About a month ago, I saw some links (later proven to be spammy) pointing to one specific page of my site. Those links (from 20+ domains) were coming from German domain names with the .xyz TLD.
Now the links no longer exist, and the referring pages return 410 Gone (nginx server).
Is that bad for that specific page of mine?
I have never seen this HTTP status before.
-
If your "bad" link is like http://OURDOMAIN/flibzy/foto-bugil-di-kelas.html then your .htaccess should be:
Redirect 410 /flibzy/foto-bugil-di-kelas.html
That's all.
Yes, you should do this for ALL 1205 URLs. Don't do this on legitimate pages (the ones that existed before the hack), only on the hacked pages. A 410 tells crawlers the page is "gone", and it works amazingly well. In your case, gone for good.
Say the time to identify those 1205 URLs and paste them into .htaccess is X hours, and the time to identify them and temporarily remove them is Y hours. Since a "temporary removal" lasts up to 30 days, you have to repeat the same job each month. Over one year that totals X in the first case and 12*Y in the second. You can see the difference, right?
Also, today Barry Adams released a story about a hacked site:
http://www.stateofdigital.com/website-hacked-manual-penalty-google/
It's amazing that the site was hacked for just 4 hours, yet Google noticed. You can see the traffic drop and the removal from the SERPs there. OK, I'm not trying to sell with fear, but leaving the bad pages as 404s will take a long time. In Jan-Feb 2012 I had a temporary site in the /us/ folder of my own site, and even today, in Jan 2016, I still see bots crawling that folder. That's why I nuked it with 410. It saved the day!
Your case is the same. The bot wastes time and resources crawling the 404 pages over and over, and crawls your important pages less. That's why it's good to nuke them. ONLY them. This saves the bot's crawl budget on your website, so it can focus on your real pages.
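If pasting 1205 lines by hand sounds painful, a small script can build the Redirect 410 lines from a list of hacked URLs. This is only a rough sketch in Python, not something from the original setup: the function name redirect_410_lines is made up for illustration, and it assumes the URLs are simple paths without spaces. It deliberately skips the site root and any URL with a query string, because Redirect matches on the path only and a line like "Redirect 410 /index.php" would nuke a legitimate page.

```python
from urllib.parse import urlsplit


def redirect_410_lines(urls):
    """Build mod_alias 'Redirect 410' lines from full URLs or bare paths.

    Hypothetical helper for illustration only. Review the output by hand
    before pasting it into .htaccess.
    """
    lines = []
    seen = set()
    for raw in urls:
        raw = raw.strip()
        if not raw:
            continue
        parts = urlsplit(raw)
        # URLs with a query string need the RewriteCond approach instead:
        # Redirect ignores the query and would match the bare path.
        if parts.query:
            continue
        path = parts.path
        # Never emit a rule for the site root: that would 410 everything.
        if not path or path == "/":
            continue
        if path in seen:
            continue
        seen.add(path)
        lines.append(f"Redirect 410 {path}")
    return lines


if __name__ == "__main__":
    hacked = [
        "http://OURDOMAIN/flibzy/foto-bugil-di-kelas.html",
        "/dir/url1/",
    ]
    print("\n".join(redirect_410_lines(hacked)))
```

Keep in mind that Redirect matches by path prefix, so a rule for a folder also covers everything below it. Check that none of the generated paths is a prefix of a page you want to keep.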
-
Hi Peter,
Thank you for your response! I saw you answered a similar question about a week ago, so thank you for weighing in on my options. So, to clarify, I must do this for all 1,205 of the URLs?
One SPAM link is pointing here: http://OURDOMAIN/flibzy/foto-bugil-di-kelas.html so in your above example, this would look like:
Redirect 410 /dir/http://OURDOMAIN/flibzy/foto-bugil-di-kelas.html/ (?) and do this for each page that Google has indexed?
I saw your example with the iPhone in the other post. How did you get that page to say, GONE - The requested resource...
-
The best is to keep them as 404s, but the fast way is to 410 them.
All you need is to place this somewhere near the top of your .htaccess:
Redirect 410 /dir/url1/
Redirect 410 /dir/url2/
Redirect 410 /dir1/url3/
Redirect 410 /dir1/url4/
But this won't help you if your URLs have parameters somewhere, like index.php?spamword1-blah-blah. For those you need an extended version like this:
RewriteEngine on
#RewriteBase /
RewriteCond %{QUERY_STRING} spamword
RewriteRule ^(.*)$ /404.html? [R=410,L]
RewriteCond %{QUERY_STRING} spamword1
RewriteRule ^(.*)$ /404.html? [R=410,L]
RewriteCond %{QUERY_STRING} spamword2
RewriteRule ^(.*)$ /404.html? [R=410,L]
So why 410? A 410 acts much faster than a 404, but it's DANGEROUS! If you send a 410 for a normal URL, you are effectively nuking it. I found that with a 410 the bot visits the URL 1-3 times, but with a 404 the bot keeps visiting over and over, eating your crawl budget. URL removal in Search Console is OK and fast, but it works only for 30 days, and it takes almost as much time as building the list for 404/410s. Hint: you can speed up crawling if you do "fetch and render" and then submit to index.
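If you have many spam words, the RewriteCond/RewriteRule pairs above can also be generated instead of typed. Again, this is just a sketch in Python under my own assumptions: the function name query_string_410_rules is hypothetical, and it only accepts plain words (letters, digits, "_" and "-"), because anything else would need regex escaping that you should write by hand. The rule text is copied from the example above.

```python
import re

# Only plain words are accepted; anything with spaces or regex
# metacharacters is skipped and should be handled by hand.
SAFE_WORD = re.compile(r"[A-Za-z0-9_-]+")


def query_string_410_rules(spam_words):
    """Build one RewriteCond/RewriteRule pair per spam word.

    Hypothetical helper for illustration only. It mirrors the
    hand-written rules above and returns the lines for .htaccess.
    """
    lines = ["RewriteEngine on"]
    for word in spam_words:
        word = word.strip()
        if not SAFE_WORD.fullmatch(word):
            continue
        # Match the word anywhere in the query string, then answer 410.
        lines.append(f"RewriteCond %{{QUERY_STRING}} {word}")
        lines.append("RewriteRule ^(.*)$ /404.html? [R=410,L]")
    return lines


if __name__ == "__main__":
    print("\n".join(query_string_410_rules(["spamword1", "spamword2"])))
```

Each condition matches the word anywhere in the query string, so choose words that never appear in the parameters of your real pages. Otherwise you will 410 those too.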