Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
URL Injection Hack - What to do with spammy URLs that keep appearing in Google's index?
-
A website was hacked (URL injection) but the malicious code has been cleaned up and removed from all pages. However, whenever we run a site:domain.com in Google, we keep finding more spammy URLs from the hack. They all lead to a 404 error page since the hack was cleaned up in the code. We have been using the Google WMT Remove URLs tool to have these spammy URLs removed from Google's index but new URLs keep appearing every day. We looked at the cache dates on these URLs and they are vary in dates but none are recent and most are from a month ago when the initial hack occurred.
My question is...should we continue to check the index every day and keep submitting these URLs to be removed manually? Or since they all lead to a 404 page will Google eventually remove these spammy URLs from the index automatically?
Thanks in advance Moz community for your feedback.
-
If the urls follow any particular pattern then you can use a htaccess redirect and return the header code 410 / 403 / 404 to Google. (I suggest 410) They will soon drop out of the index.
I don't know the exact .htaccess syntax off the top of my head but it will be something like this:
If they all come from the same folder then it would look something like this:
RedirectMatch 410 ^/folder/.*$If they have a common character string after the forward slash (such as xyz) then it would look something like this:
RedirectMatch 410 ^/xyz.*$If they have any common character string footprints at all (such as xyz) then it would look something like this (now I'm guessing):
RedirectMatch 410 ^/()xyz.$This would be a pretty easy fix if all of those spammy urls have any common characters after the forward slash or they all originate from a certain folder.
-
You might get a little quicker removal if you send them with a 410 status code. That will let Google know that the page is gone for good. http://searchenginewatch.com/sew/how-to/2340728/matt-cutts-on-how-google-handles-404-410-status-codes
-
No problem at all! These new URLs do not actually exist on the website. Since we cleaned up the malicious code all of these URLs redirect to our 404 page.
-
Sorry to misunderstand the problem. Do those new urls actually exist on your site or just in search?
-
Hi 94501,
Thanks for taking the time to respond. Just to be clear, we are not submitting multiple removals for the same URL and I don't think Google WMT even allows you to do that. Completely new URLs are appearing each day after removing the older ones.
My main concern is having spammy URLs indexed and associated with my website and the negative effects it can have from an SEO perspective.
-
Hi Pete,
It sounds like you've done what you can. I wouldn't submit multiple removals for the same url.
I assume it's out of your site map and you're not still being hacked and have figured out how it happened and taken steps to fix it.
Google will eventually figure it out. I'd try to move on to new stuff.
Best... Mike
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Displaying Vanity URL in Google Search Result
Hi Moz! Not sure if this has been asked before, but is there any way to tell Google to display a vanity URL (that has been 301d) instead of the actual URL in the SERP? Example: www.domainA.com is a vanity URL (bought specifically for Brand Identity reasons) that redirects to www.domainB.com. Is it possible to have the domainA Url show up in Google for a Branded search query? Thanks in advance! Arjun
Intermediate & Advanced SEO | | Lauriedechaseaux0 -
How does educational organization schema interact with Google's knowledge graph?
Hi there! I was just wondering if the granular options of the Organization schema, like Educational Organization (http://schema.org/EducationalOrganization) and CollegeOrUniversity (http://schema.org/CollegeOrUniversity) schema work the same when it comes to pulling data into the knowledge graph. I've typically always used the Organization schema for customers but was wondering if there are any drawbacks for going deep into the hierarchy of schema. Cheers 😄
Intermediate & Advanced SEO | | Corbec8880 -
What's the best possible URL structure for a local search engine?
Hi Mozzers, I'm working at AskMe.com which is a local search engine in India i.e if you're standing somewhere & looking for the pizza joints nearby, we pick your current location and share the list of pizza outlets nearby along with ratings, reviews etc. about these outlets. Right now, our URL structure looks like www.askme.com/delhi/pizza-outlets for the city specific category pages (here, "Delhi" is the city name and "Pizza Outlets" is the category) and www.askme.com/delhi/pizza-outlets/in/saket for a category page in a particular area (here "Saket") in a city. The URL looks a little different if you're searching for something which is not a category (or not mapped to a category, in which case we 301 redirect you to the category page), it looks like www.askme.com/delhi/search/pizza-huts/in/saket if you're searching for pizza huts in Saket, Delhi as "pizza huts" is neither a category nor its mapped to any category. We're also dealing in ads & deals along with our very own e-commerce brand AskMeBazaar.com to make the better user experience and one stop shop for our customers. Now, we're working on URL restructure project and my question to you all SEO rockstars is, what can be the best possible URL structure we can have? Assume, we have kick-ass developers who can manage any given URL structure at backend.
Intermediate & Advanced SEO | | _nitman0 -
Removing UpperCase URLs from Indexing
This search - site:www.qjamba.com/online-savings/automotix gives me this result from Google: Automotix online coupons and shopping - Qjamba
Intermediate & Advanced SEO | | friendoffood
https://www.qjamba.com/online-savings/automotix
Online Coupons and Shopping Savings for Automotix. Coupon codes for online discounts on Vehicles & Parts products. and Google tells me there is another one, which is 'very simliar'. When I click to see it I get: Automotix online coupons and shopping - Qjamba
https://www.qjamba.com/online-savings/Automotix
Online Coupons and Shopping Savings for Automotix. Coupon codes for online discounts on Vehicles & Parts products. This is because I recently changed my program to redirect all urls with uppercase in them to lower case, as it appears that all lowercase is strongly recommended. I assume that having 2 indexed urls for the same content dilutes link juice. Can I safely remove all of my UpperCase indexed pages from Google without it affecting the indexing of the lower case urls? And if, so what is the best way -- there are thousands.0 -
What's the deal with significantLinks?
http://schema.org/significantLink Schema.org has a definition for "non-navigation links that are clicked on the most." Presumably this means something like the big green buttons on Moz's homepage. But does anyone know how they affect anything? In http://moz.com/blog/schemaorg-a-new-approach-to-structured-data-for-seo#comment-142936, Jeremy Nelson says " It's quite possible that significant links will pass anchor text as well if a previous link to the page was set in navigation, effictively making obselete the first-link-counts rule, and I am interested in putting that to test." This is a pretty obscure comment but it's one of the only results I could find on the subject. Is this BS? I can't even make out what all of it is saying. So what's the deal with significantLinks and how can we use them to SEO?
Intermediate & Advanced SEO | | NerdsOnCall0 -
Google News URL Structure
Hi there folks I am looking for some guidance on Google News URLs. We are restructuring the site. A main traffic driver will be the traffic we get from Google News. Most large publishers use: www.site.com/news/12345/this-is-the-title/ Others use www.example.com/news/celebrity/12345/this-is-the-title/ etc. www.example.com/news/celebrity-news/12345/this-is-the-title/ www.example.com/celebrity-news/12345/this-is-the-title/ (Celebrity is a channel on Google News so should we try and follow that format?) www.example.com/news/celebrity-news/this-is-the-title/12345/ www.example.com/news/celebrity-news/this-is-the-title-12345/ (unique ID no at the end and part of the title URL) www.example.com/news/celebrity-news/celebrity-name/this-is-the-title-12345/ Others include the date. So as you can see there are so many combinations and there doesnt seem to be any unity across news sites for this format. Have you any advice on how to structure these URLs? Particularly if we want to been seen as an authority on the following topics: fashion, hair, beauty, and celebrity news - in particular "celebrity name" So should the celebrity news section be www.example.com/news/celebrity-news/celebrity-name/this-is-the-title-12345/ or what? This is for a completely new site build. Thanks Barry
Intermediate & Advanced SEO | | Deepti_C0 -
How to deal with old, indexed hashbang URLs?
I inherited a site that used to be in Flash and used hashbang URLs (i.e. www.example.com/#!page-name-here). We're now off of Flash and have a "normal" URL structure that looks something like this: www.example.com/page-name-here Here's the problem: Google still has thousands of the old hashbang (#!) URLs in its index. These URLs still work because the web server doesn't actually read anything that comes after the hash. So, when the web server sees this URL www.example.com/#!page-name-here, it basically renders this page www.example.com/# while keeping the full URL structure intact (www.example.com/#!page-name-here). Hopefully, that makes sense. So, in Google you'll see this URL indexed (www.example.com/#!page-name-here), but if you click it you essentially are taken to our homepage content (even though the URL isn't exactly the canonical homepage URL...which s/b www.example.com/). My big fear here is a duplicate content penalty for our homepage. Essentially, I'm afraid that Google is seeing thousands of versions of our homepage. Even though the hashbang URLs are different, the content (ie. title, meta descrip, page content) is exactly the same for all of them. Obviously, this is a typical SEO no-no. And, I've recently seen the homepage drop like a rock for a search of our brand name which has ranked #1 for months. Now, admittedly we've made a bunch of changes during this whole site migration, but this #! URL problem just bothers me. I think it could be a major cause of our homepage tanking for brand queries. So, why not just 301 redirect all of the #! URLs? Well, the server won't accept traditional 301s for the #! URLs because the # seems to screw everything up (server doesn't acknowledge what comes after the #). I "think" our only option here is to try and add some 301 redirects via Javascript. Yeah, I know that spiders have a love/hate (well, mostly hate) relationship w/ Javascript, but I think that's our only resort.....unless, someone here has a better way? If you've dealt with hashbang URLs before, I'd LOVE to hear your advice on how to deal w/ this issue. Best, -G
Intermediate & Advanced SEO | | Celts180 -
Is 404'ing a page enough to remove it from Google's index?
We set some pages to 404 status about 7 months ago, but they are still showing in Google's index (as 404's). Is there anything else I need to do to remove these?
Intermediate & Advanced SEO | | nicole.healthline0