After hack remediation, thousands of URLs still appearing as 'Valid' in Google Search Console. How to remedy?
-
I'm working on a site that was hacked in March 2019; in the process, nearly 900,000 spam links were generated and indexed. After the hack was remediated in April 2019, the spammy URLs began dropping out of the index. Then, last week, Search Console showed around 8,000 URLs as "Indexed, not submitted in sitemap" but listed as "Valid" in the Coverage report. Many of them are still hack-related URLs, listed as indexed in March 2019, even though clicking on them leads to a 404. As of this Saturday, the number jumped to 18,000, but I have no way of finding out from the Search Console reports why the jump happened or which new URLs were added; the only sort mechanism is last crawled, and they don't show up there.
How long can I expect it to take for these remaining URLs to be removed from the index? Is there any way to expedite the process? I've submitted a 'new' sitemap several times, which (so far) has not helped.
Is there any way to see, in the new GSC view, why or how the number of valid URLs in the index doubled over one weekend?
-
Google Search Console does have a URL removal tool built in. Unfortunately, it's not really scalable (submissions are mostly one at a time), and the effect of using the tool is only temporary (the URLs come back again).
In your case, I reckon changing the status code of the 'gone' URLs from 404 ("not found, but may yet return") to 410 ("GONE!") would be a good idea. Google might digest that better, as 410 is a harder de-indexing signal and a very strong crawl directive ("go away, don't come back!").
You could also serve the meta noindex directive on those URLs. Obviously you're unlikely to have access to the HTML of non-existent pages, but did you know noindex can also be fired through the X-Robots-Tag HTTP header? So it's not impossible:
https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/404
(Ctrl+F for "X-Robots-Tag HTTP header")
Another option is this form, which lets Google know outdated content has been removed and isn't coming back:
https://www.google.com/webmasters/tools/removals
... but again, submitting URLs one at a time is going to be mega-slow. It does work pretty well, though (at least in my experience).
In any eventuality, I think you're looking at a week or two for Google to start noticing in a way you can see visually, and then maybe a month or two until it rights itself (caveat: it's different for every site and URL, so it's variable).