How to de-index a page with a search string with the structure domain.com/?"spam"
-
The site in question was hacked years ago. All the security scans come up clean, but SEO crawlers like SEMrush and Ahrefs still show it as an indexed page. I can even click through on it, and it takes me to the homepage with no 301. Where is the page, and how do I de-index it?
domain.com/?spam
There are multiple instances of this.
http://www.clipular.com/c/5579083284217856.png?k=Q173VG9pkRrxBl0b5prNqIozPZI
-
You are most welcome. I'm glad to hear your road to site recovery is coming along. I'm also glad to confirm that, to the best of my knowledge, your understanding of the "*" operator and the Disallow: /?spam string rule is correct. One more thing:
Fetch as Google and Request Indexing
Apologies, I neglected to mention this step in my answer. It should be included. This is the best tool I'm aware of to ask Google, "hey, crawl me please." Do this after you upload your shiny new robots.txt.
In GSC, under Crawl, select Fetch as Google. Then select Fetch and Render. When the status is partial or complete, click Request Indexing. There is no guarantee here, and in my experience Google does what it wants. Even so, I've seen results in less than 2 hours (full disclosure: the longest I've waited has been 3 days).
Penalty Free
I agree. They cannot possibly be penalizing your site. At least, not purposefully. You have taken all recommended actions and then some to resolve site issues. Even if you do have a few bad backlinks floating around out there from some blackhat t3 site PBN, Penguin 4.0 should discredit that bad link juice. Your site doesn't even have the offending pages. It's just a matter of time before Google's index lines back up with your live site.
Good Work Sir,
Wipe the Index Clean,
CopyChrisSEO and the Vizergy Team -
Thanks very much for your explanation.
I have gone ahead and temporarily blocked the pages in GSC.
I am working on the robots.txt and see that it contains no instructions telling crawlers to skip the URLs in question.
I understand that I should use the "*" operator to alert all crawlers to disallow the pages in this format:
user-agent: *
Disallow: /?spam string
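Before uploading a rule like the one above, it can help to check which URLs it would actually cover. The following is only a rough sketch of the prefix matching that robots.txt paths use, with "*" as a wildcard and a trailing "$" as an end anchor. The function name rule_matches and the sample URLs are hypothetical, and real crawlers may differ in details such as percent-encoding.

```python
import re
from urllib.parse import urlsplit


def rule_matches(rule_path: str, url: str) -> bool:
    """Return True if a robots.txt path pattern covers the URL's path and query.

    Hypothetical helper for illustration; not part of any crawler or library.
    """
    parts = urlsplit(url)
    # Rules are compared against the path plus the query string.
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query

    # A trailing "$" anchors the pattern to the end of the URL.
    anchored = rule_path.endswith("$")
    body = rule_path[:-1] if anchored else rule_path

    # "*" matches any run of characters; everything else is literal.
    regex = ".*".join(re.escape(piece) for piece in body.split("*"))
    if anchored:
        regex += "$"

    # re.match anchors at the start, which gives prefix matching.
    return re.match(regex, target) is not None


if __name__ == "__main__":
    for url in (
        "http://domain.com/?spam-string",
        "http://domain.com/",
        "http://domain.com/page?spam",
    ):
        print(url, rule_matches("/?spam", url))
```

Under these assumptions, "/?spam" covers only query strings attached to the homepage. A spam query on a deeper page would need a pattern such as "/*?spam".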
Finally, I will send the suggested edit to Google and see where that gets me. Honestly, at this point, they cannot possibly penalize the site any worse, so anything that helps clean up the index for the site will be a step in the right direction.
-
Hello Miamirealestatetrendsguy and fellow Mozers,
It sounds like you have had a crazy time handling this hack. Good news is, as far as I can tell from the given information, you are close to resolution. Googlebot should correct the indexed pages over time. I'm certain you would like to expedite that process. Here are three recommendations that come to mind: Remove URLs via GSC, block the offending URLs via robots.txt, and suggest edits in Google's SERPs.
Remove URLs via GSC
In GSC, under Google Index, select Remove URLs. This suppression is temporary, however. Click on more information for more about that. In my experience, the suppression has lasted a few months. Don't worry about the time, though. Our next step should take effect before your time is up.
Block the Offending URLs via Robots.txt
Before you do this, be very certain what you are doing. Once you are confident, list your offending URLs, disallow them in your robots.txt, and upload it. Hopefully, you can find commonalities to shorten this list and save time.
Note: I have purposefully avoided the details on how to do this here because it is vital that SEOs learn how to do it with full knowledge of the potential risks as well as how to avoid them. Here are some resources:
• Google Support
• Moz's Robots.txt Rundown
• Search Engine Land's Deeper Look
Suggest Edits in Google's SERPs
This one is iffy, and I don't really trust Google to use this feedback. However, I have done it, and it has worked more than once. Find your offending results and send specific feedback.
Wipe that Index Clean,
CopyChrisSEO and the Vizergy Team