Open Site Explorer - Top Pages that don't exist / result of a hack(?)
-
Hi all,
Last year, a website I monitor got hacked or infected with malware; I’m not sure which.
The result is hundreds of ‘not found’ entries in Google Search Console / Crawl Errors for non-existent pages that are variations of ‘Canada Goose’. A couple of these links also show up in SERPs. Here are two examples of the page URLs:
ourdomain.com/canadagoose.php
ourdomain.com/replicacanadagoose.php
I looked for advice on the webmaster forums and was told to just keep marking them as ‘fixed’ in the console, and that sooner or later they would disappear. A year later, they still appear.
I’ve just signed up for a Moz trial and, in Open Site Explorer->Top Pages, positions 2-5 are taken by these non-existent pages: URLs that are the result of this ‘canada goose’ spam attack. Each of the non-existent pages has around 10 Linking Root Domains and around 50 Inbound Links.
My question is: Is there a more direct action I should take here? For example, informing Google of the offending domains with these backlinks.
Any thoughts appreciated! Many thanks
-
Hi Mª Verónica B
That's great, many thanks for the confirmation.
All the best,
Colin
-
Hi Colin,
If the backlinks/inbound links are spam, then yes, upload a disavow file covering only those links.
If there are multiple ghost pages in WordPress left over from deleted hacked pages, then yes, use the new hidden page with all the instructions above, applied only to the spam pages.
All the best,
Mª Verónica B.
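For anyone who has not built one before: a disavow file is a plain text file with one entry per line, either a full URL or a `domain:` entry, and lines starting with `#` are comments. Below is a minimal Python sketch of generating one from a list of spam backlinks. The linking domains and the `build_disavow` helper are made up for illustration; the real list would come from your own backlink export.

```python
from urllib.parse import urlparse


def build_disavow(spam_links, note="Links to hacked 'canadagoose' pages"):
    """Return disavow-file text with one domain: entry per linking host."""
    hosts = set()
    for link in spam_links:
        # urlparse only fills in the host when a scheme (or //) is present.
        parsed = urlparse(link if "//" in link else "//" + link)
        host = (parsed.hostname or "").lower()
        if host:
            hosts.add(host)
    lines = ["# " + note]
    lines += ["domain:" + host for host in sorted(hosts)]
    return "\n".join(lines) + "\n"


# Hypothetical linking pages, for illustration only.
example_links = [
    "http://spam-example-one.test/cheap-jackets.html",
    "https://spam-example-two.test/outlet/",
    "http://spam-example-one.test/another-page.html",
]
disavow_text = build_disavow(example_links)
print(disavow_text)
```

The sketch disavows whole domains rather than single URLs, on the assumption that every link from those sites is spam. If only some links from a domain are bad, list the individual URLs instead.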
-
Thanks again Mª Veronica for taking the time to respond.
OK, if I understand correctly: as those spam / 'canadagoose' related backlinks do indeed exist*, a disavow file for Google would be the thing to do here?
There was indeed a hack, which happened before I came along and which is reported in Google Search Console. There are also hundreds of 'canadagoose' related crawl errors with a 404 response code that just keep coming back. It looks like those pages did once exist and must have been deleted by the website developers. So the 'empty page' technique would apply here?
*It seems to me that the 'canadagoose' pages that have apparently since been deleted, and the backlinks pointing to those 'ghost' pages, are all part of the hack:
- hack website, create 'canadagoose' pages
- link to 'canadagoose' pages from other websites
Many thanks,
Colin
-
Hi Colin,
Not exactly!
We are not talking about backlinks. Backlinks come from other websites, so we cannot control them, except by uploading a disavow file for Google.
This is quite different. We are talking about the hundreds of "ghosts" of deleted pages, which we deleted because our website was hacked.
Deleting all of those pages is not enough.
Crawlers will "see" 330 or more pages with a 404 status!
That is awful for SEO, because the crawlers/Google "understand" that you do not care about user experience: you have so many deleted pages that anybody who lands on one finds nothing.
1.- Use Moz Top Pages to find all the spam pages, then check in Google as well, to be sure.
2.- Create a new page that is not completely empty and set it to noindex/nofollow. It should say something like
"We are truly sorry... Thanks."
This is for the crawlers; presumably no human knows about those spam pages, except the one who hacked your website.
3.- Redirect all spam pages in the list with a 301 (a permanent redirect) to the noindex/nofollow page you just created.
4.- Verify: copy and paste each URL from the list into the browser and check that it goes to the new page. Also verify the page status with MozBar.
Thanks. Good luck,
Mª Verónica
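With hundreds of URLs, step 4 can be scripted instead of pasted by hand. Here is a rough Python sketch using only the standard library. The target path `/removed/` and both helper names are assumptions for illustration, not something from this thread; swap in the path of the hidden page you actually created.

```python
import http.client
from urllib.parse import urlparse


def fetch_status(url, timeout=10):
    """Send one HEAD request and return (status, Location header or None).

    http.client does not follow redirects, so a 301 is reported as a 301.
    Query strings are ignored here, which is enough for plain .php paths.
    """
    parts = urlparse(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        conn.request("HEAD", parts.path or "/")
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def check_redirect(status, location, target_path):
    """Classify one response against the expected 301 to target_path."""
    if status != 301:
        return "unexpected status %d" % status
    if location is None:
        return "301 to wrong target: %s" % location
    if urlparse(location).path.rstrip("/") != target_path.rstrip("/"):
        return "301 to wrong target: %s" % location
    return "ok"


# Example use against a real site (performs network requests):
# for url in ["https://ourdomain.com/canadagoose.php"]:
#     print(url, check_redirect(*fetch_status(url), "/removed/"))
```

The network call and the check are kept separate so the check can be tried on its own; for instance, `check_redirect(404, None, "/removed/")` reports an unexpected status, which is what a spam URL that was never redirected would produce.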
-
Many thanks for your response Mª Veronica B, very helpful.
I've never used the disavow backlinks tool in Google Search Console. I would have assumed this is the ideal scenario to use it to disavow specific backlinks (not all backlinks). But instead what you're suggesting is:
- Create an empty / hidden (WordPress) Page, and make it noindex / nofollow
- Get a list of all spam backlinks from Google Search Console
- Redirect all spam backlinks in the list to the empty noindex / nofollow page
This would never have occurred to me, I'm going to do this right now.
Again, many thanks!
Colin
-
Hi,
It seems that the website is in a similar situation to the one I shared before.
In my case, however, I had to take immediate action because it was creating a very serious problem by sending malicious signals to all the crawlers.
Also, I discovered the issue by using the same Moz feature.
Thanks Moz!
https://moz.com/community/q/more-than-450-pages-created-by-a-hacker
In my experience, sending all those pages to a new hidden page, using a 301 and the noindex and nofollow directives, somehow sends the right signals to the crawlers of Google and the other search engines.
Let's say it strongly informs all the crawlers that those spam pages/404s are neither relevant nor interesting for your website.
Andy's response agrees that this is the best solution. He also recommends the Wordfence plugin for WordPress as a preventive measure to avoid further issues.
Good luck!
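One more check that may be worth scripting is confirming that the hidden page really carries the noindex and nofollow directives. The Python sketch below looks for the standard `<meta name="robots" ...>` tag in a page's HTML. The sample markup is invented for illustration, and the sketch does not look at the `X-Robots-Tag` HTTP header, which is another place these directives can be set.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots" content="...">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.update(
                part.strip().lower()
                for part in content.split(",")
                if part.strip()
            )


def robots_directives(html):
    """Return the set of robots directives declared in the given HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Invented sample of what the hidden page's HTML might look like.
sample_page = """<html><head>
<meta name="robots" content="noindex, nofollow">
<title>Page removed</title>
</head><body>We are truly sorry... Thanks.</body></html>"""

found = robots_directives(sample_page)
print(sorted(found))
```

If either directive is missing from the result, the page settings in WordPress (or whichever SEO plugin sets the tag) would be the first place to look.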