Spam URL'S in search results
-
We built a new website for a client.
When I do 'site:clientswebsite.com' in Google it shows some of the real, recently submitted pages. But it also shows many pages of spam url results, like this 'clientswebsite.com/gockumamaso/22753.htm' - all of which then go to the sites 404 page. They have page titles and meta descriptions in Chinese or Japanese too.
Some of the urls are of real pages, and link to the correct page, despite having the same Chinese page titles and descriptions in the SERPS.
When I went to remove all the spammy urls in Search Console (it only allowed me to temporarily hide them), a whole load of new ones popped up in the SERPS after a day or two. The site files itself are all fine, with no errors in the server logs.
All the usual stuff...robots.txt, sitemap etc seems ok and the proper pages have all been requested for indexing and are slowly appearing. The spammy ones continue though.
What is going on and how can I fix it?
-
Whoa, this is a weird one.
I saw that you posted this on Google's forums as well, and they suggested that this might be the Japanese keyword hack. Did you look into that? If that's not it, did you try loading the URLs that are showing up on the Wayback Machine? It's possible that someone who owned this site before your client created these pages.
Either way, the answer is to double check that your 404 pages really are 404ing. If that doesn't remove them from the index fast enough, you can actually create all of those pages, with a noindex tag, add them all to a sitemap, and submit them to Google. But the 404ing is really your long term solution.
Good luck!
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
How to handle Friendly URLs together with internal filters search?
I've been trying to handle URLs from a unfrendly folder format to a semantic one, the thing is by doing so I end up with a longer URL and therefore a longer Title. Right now the format of my classified site for job seeking looks like this (folders): site.com/search/category/sales/level/supervisor/location/seattle/company/acme/published/2-days-ago/type/full-time/q/salesman/page/2 format: Filter/Content where at the end q is the query people are writting My suggestion is the following: Mixing Jobs with location, mixing category and level, and puting the rest of the filters at the end adding "--" between them. And adding 2 parameters, query (q) and pagination (pag) site.com/jobs-at-seattle/sales-supervisor/company/acme/full-time--published-2-days-ago?q=salesman&pag=2 Any thoughts on how to handle URLs over 100 chararcters and titles that go over 65?, or maybe is ok to have "friendly" long URLs and long titles when it comes to classified ad sites since they are based on internal filters to help people find what they are looking for. Sidenote: Is itok to have 2 parameters in the URL (Query and Pagination) Thanks a lot.
Technical SEO | | JoaoCJ0 -
How do I direct users to site page when they search vanity URL?
My company runs a contest via a landing page on our website. The full URL to the landing page is rather long so we have a vanity URL that we use for advertising purposes. I have a 301 on the vanity URL to the landing page URL so people visiting it directly end up where they should just fine. But if a user goes to Google and types the vanity URL into the search bar, the landing page is nowhere to be found in the results. What do I need to do to get the landing page to show in results when people search the vanity URL?
Technical SEO | | jarjarjarvis0 -
Google has deindexed 40% of my site because it's having problems crawling it
Hi Last week i got my fifth email saying 'Google can't access your site'. The first one i got in early November. Since then my site has gone from almost 80k pages indexed to less than 45k pages and the number is lowering even though we post daily about 100 new articles (it's a online newspaper). The site i'm talking about is http://www.gazetaexpress.com/ We have to deal with DDoS attacks most of the time, so our server guy has implemented a firewall to protect the site from these attacks. We suspect that it's the firewall that is blocking google bots to crawl and index our site. But then things get more interesting, some parts of the site are being crawled regularly and some others not at all. If the firewall was to stop google bots from crawling the site, why some parts of the site are being crawled with no problems and others aren't? In the screenshot attached to this post you will see how Google Webmasters is reporting these errors. In this link, it says that if 'Error' status happens again you should contact Google Webmaster support because something is preventing Google to fetch the site. I used the Feedback form in Google Webmasters to report this error about two months ago but haven't heard from them. Did i use the wrong form to contact them, if yes how can i reach them and tell about my problem? If you need more details feel free to ask. I will appreciate any help. Thank you in advance C43svbv.png?1
Technical SEO | | Bajram.Kurtishaj1 -
Parked domain is first in search results
We have several brand related domains which are parked and pointing to our main website. Some of these websites are redirecting using a 302 (don't ask, that's a whole other story), but these are being changed. But it shouldn't matter what type of redirect they are no? Since there has never been any traffic and they are not indexed? But it seems that one of them was indexed: exotravel.vn. A search for our brand name or the previous brand name (exotravel and exotissimo) brings up this parked domain first! How can that be? The domain has never been used and has no backlinks. exotravel.vn is redirecting and I submitted a change of address weeks ago to Google, but its still coming up first in all brand name searches for exotissimo or exotravel.
Technical SEO | | Exotissimo0 -
Buying multiple domains: misspells & .net, org, etc. & 301's
Hi, an SEO guy told me to buy up domains like ours X.org, net, biz, etc. & mispellings. this could cost over $100/year. Is is worth it for SEO or is it just covering our @ss if competitors want to get stupid and buy those? I don't forsee competitors doing that. What do you suggest? Does Google actually give us points for those AND if we bought them are we supposed to redirect all of them to our site? Should I be doing this for our SEO clients? Thanks.
Technical SEO | | JCunningham0 -
Does a CMS inhibit a site's crawlability?
I smell baloney but I could use a little backup from the community! My client was recently told by an SEO that search engines have a hard time getting to their site because using a CMS (like WordPress) doesn't allow "direct access to the html". Here is what they emailed my client: "Word Press (like your site is built with) and other similar “do it yourself” web builder programs and websites are not good for search engine optimization since they do not allow direct access to the HTML. Direct HTML access is needed to input important items to enhance your websites search engine visibility, performance and creditability in order to gain higher search engine rankings." Bots are blind to CMSs and html is html, correct? What do you think about the information given by the other SEO?
Technical SEO | | Adpearance0 -
Https-pages still in the SERP's
Hi all, my problem is the following: our CMS (self-developed) produces https-versions of our "normal" web pages, which means duplicate content. Our it-department put the <noindex,nofollow>on the https pages, that was like 6 weeks ago.</noindex,nofollow> I check the number of indexed pages once a week and still see a lot of these https pages in the Google index. I know that I may hit different data center and that these numbers aren't 100% valid, but still... sometimes the number of indexed https even moves up. Any ideas/suggestions? Wait for a longer time? Or take the time and go to Webmaster Tools to kick them out of the index? Another question: for a nice query, one https page ranks No. 1. If I kick the page out of the index, do you think that the http page replaces the No. 1 position? Or will the ranking be lost? (sends some nice traffic :-))... thanx in advance 😉
Technical SEO | | accessKellyOCG0 -
Will changing our colocation affect our site's link juice?
If we change our site's server location to a new IP, will this affect anything involving SEO? The site name and links will not be changing.
Technical SEO | | 9Studios0