Nonexistent URLs being generated in index
-
Hi all,
I have a pretty big problem with my site at the moment which I'm worried will have an impact on my rankings.
I've just had a crawl test done, and for some reason it returns a load of URLs that don't actually exist.
For example, I am getting URLs like this in my crawl test and XML sitemap:
All the URLs seem to start with www.applicablejobs.com/jobs/ and there is an entry for every conceivable combination of slugs.
I can only assume that if the crawl test and an XML sitemap generator are picking up these URLs, then Google and other search engines probably are too.
Does anyone have any idea what might be causing this, and what can I do to remove the URLs from Google's index if they are in it?
Thanks
-
Could they be archived links from years ago?
I have the same problem. Products we used to sell but either no longer sell or have out of stock (they are made inactive in the CMS and do not appear on the site) show up in some Google searches and in the crawl test.
Any ideas?
Cheers
Will
-
If you search for this in Google: site:www.applicablejobs.com
You see 43 URLs and none of the bad ones.
-
Okay. Well, in that case I cannot speak to why they are being generated in the first place. To keep them out of the index you could exclude the entire /jobs/ directory using robots.txt. If the /jobs/ directory is needed, then you'll have to track down the source of the URL generation. Sorry I can't be of more help.
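In case it helps, here is a minimal sketch of how you could check such a rule before deploying it. It assumes a robots.txt containing only a blanket "Disallow: /jobs/" rule, and the slug paths are made up for illustration, since the real file and the real bad URLs are not shown in this thread. Keep in mind that robots.txt stops compliant crawlers from fetching those URLs; it is not a removal request for anything already indexed.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content; the site's real file is not shown in the thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /jobs/
"""


def is_crawlable(url: str, robots_txt: str = ROBOTS_TXT, agent: str = "*") -> bool:
    """Return True if the given user agent may fetch the URL under these rules."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # The slug paths below are hypothetical examples, not URLs from the crawl.
    for url in (
        "http://www.applicablejobs.com/",
        "http://www.applicablejobs.com/jobs/",
        "http://www.applicablejobs.com/jobs/some-slug/another-slug/",
    ):
        print(url, "crawlable" if is_crawlable(url) else "blocked")
```

If the /jobs/ directory holds real listings you want crawled, a blanket rule like this would block those too, which is why finding the source of the generated URLs is the better long-term fix.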
-
Hi Stephan,
Yes, applicablejobs.com is my URL.
-
Is your domain "www.applicablejobs.com"? If not, it sounds like you may have been hacked and someone added a code snippet to your website. I host some personal sites on Network Solutions, and one day I found a strange code snippet on just about every page of the sites I run. After removing the code I had to upload every page again, but only after changing all my passwords.
As for removing them, Google has a tool for that. However, if this is not your domain, you may want to email Google and inform them of the malicious activity.