Disallow: /jobs/? is this stopping the SERPs from indexing job posts
-
Hi,
I was wondering what this would be used for as it's in the Robots.exe of a recruitment agency website that posts jobs. Should it be removed?Disallow: /jobs/?
Disallow: /jobs/page/*/Thanks in advance.
James -
Hi James,
So far as I can see you have the following architecture:
- job posting: https://www.pkeducation.co.uk/job/post-name/
- jobs listing page: https://www.pkeducation.co.uk/jobs/
Since from the robots.txt the listing page pagination is blocked, the crawler can access only the first 15 job postings are available to crawl via a normal crawl.
I would say, you should remove the blocking from the robots.txt and focus on implementing a correct pagination. *which method you choose is your decision, but allow the crawler to access all of your job posts. Check https://yoast.com/pagination-seo-best-practices/
Another thing I would change is to make the job post title an anchor text for the job posting. (every single job is linked with "Find out more").
Also if possible, create a separate sitemap.xml for your job posts and submit it in Search Console, this way you can keep track of any anomaly with indexation.
Last, and not least, focus on the quality of your content (just as Matt proposed in the first answer).
Good luck!
-
Hi Istvan,
Sorry I've been away for a while. Thanks for all of your advice guys.
Here is the url if that helps?
https://www.pkeducation.co.uk/jobs/
Cheers,
James
-
The idea is (which we both highlighted), that blocking your listing page from robots.txt is wrong, for pagination you have several methods to deal with (how you deal with it, it really depends on the technical possibilities that you have on the project).
Regarding James' original question, my feeling is, that he is somehow blocking their posting pages. Cutting the access to these pages makes it really hard for Google, or any other search engine to index it. But without a URL in front of us, we cannot really answer his question, we can only create theories that he can test
-
Ah yes when it's pointed out like that, it's a conflicting signal isn't It. Makes sense in theory, but if you're setting it to noindex and then passing that on via a canonical it's probably not the best is it.
They're was link out in that thread to a discussion of people who still do that with success, but after reading that I would just use noindex only as you said. (Still prefer the no index on the robots block though)
-
Sorry Richard, but using noindex with canonical link is not quite a good practice.
It's an old entry, but still true: https://www.seroundtable.com/noindex-canonical-google-18274.html
-
I don't think it should be blocked by robots.txt at all. It's stopping Google from crawling the site fully. And they may even treat it negatively as they've been really clamping down on blocking folders with robots.txt lately. I've seen sites with warning in search console for: Disallow: /wp-admin
You may want to consider just using a noindex tag on those pages instead. And then also use a canonical tag that points back to the main job category page. That way Google can crawl the pages and perhaps pass all the juice back to the main job category page via the canonical. Then just make sure those junk job pages aren't in the sitemap either.
-
Hi James,
Regarding the robots.txt syntax:
Disallow: /jobs/? which basically blocks every single URL that contains /jobs/**? **
For example: domain.com**/jobs/?**sort-by=... will be blocked
If you want to disallow query parameters from URL, the correct implementation would be Disallow: /jobs/*? or even specify which query parameter you want to block. For example Disallow: /jobs/*?page=
My question to you, if these jobs are linked from any other page and/or sitemap? Or only from the listing page, which has it's pagination, sorting, etc. is blocked by robots.txt? If they are not linked, it could be a simple case of orphan pages, where basically the crawler cannot access the job posting pages, because there is no actual link to it. I know it is an old rule, but it is still true: Crawl > Index > Rank.
BTW. I don't know why you would block your pagination. There are other optimal implementations.
And there is always the scenario, that was already described by Matt. But I believe in that case you would have at least some of the pages indexed even if they are not going to get ranked well.
Also, make sure other technical implementations are not stopping your job posting pages from being indexed.
-
I'd guess that the jobs get pulled from a job board. If this is the case, then the content ( job description, title etc.) will just be a duplication of the content that can be found in many other locations. If a plugin is used, they sometimes automatically add a disallow into the robots.txt file as to not hurt the parent version of the job page by creating thousands of duplicate content issues.
I'd recommend creating some really high-quality hub pages based on job type, or location and pulling the relevant jobs into that page, instead of trying to index and rank the actual job pages.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Google balances SERP results with positive and negative News/PR?
Hey guys, We did reputation management back in March 2017. We basically built high quality links to online assets such as linkedin, twitter, facebook, positive PR articles and other web properties in order to rank them higher then negative PR. However there was 0 change (the link building was solid). And the negative PR remains in the top 10 with also positive new articles about the site. At this point, i believe that Google is keeping the negative PR in the top 10 to keep balanced SERP results. Does anyone know if this is something Google does to balance positive and negative results? Cheers, Chris
Intermediate & Advanced SEO | | cathywix0 -
Should I redirect off topic blog posts?
We launched a store on top of a popular blog. The blog had nothing to do with the store. The blog has a lot of backlinks and traffic, but our store is now our primary business. I am concerned that the off topic blog content may be affecting or ability to rank better for the core store business. Should we delete or redirect the old blog content to another website to improve the SEO for our store?
Intermediate & Advanced SEO | | seo-mojo1 -
Robots.txt, Disallow & Indexed-Pages..
Hi guys, hope you're well. I have a problem with my new website. I have 3 pages with the same content: http://example.examples.com/brand/brand1 (good page) http://example.examples.com/brand/brand1?show=false http://example.examples.com/brand/brand1?show=true The good page has rel=canonical & it is the only page should be appear in Search results but Google has indexed 3 pages... I don't know how should do now, but, i am thinking 2 posibilites: Remove filters (true, false) and leave only the good page and show 404 page for others pages. Update robots.txt with disallow for these parameters & remove those URL's manually Thank you so much!
Intermediate & Advanced SEO | | thekiller990 -
Google Indexing of Images
Our site is experiencing an issue with indexation of images. The site is real estate oriented. It has 238 listings with about 1190 images. The site submits two version (different sizes) of each image to Google, so there are about 2,400 images. Only several hundred are indexed. Can adding Microdata improve the indexation of the images? Our site map is submitting images that are on no-index listing pages to Google. As a result more than 2000 images have been submitted but only a few hundred have been indexed. How should the site map deal with images that reside on no-index pages? Do images that are part of pages that are set up as "no-index" need a special "no-index" label or special treatment? My concern is that so many images that not indexed could be a red flag showing poor quality content to Google. Is it worth investing in correcting this issue, or will correcting it result in little to no improvement in SEO? Thanks, Alan
Intermediate & Advanced SEO | | Kingalan10 -
Wrong Page Indexing in SERPS - Suggestions?
Hey Moz'ers! I have a quick question. Our company (Savvy Panda) is working on ranking for the keyword: "Milwaukee SEO". On our website, we have a page for "Milwaukee SEO" in our services section that's optimized for the keyword and we've been doing link building to this. However, when you search for "Milwaukee SEO" a different page is being displayed in the SERP's. The page that's showing up in the SERP's is a category view of our blog of articles with the tag "Milwaukee SEO". **Is there a way to alert google that the page showing up in the SERP's is not the most relevant and request a new URL to be indexed for that spot? ** I saw a webinar awhile back that showed something like that using google webmaster sitelinks denote tool. I would hate to denote that URL and then loose any kind of indexing for the keyword.
Intermediate & Advanced SEO | | SavvyPanda
Ideas, suggestions?0 -
Indexing an e-commerce site
Hi all, My client babyblingstreet.com. She sells baby and toddler clothing. Now a lot of the links on her site contain the same products. For instance: if you go to "What's new" you can find those same products in let's say her "Sale Items" link category. The real problem with this is let's say my client sells a green dress and someone accesses it through the "baby and toddler dresses" category. And let's say this URL has 10 links pointing to it. Now, let's say someone else accesses this same green dress through the "What's new" category. And let's say this particular URL has 10 links pointing to it. Instead of having 20 links pointing to one URL about the green dress, I now have 10 links pointing to one URL and 10 pointing to another URL even though both URLs feature the exact same green dress. In this particular example I would want to make the URL of the green dress in the "baby and toddler clothing" section be the canonical URL. So that means I would have to use this canonical tag on the green dress URL that's in the "what's new" category and let's say also the "sale items" category. This could get very tedious if my client has 200+ products. So I am wondering if I have to place a canonical tag on every URL that displays the green dress? More importantly, I would like to know other people's strategies for indexing e-commerce sites that have the same product featured in multiple categories throughout the site. I hope this makes sense. Thanks for your time.
Intermediate & Advanced SEO | | jenga110 -
Ranking with other pages not index
The site ranks on page 4-5 with other page like privacy, about us, term pages. I encounter this problem allot in the last weeks; this usually occurs after the page sits 1-2 months on page 1 for the terms. I'm thinking of to much use the same anchor as a primary issue. The sites in questions are 1-5 pages microniche sites. Any suggestions is appreciated. Thank You
Intermediate & Advanced SEO | | m3fan0 -
Redirects/Forwarding
I have two niche e-commerce sites. One is a PR3 with 3K pages indexed, the other is PR0 with 5K pages indexed. Each site has a blog that has been updated regularly. They both rank well for some fairly competitive keywords and some good links pointing to them. I also have a main site that is PR3. I am thinking of closing down the sites because they are not generating enough revenue, here are my questions: What is the best way to get the most SEO value from these sites? Do I just do a redirect to the main site? Should I keep the sites and use canonical URLs to the main site? Should I keep the domain as a wordpress blog and point links to the main site? What should I do with the blogs? They are on sub-domains, neither has pagerank. Thanks
Intermediate & Advanced SEO | | inhouseseo0