Blog page won't get indexed
-
Hi Guys,
I'm currently asked to work on a website. I noticed that the blog posts won't get indexed in Google. www.domain.com/blog does get indexed but the blogposts itself won't. They have been online for over 2 months now.
I found this in the robots.txt file:
Allow: / Disallow: /kitchenhandle/ Disallow: /blog/comments/ Disallow: /blog/author/ Disallow: /blog/homepage/feed/
I'm guessing that the last line causes this issue. Does anyone have an idea if this is the case and why they would include this in the robots.txt?
Cheers!
-
Thanks alot!
-
Hi Dirk,
Good observation, I missed the canonical part somehow. So, google is indexing the canonical URLs here which doesn't have /blog/ in it and that's the problem. Have a look at the indexed page for this particular instance here. Non /blog/ instance is indexed, which will take you to its /blog/ version with wrong canonical URL.
Solution: Either remove the canonical URLs on these pages to point them to the current page itself. And yeah! As rightly mentioned by Dirk, do a proper /blog/ page linking from the blog page and other pages from where you're linking these articles.
-
This is definitely the issue. Fix that canonical and they'll be indexed.
-
To update - even worse: on the blog itself you are linking to the canonical version - not to the /blog/ version. So it would be impossible for Google to index /blog/ type of content.
If you do woontrends 2016 site:www.keukensduitsland.nl you will notice that the canonical version is properly indexed (even with the strange js redirect.
Dirk
-
It's not related to the robots.txt - you can easily check that in Webmastertools (Crawl > Robots.txt tester)
First issue is the location of the link - if you put a small link to the blog hidden in the left corner at the bottom of the page Google is not going to attribute a lot of importance to this link.
Most important issue on your blog articles is the canonical - example:
http://www.keukensduitsland.nl/blog/woontrends-2016/ has as canonical url: http://www.keukensduitsland.nl/woontrends-2016/ - however this page will redirect you with javascript to the blog article.
Make the canonical self referencing and do a proper redirect on the other pages (301 rather than js redirect)
Dirk
-
Hi Happy SEO,
Well, the robots.txt looks find here. Could you try to fetch any of the blog page/post as google in the search console and share the screenshot here?
Also, to cross check the robots.txt (which looks fine though), you have robots.txt tester in search console where you can put any blog page/post to check if bots can crawl it. Please share a screenshot of that as well.
On a separate note, the sitemap.xml link mentioned in the robots.txt (http://www.keukensduitsland.nl/sitemap.xml) is broken. Fix that as well.
-
Hi Nitin,
The URL is www.keukensduitsland.nl (/blog). The link to the blog page is in the bottom left corner called "Keukennieuws".
-
Hi Happy SEO,
Could you please share the blog URL here? Sounds like an interesting issue and would love to give a try to help you with this
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Removing indexed pages
Hi all, this is my first post so be kind 🙂 - I have a one page Wordpress site that has the Yoast plugin installed. Unfortunately, when I first submitted the site's XML sitemap to the Google Search Console, I didn't check the Yoast settings and it submitted some example files from a theme demo I was using. These got indexed, which is a pain, so now I am trying to remove them. Originally I did a bunch of 301's but that didn't remove them from (at least not after about a month) - so now I have set up 410's - These also seem to not be working and I am wondering if it is because I re-submitted the sitemap with only the index page on it (as it is just a single page site) could that have now stopped Google indexing the original pages to actually see the 410's?
Technical SEO | | Jettynz
Thanks in advance for any suggestions.0 -
How to block text on a page to be indexed?
I would like to block the spider indexing a block of text inside a page , however I do not want to block the whole page with, for example , a noindex tag. I have tried already with a tag like this : chocolate pudding chocolate pudding However this is not working for my case, a travel related website. thanks in advance for your support. Best regards Gianluca
Technical SEO | | CharmingGuy0 -
Why Custom Post Types Don't Get Ranked Well
So I have started to use custom post types and have noticed that they don't get ranked very well in Google. On the other hand a post will rank almost immediately. Does anyone know any tricks for making custom post types rank as well as posts in wordpress? Sincerely, Garret
Technical SEO | | eWebify0 -
Results pages are not getting pagerank
Hello there, I have a website with a PR5 and seo "juice" is passing down smoothly except for results pages (sorry french ) : http://homengo.com/comment-ca-marche/presentation/ is getting a PR http://homengo.com/s/vente/paris_dept-75/ is not The same goes for all results pages which could indicate a problem. Is there something wrong with these pages, i can not figure it out, or do you have some tools which could help identify the trouble ? Thanks a lot
Technical SEO | | seomengo0 -
Unnecessary pages getting indexed in Google for my blog
I have a blog dapazze.com and I am suffering from a problem for a long time. I found out that Google have indexed hundreds of replytocom links and images attachment pages for my blog. I had to remove these pages manually using the URL removal tool. I had used "Disallow: ?replytocom" in my robots.txt, but Google disobeyed it. After that, I removed the parameter from my blog completely using the SEO by Yoast plugin. But now I see that Google has again started indexing these links even after they are not present in my blog (I use #comment). Google have also indexed many of my admin and plugin pages, whereas they are disallowed in my robots.txt file. Have a look at my robots.txt file here: http://dapazze.com/robots.txt Please help me out to solve this problem permanently?
Technical SEO | | rahulchowdhury0 -
Huge number of indexed pages with no content
Hi, We have accidentally had Google indexed lots os our pages with no useful content at all on them. The site in question is a directory site, where we have tags and we have cities. Some cities have suppliers for almost all the tags, but there are lots of cities, where we have suppliers for only a handful of tags. The problem occured, when we created a page for each cities, where we list the tags as links. Unfortunately, our programmer listed all the tags, so not only the ones, where we have businesses, offering their services, but all of them! We have 3,142 cities and 542 tags. I guess, that you can imagine the problem this caused! Now I know, that Google might simply ignore these empty pages and not crawl them again, but when I check a city (city site:domain) with only 40 providers, I still have 1,050 pages indexed. (Yes, we have some issues between the 550 and the 1050 as well, but first things first:)) These pages might not be crawled again, but will be clicked, and bounces and the whole user experience in itself will be terrible. My idea is, that I might use meta noindex for all of these empty pages and perhaps also have a 301 redirect from all the empty category pages, directly to the main page of the given city. Can this work the way I imagine? Any better solution to cut this really bad nightmare short? Thank you in advance. Andras
Technical SEO | | Dilbak0 -
Getting images indexed in the SERPS
Good Afternoon form 13 degrees C totally Sunny Wetherby UK 🙂 Am i right in thinking that the only way to get images appearing like this in your serps: http://i216.photobucket.com/albums/cc53/zymurgy_bucket/innovia-merchant-immages-serpscopy.jpg is to be hooked up to Google Merchant? Which kind of means if the sight your working on has no images then this type of enhancement is out of bounds? Thanks in advance, David
Technical SEO | | Nightwing0 -
Getting home page content at top of what robots see
When I click on the text-only cache of nlpca(dot)com on the home page http://webcache.googleusercontent.com/search?q=cache:UIJER7OJFzYJ:www.nlpca.com/&hl=en&gl=us&strip=1 our H1 and body content are at the very bottom. How do we get the h1 and content at the top of what the robots see? Thanks!
Technical SEO | | BobGW0