Help, a certain directory is not being indexed
-
Before I start, don't expect this to be too easy. This really has me puzzled, and I am surprised I have yet to find a solution for it. Get ready.
We have a WordPress website that launched over 6 months ago, and we have never had an issue getting content such as pages, posts and categories indexed. However, somewhat recently (about 2 months ago) I installed a directory plugin (Business Directory Plugin), which lists businesses via unique URLs that are accessible from a subfolder. It's these business listings that I absolutely cannot get indexed.
The index page of the directory, which links to the business pages, is indexed, but for some reason Google is not indexing the listing pages that are linked from it. I don't think the content is uncrawlable: when I run crawlers on my site, such as XML sitemap crawlers, they find all the pages, including the directory pages. So I am fairly sure it's not an issue of the search engines being unable to find the content.
I have created XML sitemaps and uploaded them to Webmaster Tools. Webmaster Tools recognises that there are many pages in the XML sitemap, but Google continues to index only a small percentage (everything but my business listings).
The directory has been there for about 8 weeks now, so I know there is an issue, as it should have been indexed by now.
See our main website at www.smashrepairbid.com.au and the business directory index page at www.smashrepairbid.com.au/our-shops/
To throw in a curve ball: while looking into this issue and setting up Webmaster Tools, we noticed a lot of 404 error pages (nearly 4,000). We were very confused about where these were coming from, as they were only being generated by search engines. Humans could not reach the 404s, so we are guessing the search engines were firing some JavaScript code that generated them, or something else weird. We could see the 404s in the logs, so we know they were legitimate, but again we feel it was only search engines. This was validated when we added some rules to robots.txt and saw the errors in the logs stop. We put the rules in the robots.txt file to try to stop Google from indexing the 404 pages, as we could not find any way to fix the site or code (we have no idea what is causing them). If you do a site search in Google, you will see all the pages that are omitted from the results.
Since adding the rules to robots.txt, our impressions shown in Webmaster Tools have jumped right up (a 5-fold increase), so we thought this was a good indication of improvement, but we are still not getting the results we want.
Does anyone have any clue what's going on, or why Google and other search engines are not indexing this content? Any help would be greatly appreciated, and if you need any other information to assist, just ask me.
Really appreciate anyone who can spare their time to help me, I sure do need it.
Thanks.
-
OK issue resolved!
Lynn, thank you. It was the relative URL in the canonical tag that played havoc. Changing it to absolute is now causing the pages to be indexed.
Lesson learnt.
-
Hey Kane,
The /shops URL was an old URL that had a directory in it. We blocked it in robots.txt as it was generating tons of 404 errors. In Webmaster Tools we can see thousands of 404 errors within that directory, so we deleted it all and tried to block search engines from throwing the errors (like I described in the initial post).
A number of those listings do have very little information; however, there are a bunch that do have great content, which is why I am not sure that is the cause. I will keep an eye on this, though, and I will also check the logs and let you know what they say.
-
Thanks Lynn.
I have taken your recommendation and changed the canonical tag to be absolute. Thanks for your help; we will see how it goes.
-
As Lynn said, relative canonical tags could absolutely cause issues. That said, I'm seeing absolute URLs in the canonical tag now, so you may have fixed that in the past few days.
Also, I do see the Our Shops pages indexed when I search for site:smashrepairbid.com.au, but I don't see any other pages in the /our-shops/ directory aside from www.smashrepairbid.com.au/our-shops/?action=search
Your robots.txt is currently blocking /shops/. I don't think that would cause an issue, but it would be nice to remove that if it's not needed...
There's almost zero content on the pages I glanced at, e.g. http://www.smashrepairbid.com.au/our-shops/1263/bakker-towing/ and http://www.smashrepairbid.com.au/our-shops/1616/coastal-towing-service/. When you look at it from Google's perspective, there's very little value being added by these pages. No unique photos, no phone number, no website, etc. There are a million local business scrapers that have more content than this, so why should they bother indexing these pages?
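If you want to double-check that the /shops/ rule is not also catching /our-shops/, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt text below is an assumption (I am only guessing at a single `Disallow: /shops/` rule; substitute your real file), and `is_allowed` is just a helper name made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt contents: a single rule blocking the old /shops/ directory.
# Replace this with the real file from the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /shops/
"""


def is_allowed(robots_txt, url, agent="Googlebot"):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# A listing URL from this thread, and a hypothetical URL under the old directory.
listing_url = "http://www.smashrepairbid.com.au/our-shops/1263/bakker-towing/"
old_url = "http://www.smashrepairbid.com.au/shops/some-old-listing/"

print(is_allowed(ROBOTS_TXT, listing_url))  # listing path does not start with /shops/
print(is_allowed(ROBOTS_TXT, old_url))      # path starts with /shops/
```

Robots rules are matched against the start of the URL path, so a rule for /shops/ should only affect paths that begin with /shops/. Google's own parser may differ in edge cases, so treat this as a sanity check rather than the final word.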
Try pulling up your logs and seeing if these URLs have been requested by Google's spiders. Here's a good guide from Ian Lurie on how to do that in Excel: http://www.portent.com/blog/analytics/how-to-read-a-web-site-log-file.htm
If the spiders are crawling those shop URLs but aren't indexing them, I think the first thing to do is add way more content to the pages.
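If Excel is not handy, the same log check can be sketched in a few lines of Python. This assumes your server writes the common Apache/Nginx "combined" log format; the sample lines, IP addresses and timestamps below are made up for illustration, and `googlebot_hits` is a hypothetical helper name.

```python
import re

# Assumes the "combined" access log format; adjust the pattern if yours differs.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_hits(lines, prefix="/our-shops/"):
    """Return (time, path, status) for requests under `prefix` whose user agent mentions Googlebot."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        if "Googlebot" in match.group("agent") and match.group("path").startswith(prefix):
            hits.append((match.group("time"), match.group("path"), int(match.group("status"))))
    return hits


# Made-up sample lines; in practice, pass an open log file instead of this list.
sample_log = [
    '66.249.66.1 - - [10/Oct/2013:13:55:36 +1000] "GET /our-shops/1263/bakker-towing/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.7 - - [10/Oct/2013:13:56:02 +1000] "GET /our-shops/ HTTP/1.1" 200 9000 "-" "Mozilla/5.0 (Windows NT 6.1) Firefox/24.0"',
    '66.249.66.1 - - [10/Oct/2013:13:57:10 +1000] "GET /shops/old-listing/ HTTP/1.1" 404 310 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]

for time, path, status in googlebot_hits(sample_log):
    print(time, path, status)
```

Keep in mind the user agent string can be spoofed, so this only shows requests that claim to be Googlebot. If listing URLs show up here with a 200 status but still are not indexed, crawling is probably not the problem.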
-
Hi Trent,
Having a quick look, I saw that you have relative URLs in your canonical tag, and this could be problematic. I think it would be worth making those URLs absolute to avoid any confusion on Google's part in determining which page or page version should be indexed.
Cannot say for sure if this is the problem, but worth looking into.
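To spot-check pages for this, something like the following could work. It is only a sketch using Python's standard library: the HTML snippet is a made-up example of a relative canonical tag (not copied from your site), and `CanonicalFinder`, `is_absolute` and `check_canonical` are names invented for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr.get("href"):
            self.hrefs.append(attr["href"])


def is_absolute(url):
    """An absolute URL has both a scheme (http/https) and a host."""
    parts = urlparse(url)
    return bool(parts.scheme and parts.netloc)


def check_canonical(html, page_url):
    """Return (href, is_absolute, resolved_url) for each canonical tag found."""
    finder = CanonicalFinder()
    finder.feed(html)
    return [(href, is_absolute(href), urljoin(page_url, href)) for href in finder.hrefs]


# Hypothetical markup showing a relative canonical on a listing page.
page_url = "http://www.smashrepairbid.com.au/our-shops/1263/bakker-towing/"
sample_html = '<head><link rel="canonical" href="/our-shops/1263/bakker-towing/" /></head>'

for href, absolute, resolved in check_canonical(sample_html, page_url):
    print(href, "absolute" if absolute else "RELATIVE", "->", resolved)
```

The resolved URL is what the absolute form of the tag would be if the relative value is interpreted against the page's own address; that is the value I would put in the tag.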
Hope that helps!