Help, a certain directory is not being indexed
-
Before I start, don't expect this to be too easy. This really has me puzzled, and I am surprised I have yet to find a solution for it. Get ready.
We have a WordPress website that launched over 6 months ago, and we have never had an issue getting content such as pages, post pages and categories indexed. However, I somewhat recently (about 2 months ago) installed a directory plugin (Business Directory Plugin), which lists businesses via unique URLs that are accessible from a subfolder. It's these business listings that I absolutely cannot get indexed.
The index page of the directory, which links to the business pages, is indexed. However, for some reason Google is not indexing the listing pages that are linked to from this page. I don't think it's an issue of the content being uncrawlable: when I run crawlers on my site, such as XML sitemap crawlers, they find all the pages including the directory pages, so I am fairly sure the search engines can find the content.
I have created XML sitemaps and uploaded them to Webmaster Tools. Tools recognises that there are many pages in the XML sitemap, but Google continues to index only a small percentage (everything but my business listings).
The directory has been there for about 8 weeks now, so I know there is an issue, as it should have been indexed by now.
See our main website at www.smashrepairbid.com.au and the business directory index page at www.smashrepairbid.com.au/our-shops/
To throw in a curve ball: while looking into this issue and setting up Webmaster Tools, we noticed a lot of 404 error pages (nearly 4,000). We were very confused about where these were coming from, as they were only being generated by search engines. Humans could not reach the 404s, so we are guessing the search engines were firing some JavaScript code to generate them, or something else weird. We could see the 404s in the logs, so we know they were legit, but again we feel it was only search engines. This was validated when we added some rules to robots.txt and saw the errors in the logs stop. We put the rules in the robots.txt file to try to stop Google from indexing the 404 pages, as we could not find any way to fix the site or code (no idea what is causing them). If you do a site search in Google you will see all the pages that are omitted from the results.
Since adding the rules to robots.txt, our impressions shown in Webmaster Tools have jumped right up (increased by 5 times), so we thought this was a good indication of improvement, but we are still not getting the results we want.
Does anyone have any clue what's going on, or why Google and other search engines are not indexing this content? Any help would be greatly appreciated, and if you need any other information to assist, just ask me.
Really appreciate anyone who can spare their time to help me, I sure do need it.
Thanks.
-
OK issue resolved!
Lynn, thank you. It was the relative URL in the canonical tag that played havoc. Changing it to absolute is now causing the pages to be indexed.
Lesson learnt.
-
Hey Kane,
The /shops URL was an old URL that had a directory in it. We blocked it in robots.txt as it was generating tons of 404 errors. In Webmaster Tools we can see thousands of 404 errors within that directory, so we deleted it all and tried to block search engines from throwing the errors (like I described in the initial post).
A number of those listings do have very little information; however, there are a bunch that do have great content, which is why I am not sure that is the cause. I will keep an eye on this though, and also check the logs and let you know what they say.
-
Thanks Lynn.
I have taken your recommendation and changed the canonical tag to be absolute. Thanks for your help; we will see how it goes.
-
As Lynn said, relative canonical tags could absolutely cause issues. That said, I'm seeing absolute URLs in the canonical tag now, so you may have fixed that in the past few days.
Also, I do see the Our Shops pages indexed when I search for site:smashrepairbid.com.au, but I don't see any other pages in the /our-shops/ directory aside from www.smashrepairbid.com.au/our-shops/?action=search
Your robots.txt is currently blocking /shops/. I don't think that would cause an issue but would be nice to remove that if it's not needed...
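If you want to double check that the /shops/ rule isn't catching /our-shops/ by accident, here is a rough sketch using Python's standard `urllib.robotparser`. The robots.txt text below is an assumption (I'm only going off the one rule mentioned in this thread, not the real file), and the old /shops/ URL is made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content: only the /shops/ rule is known from the thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /shops/
"""


def is_allowed(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical URL under the old, blocked directory.
old_url = "http://www.smashrepairbid.com.au/shops/example-listing/"
# Listing URL under the new directory.
new_url = "http://www.smashrepairbid.com.au/our-shops/1263/bakker-towing/"

print("old /shops/ URL allowed:", is_allowed(ROBOTS_TXT, old_url))
print("new /our-shops/ URL allowed:", is_allowed(ROBOTS_TXT, new_url))
```

Robots rules are matched as path prefixes, so a path beginning with /our-shops/ is not caught by a rule for /shops/. Paste in your real robots.txt to confirm nothing else in there is blocking the listings.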
There's almost zero content on the pages I glanced at, eg. http://www.smashrepairbid.com.au/our-shops/1263/bakker-towing/ and http://www.smashrepairbid.com.au/our-shops/1616/coastal-towing-service/. When you look at it from Google's perspective, there's very little value being added by these pages. No unique photos, no phone number, no website, etc. There's a million local business scrapers that have more content than this, so why should they bother indexing these pages?
Try pulling up your logs and seeing if these URLs have been requested by Google's spiders. Here's a good guide from Ian Lurie on how to do that in Excel: http://www.portent.com/blog/analytics/how-to-read-a-web-site-log-file.htm
If the spiders are crawling those shop URLs but aren't indexing them, I think the first thing to do is add way more content to the pages.
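If Excel isn't your thing, a small script can do the same check. This is only a sketch: it assumes your server writes the common "combined" access log format, the sample lines are invented for illustration, and it trusts the user-agent string (which can be spoofed, so treat the counts as a hint rather than proof).

```python
import re
from collections import Counter

# Combined log format ends with: "REQUEST" status size "referer" "user-agent"
LINE_RE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) .*"(?P<agent>[^"]*)"$'
)


def googlebot_hits(lines, prefix="/our-shops/"):
    """Count requests under `prefix` whose user-agent mentions Googlebot."""
    hits = Counter()
    for line in lines:
        match = LINE_RE.search(line.strip())
        if not match:
            continue  # not in the assumed format, skip it
        if "Googlebot" not in match.group("agent"):
            continue
        path = match.group("path")
        if path.startswith(prefix):
            hits[path] += 1
    return hits


# Made-up sample lines, only to show the expected shape of the input.
GOOGLEBOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
BROWSER = "Mozilla/5.0 (Windows NT 6.1; rv:27.0) Gecko/20100101 Firefox/27.0"
SAMPLE_LOG = [
    f'66.249.66.1 - - [10/Mar/2014:10:00:00 +0000] "GET /our-shops/1263/bakker-towing/ HTTP/1.1" 200 5120 "-" "{GOOGLEBOT}"',
    f'203.0.113.9 - - [10/Mar/2014:10:01:00 +0000] "GET /our-shops/1616/coastal-towing-service/ HTTP/1.1" 200 4980 "-" "{BROWSER}"',
    f'66.249.66.1 - - [10/Mar/2014:10:02:00 +0000] "GET /shops/old-page/ HTTP/1.1" 404 310 "-" "{GOOGLEBOT}"',
    f'66.249.66.1 - - [11/Mar/2014:09:30:00 +0000] "GET /our-shops/1263/bakker-towing/ HTTP/1.1" 200 5120 "-" "{GOOGLEBOT}"',
]

for path, count in googlebot_hits(SAMPLE_LOG).items():
    print(count, path)
```

To run it on a real log, pass an open file instead of `SAMPLE_LOG`, for example `googlebot_hits(open("access.log"))` (the file name is a placeholder). If listing URLs show up here but still aren't indexed, the crawl is fine and the problem is further down the line.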
-
Hi Trent,
Having a quick look I saw that you have relative urls in your canonical tag and this could be problematic. I think it would be worth making those urls absolute to avoid any confusion on Google's part in determining what page or page version should be indexed.
Cannot say for sure if this is the problem, but worth looking into.
Hope that helps!
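For anyone wanting to check this on their own pages, here is a minimal sketch that pulls the canonical href out of a page's HTML and flags whether it is absolute (has both a scheme and a host). It uses only Python's standard library. The two HTML snippets are hypothetical examples, not copies of the actual site markup.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values and attributes.get("href"):
            self.hrefs.append(attributes["href"])


def canonical_report(html):
    """Return a list of (href, is_absolute) pairs for the canonical tags found."""
    finder = CanonicalFinder()
    finder.feed(html)
    report = []
    for href in finder.hrefs:
        parts = urlparse(href)
        report.append((href, bool(parts.scheme and parts.netloc)))
    return report


# Hypothetical markup illustrating the relative and absolute forms.
RELATIVE = '<head><link rel="canonical" href="/our-shops/1263/bakker-towing/" /></head>'
ABSOLUTE = (
    '<head><link rel="canonical" '
    'href="http://www.smashrepairbid.com.au/our-shops/1263/bakker-towing/" /></head>'
)

print(canonical_report(RELATIVE))
print(canonical_report(ABSOLUTE))
```

Any href that comes back flagged as not absolute is a candidate for the change suggested above.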