Page not being indexed or crawled and no idea why!
-
Hi everyone,
There are a few pages on our website that aren't being indexed right now on Google and I'm not quite sure why. A little background:
We are an IT training and management training company and we have locations/classrooms around the US. To better our search rankings and overall visibility, we made some changes to the on-page content, URL structure, etc. Take our Washington, DC location as an example. The old URL was:
http://www2.learningtree.com/htfu/location.aspx?id=uswd44
And the new one is:
http://www2.learningtree.com/htfu/uswd44/reston/it-and-management-training
Not all of the SEO changes are live yet, so just bear with me. My question is really about why the first URL is still being crawled and indexed and showing fine in the search results, while the second one (which we want to show) is not. The changes have been live for around a month now - plenty of time to at least be indexed.
In fact, we don't want the first URL showing anymore; we'd like the second URL type to show across the board. Also, when I search Google for site:http://www2.learningtree.com/htfu/uswd44/reston/it-and-management-training I get a message that Google can't read the page because of the robots.txt file - but we have no robots.txt file. I've been told by our web guys that the two pages are exactly the same, and that we've put in an order to have all the old links 301 redirected to the new ones. Still, I'm perplexed as to why these pages aren't being indexed or crawled - I've even manually submitted them in Webmaster Tools.
So, why is Google still recognizing the old URLs and why are they still showing in the index/search results?
And why is Google saying "A description for this result is not available because of this site's robots.txt"?
Thanks in advance!
- Pedram
-
Hi Mike,
Thanks for the reply. I'm out of the country right now, so my replies might be somewhat slow.
Yes, we have links to the pages on our sitemaps and I have done fetch requests. I did a check now and it seems that the niched "New York" page is being crawled now. Might have been a time issue as you suggested. But, our DC page still isn't being crawled. I'll check up on it periodically and see the progress. I really appreciate your suggestions - it's already helping. Thank you!
-
It possibly just hasn't been long enough for the spiders to re-crawl everything yet. Have you done a fetch request in Webmaster Tools for the page and/or site to see if you can jumpstart things a little? It's also possible that the spiders haven't found a path to it yet. Do you have enough (or any) pages linking to that second page that isn't being indexed yet?
-
Hi Mike,
As a follow-up, I forwarded your suggestions to our webmasters. They adjusted the robots.txt, and it now reads as follows - which I think might still cause issues, though I'm not 100% sure why:
User-agent: *
Allow: /htfu/
Disallow: /htfu/app_data/
Disallow: /htfu/bin/
Disallow: /htfu/PrecompiledApp.config
Disallow: /htfu/web.config
Disallow: /

Now, this page is being indexed: http://www2.learningtree.com/htfu/uswd74/alexandria/it-and-management-training
But a more niched page still isn't being indexed: http://www2.learningtree.com/htfu/usny27/new-york/sharepoint-training
Suggestions?
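For what it's worth, you can sanity-check which paths a set of rules like the one above blocks with Python's built-in robots.txt parser. This is only a rough check - Googlebot resolves overlapping Allow/Disallow rules by longest match, which can differ from this parser's behavior on conflicting paths - but it's a quick first test:

```python
from urllib import robotparser

# The adjusted rules from the thread, pasted as-is.
rules = """\
User-agent: *
Allow: /htfu/
Disallow: /htfu/app_data/
Disallow: /htfu/bin/
Disallow: /htfu/PrecompiledApp.config
Disallow: /htfu/web.config
Disallow: /
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# The new location pages fall under Allow: /htfu/, so they are crawlable:
print(rp.can_fetch("*", "http://www2.learningtree.com/htfu/usny27/new-york/sharepoint-training"))  # True
# ...while everything outside /htfu/ is caught by the trailing Disallow: /
print(rp.can_fetch("*", "http://www2.learningtree.com/some-other-page"))  # False
```

For the authoritative answer, the robots.txt Tester in Webmaster Tools checks a URL against Google's own parsing rules.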
-
The pages in question don't have any meta robots tags on them. So once the Disallow in robots.txt is gone and you do a fetch request in Webmaster Tools, the pages should get crawled and indexed fine. If you don't have a meta robots tag, the spiders treat the page as index, follow. Personally, I prefer to include the index, follow tag anyway, even if it isn't 100% necessary.
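If you'd rather verify a page's meta robots tag programmatically than eyeball the source, a small stdlib parser does the job. This is just an illustrative sketch - the MetaRobotsFinder class name is made up here, not part of any SEO tool:

```python
from html.parser import HTMLParser

class MetaRobotsFinder(HTMLParser):
    """Collects the content of any <meta name="robots"> tags in a page."""
    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("name", "").lower() == "robots":
            self.directives.append(attrs.get("content", ""))

# A page with the tag made explicit:
tagged = MetaRobotsFinder()
tagged.feed('<html><head><meta name="robots" content="index, follow"></head><body></body></html>')
print(tagged.directives)  # ['index, follow']

# A page with no tag at all - which spiders treat as index, follow by default:
bare = MetaRobotsFinder()
bare.feed('<html><head><title>x</title></head><body></body></html>')
print(bare.directives)  # []
```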
-
Thanks, Mike. That was incredibly helpful. See, I did click the link on the SERP when I did the site: search on Google, but I thought it was a mistake. Were you able to see the robots disallow in the source code?
-
Your robots.txt (which can be found at http://www2.learningtree.com/robots.txt) does in fact have Disallow: /htfu/ which would block http://www2.learningtree.com/htfu/uswd44/reston/it-and-management-training from being crawled. While your old page is technically blocked too, it has been around longer and was already cached, so it will still appear in the SERPs - the bots just won't see changes made to it because they can't crawl it.
You need to fix the disallow so the bots can crawl your site correctly and you should 301 your old page to the new one.
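At bottom, the 301 mapping your web team queued up is a lookup from the old id query parameter to the new path. Here is a minimal sketch of that logic, assuming a hand-maintained table - the REDIRECTS dict and redirect_target function are hypothetical names, and in practice this would live in the server's rewrite config, which responds with an HTTP 301 status and a Location header:

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical mapping from old location ids to the new URL paths.
REDIRECTS = {
    "uswd44": "/htfu/uswd44/reston/it-and-management-training",
    "uswd74": "/htfu/uswd74/alexandria/it-and-management-training",
}

def redirect_target(old_url):
    """Return the new path for an old location.aspx URL, or None if unmapped."""
    parsed = urlparse(old_url)
    if parsed.path != "/htfu/location.aspx":
        return None
    loc_id = parse_qs(parsed.query).get("id", [None])[0]
    return REDIRECTS.get(loc_id)

print(redirect_target("http://www2.learningtree.com/htfu/location.aspx?id=uswd44"))
# /htfu/uswd44/reston/it-and-management-training
```

Any old URL with an id not in the table falls through to None, which the server would handle as a normal 404 rather than a redirect.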