Why are crawlers not picking up these pages?
-
Hi there,
I've been asked to audit a new subdomain for a travel company. It's all a bit messy, so it's going to take some time to remedy. However, one thing I couldn't understand was the low number of pages appearing in certain crawlers.
The subdomain has many pages: a homepage, category pages, then product pages. Unfortunately, tools like Screaming Frog and xml-sitemaps.com are only picking up 19 pages and I can't figure out why. Google has so far indexed around 90 pages. This is by no means all of them, but that's probably because the domain is new and there is no sitemap yet.
After looking at the crawl results, only the homepage and category (continent) pages are showing; none of the product pages are. For example, tours.statravel.co.uk/trip/Amsterdam_Kings_Day_(Start_London_end_London)-COCCKDM11 is not appearing in the crawl results. After reviewing the source code, I can't see anything that would prevent this page being crawled. Am I missing something?
At the moment, the crawl should be picking up 400+ product pages, but it's not picking up any.
Thanks
-
Hi,
I would think it is the JavaScript used on the pages. Google can, in theory, render the page as a browser would; Screaming Frog and similar tools on the whole cannot. If you visit the homepage with JS turned off, you see a fairly empty page with a list of links (region, activity, country), which are the same links that Screaming Frog is picking up. If you go into one of the search results pages with JS turned off, you see almost nothing at all. Google is evidently doing a better job of crawling the JS content! A solution would be to present the data in a simpler, crawlable format for non-JS-enabled browsers, but that is (probably a big) conversation with your developers.
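To illustrate the point, here is a minimal Python sketch of what a crawler that does not execute JavaScript would "see". It is not how Screaming Frog works internally; it just parses raw HTML with the standard library and lists the anchor links it finds. The markup, the `tours.example.com` host, and the `/trip/example-trip` path are all made up for the example, assuming a page that injects its product links from a script.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags in raw, unrendered HTML."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, value))


def links_without_js(html, base_url):
    """Return the links visible in the HTML source, with no script execution."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical page: category links are in the markup, while the
# product link only exists after the script runs in a browser.
SAMPLE_HTML = """
<html><body>
  <a href="/region/europe">Europe</a>
  <a href="/region/asia">Asia</a>
  <div id="results"></div>
  <script>
    document.getElementById("results").innerHTML =
      '<a href="/trip/example-trip">Example trip</a>';
  </script>
</body></html>
"""

found = links_without_js(SAMPLE_HTML, "https://tours.example.com/")
print(found)
```

Only the two category links are reported, because the product link lives inside script text rather than in the markup. To try this on a live page, you could pass in HTML fetched with `urllib.request.urlopen` instead of the sample string and compare the result with what you see in a browser.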