Why the number of crawled pages is so low¿?
-
Hi, my website is www.theprinterdepo.com and I have been in seomoz pro for 2 months.
When it started it crawled 10000 pages, then I modified robots.txt to disallow some specific parameters in the pages to be crawled.
We have about 3500 products, so thhe number of crawled pages should be close to that number
In the last crawl, it shows only 1700, What should I do?
-
Hi levelencia1,
This could have been caused by many factors. Was the robots.txt the only change you made? Other things that could have caused it could have been meta "noindex" tags, nofollow links, or broken navigation structures.
In rare instances, sometimes rogerbot has a hiccup.
Let us know if things return to normal on your next crawl. If you have any difficulties feel free to contact the help team (help@seomoz.org) and they should be able to get things straightened out.
Best of luck with your SEO!
-
levalencia1
Still don't know what you wanted to accomplish with Robots re: I modified robots.txt to disallow some specific parameters in the pages to be crawled.
Go to GWMT: http://support.google.com/webmasters/bin/answer.py?hl=en&answer=156449&from=35237&rd=1
This will allow you to determine what your robots.txt accomplished or not:
The Test robots.txt tool will show you if your robots.txt file is accidentally blocking Googlebot from a file or directory on your site, or if it's permitting Googlebot to crawl files that should not appear on the web. When you enter the text of a proposed robots.txt file, the tool reads it in the same way Googlebot does, and lists the effects of the file and any problems found.
Hope it helps you out,
-
Sorry, This one got lost. I will look at it in the a.m. and give you the feedback. Have you run anything like Xenu on the site? Do you know what is not showing up that would be outside of the robots.txt?
-
Sorry, This one got lost. I will look at it in the a.m. and give you the feedback. Have you run anything like Xenu on the site? Do you know what is not showing up that would be outside of the robots.txt?
-
ANY IDEA?
-
this is my robots.txt
User-agent: * Disallow: */product_compare/* Disallow: *dir=* Disallow: *order=*
-
levalencia1
What did you disallow?
Are there specific categories or products you know are missing?
Is there a specific sub directory(s) that is missing?
What is it you wanted to block with robots?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Is my page being indexed?
To put you all in context, here is the situation, I have pages that are only accessible via an intern search tool that shows the best results for the request. Let's say i want to see the result on page 2, the page 2 will have a request in the url like this: ?p=2&s=12&lang=1&seed=3688 The situation is that we've disallowed every URL's that contains a "?" in the robots.txt file which means that Google doesn't crawl the page 2,3,4 and so on. If a page is only accessible via page 2, do you think Google will be able to access it? The url of the page is included in the sitemap. Thank you in advance for the help!
Technical SEO | | alexrbrg0 -
Pages Not Getting Indexed
Hey there I have a website with pretty much 3-4 pages. All of them had a canonical pointing to one page and the same content ( which happened by mistake ) I removed the canonical URL and added one pointing to its page. Also, I added the original content that was supposed to be there to begin with. It's been weeks but those pages are not getting indexed on the SERPS while the one that they use to point with the canonical does.
Technical SEO | | AngelosS0 -
New pages need to be crawled & indexed
Hi there, When you add pages to a site, do you need to re-generate an XML site map and re-submit to Google/Bing? I see the option in Google Webmaster Tools under the "fetch as Google tool" to submit individual pages for indexing, which I am doing right now. Thanks,
Technical SEO | | SSFCU
Sarah0 -
Website not crawled
i added website www.nsale.in in add campaign, it shows only 1 page crawled. but its working fine for other sites, any idea why it failed ?
Technical SEO | | Dhinesh0 -
Pages crawled is only 23 even after 8 days??
Hello all, My site www.practo.com has at least more than 500+ pages. Still seomoz says its only 23 crawled till date even after 8 -10 days of the trial period. Now most of the pages on my site are in-site search pages. They appear when you search relevant terms with combinations etc. Is that hindering the moz crawler to look for those pages? Aditya
Technical SEO | | shanky11 -
What can be the cause of my inner pages ranking higher than my home page?
If you do a search for my own company name or products we sell the inner pages rank higher than the homepage and if you do a search for exact content from my home page my home page doesn't show in the results. My homepage shows when you do a site: search so not sure what is causing this.
Technical SEO | | deciph220 -
Problem wth Crawling
Hello, I have a website http://digitaldiscovery.eu here in SEOmoz. Its strange since the last week SEOmoz is crawling only one page! And before it was crwaling all the pages. Whats happening? Help SEOmoz! :))
Technical SEO | | PedroM0 -
Too Many On-Page Links
Hello. My Seomoz report this week tells me that I have about 500 pages with Too Many On-Page Links One of the examples is this one: https://www.theprinterdepo.com/hp-9000mfp-refurbished-printer (104 links) If you check, all our products have a RELATED products section and in some of them the related products can be UP to 40 Products. I wonder how can I solve this. I thought that putting nofollow on the links of the related products might fix all of these warnings? Putting NOFOLLOW does not affect SEO?
Technical SEO | | levalencia10