Crawl Budget on Noindex Follow
-
We have a set of crawled product search pages where Page 1 of each paginated set is crawled and indexed, and Page 2 onward is set to noindex, noarchive, follow, as we want the links followed through to the product pages themselves. (All product pages have canonicals and unique URLs.) Our search results will keep expanding these sets, so Google will have more links to follow on our website even though they will all be noindex pages. Will this impact our crawl budget, and could it also affect our rankings?
Page 1 - Crawled Indexed and Followed
Page 2 onward - Crawled No-index No-Archive Followed
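For reference, a minimal sketch of how that scheme maps onto meta robots tags; the helper name and exact attribute ordering are illustrative, not taken from any real template:

```python
def robots_meta(page_number):
    """Return the meta robots tag for a page in the paginated set.

    Page 1 is indexable; Page 2 onward is noindex, noarchive, follow,
    so crawlers still follow links through to the product pages.
    """
    if page_number == 1:
        # Page 1: crawled, indexed, links followed (the default, stated explicitly)
        return '<meta name="robots" content="index, follow">'
    # Page 2 onward: kept out of the index and the cache, links still followed
    return '<meta name="robots" content="noindex, noarchive, follow">'
```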
Thoughts?
Thanks,
Phil G
-
Check out Google's latest "handling" of pagination using rel=canonical, rel=next + rel=prev:
http://www.youtube.com/watch?v=njn8uXTWiGg

You can now:
Page 1:
- canonical: page 1
- next: page 2
Page 2:
- canonical: page 2
- next: page 3
- prev: page 1
Page 3:
- canonical: page 3
- next: page 4
- prev: page 2
Page 4 (say, the last page):
- canonical: page 4
- prev: page 3
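The canonical/next/prev pattern above can be sketched as a small helper that emits the head tags for page n of a set; the URL pattern, function name, and default base URL here are assumptions for illustration:

```python
def pagination_tags(page, last_page, base_url="http://example.com/products?page="):
    """Return the <head> link tags for one page of a paginated set.

    Each page gets a self-referencing canonical; rel=prev/next point at
    the neighbouring pages and are omitted at the ends of the set.
    """
    tags = [f'<link rel="canonical" href="{base_url}{page}">']
    if page > 1:
        # Every page except the first points back to its predecessor
        tags.append(f'<link rel="prev" href="{base_url}{page - 1}">')
    if page < last_page:
        # Every page except the last points forward to its successor
        tags.append(f'<link rel="next" href="{base_url}{page + 1}">')
    return tags
```

Page 1 then carries only canonical + next, and the last page only canonical + prev, matching the layout above.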
Another option is to have a "view all" page that lists all products; you can then point a canonical at that page from every page within the set.
Hope that helps
Related Questions
-
Internal Clicks and CTR: Is rel=canonical better than noindex in this case?
I currently have a search facility on a website that noindexes the search results, which is fine. But when you click one of the results, it takes you to a product page that is also noindexed because it has URL params, e.g. https://www.visitliverpool.com/accommodation/albion-guest-house-p305431?bookurl=%2Fbook-online%3Fstage%3Dunitsel%26isostartdate%3D2017-10-31%26nights%3D1%26roomReq_1_adults%3D1%26NumRoomReqs%3D1%26fuzzy%3D0%26product%3D305431 The product also exists at this URL, which is indexed: https://www.visitliverpool.com/accommodation/albion-guest-house-p305431 Should I canonicalise in this instance instead of noindexing? Does CTR apply to internal links? That is, does Search Console consider internal clicks? Are internal clicks a ranking factor?
Intermediate & Advanced SEO | Andrew-SEO
-
Should I "NoIndex" Pages with Almost No Unique Content?
I have a real estate site with MLS data (real estate listings shared across the Internet by Realtors, which means the data already exists across the Internet). The important pages are the "MLS result pages" - the pages showing thumbnail pictures of all properties for sale in a given region or neighborhood. One MLS result page may be for a region and another for a neighborhood within the region: example.com/region-name and example.com/region-name/neighborhood-name. So all data on the neighborhood page will be 100% data from the region URL. Question: would it make sense to "NoIndex" such a neighborhood page, since it would reduce the number of non-unique pages on my site and also reduce the amount of data that could be seen as duplicate? Will my region page have a better chance of ranking if I "NoIndex" the neighborhood page? Or is Google so advanced that it knows Realtors share MLS data and, worst case, simply gives such pages very low value without impacting the ranking of other pages on the website? I am aware I can work on making these MLS result pages more unique, but that isn't what my question is about. Thank you.
Intermediate & Advanced SEO | khi5
-
How to stop pages being crawled from an XML feed?
We have a site with an XML feed going out to many other sites. The feed is behind a password-protected page, so we cannot use a canonical link to point back to the original URL. How do we stop the pages being crawled on all of the sites using the XML feed? With hundreds of sites using it after launch, it will cause instant duplicate content issues. Thanks
Intermediate & Advanced SEO | jazavide
-
Should I NoIndex NoFollow my BUYNOW page?
Hi. As stated in the title, I am wondering if I should noindex, nofollow my shopping cart page - it is actually a "buy now" page that receives the item ID in the URL, with only one item per purchase. I received duplication errors, so I have now added a canonical, and I wonder if I should simply remove the page from the index altogether. Thanks
Intermediate & Advanced SEO | BeytzNet
-
When I try creating a sitemap, it doesn't crawl my entire site.
We just launched a new Ruby app (it used to be a WordPress blog) at http://www.thesquarefoot.com. We have not had time to build an auto-generated sitemap, so I went to a few different websites with free sitemap generation tools. Most of them index up to 100 or 500 URLs. Our site has over 1,000 individual listings and 3 landing pages, so when I put our URL into a sitemap creator, it should find all of those pages. However, that is not happening; only four pages seem to be seen by the crawlers:
http://www.thesquarefoot.com/
http://www.thesquarefoot.com/users/sign_in
http://www.thesquarefoot.com/search
http://www.thesquarefoot.com/renters/sign_up
This worries me that when Google comes to crawl our site, these are the only pages it will see as well. Our robots.txt is blank, so there should be nothing stopping the crawlers from going through the entire site. Here is an example of one of the thousands of pages not being crawled: http://www.thesquarefoot.com/listings/Houston/TX/77098/Central_Houston/3910_Kirby_Dr/Suite_204 Any help would be much appreciated!
Intermediate & Advanced SEO | TheSquareFoot
-
HELP - got the following message: Google Webmaster Tools notice of detected unnatural links
Hi all. While trying to grow, we used several freelancers and small companies for guest blogging, article submissions, etc. We have lost about 90% of our traffic since our peak in December. We don't know if it is related, but we got the following message last week: "Google Webmaster Tools notice of detected unnatural links to www.domain.com". Is it related (getting this message after two months of losing traffic)? What should we do? (P.S. We fired most of the companies months ago once we noticed they used bad methods. We didn't believe it could hurt us - we just thought it would be useless...) Please help...
Intermediate & Advanced SEO | BeytzNet
-
NOINDEX content still showing in SERPs after 2 months
I have a website that was likely hit by Panda or some other algorithm change; the hit finally occurred in September of 2011. In December my developer set the following meta tag on all pages that do not have unique content: <meta name="robots" content="NOINDEX" /> It's been 2 months now and I feel I've been patient, but Google is still showing 10,000+ pages when I do a search for site:http://www.mydomain.com. I am looking for a quicker solution. Adding this many pages to robots.txt does not seem like a sound option, and the pages have been removed from the sitemap (for about a month now). I am trying to determine the best of the following options, or find better ones. (1) 301 all the pages I want out of the index to a single URL based on the page type (location and product). The 301 worries me a bit because I'd have about 10,000 pages all 301ing to one or two URLs; however, I'd get some link juice to those pages, right? (2) Issue an HTTP 404 code on all the pages I want out of the index. The 404 seems like the safest bet, but I am wondering if Google suddenly seeing 10,000+ 404 errors will have a negative impact on my site. (3) Issue an HTTP 410 code on all pages I want out of the index. I've never used the 410 code, and while most of those pages are never coming back, eventually I will bring a small percentage back online as I add fresh new content. This one scares me the most, but I am interested if anyone has ever used a 410. Please advise, and thanks for reading.
Intermediate & Advanced SEO | NormanNewsome
-
Old pages still crawled by search engines returning 404s: better to 301 or block with robots.txt?
Hello guys. A client of ours has thousands of pages returning 404, visible in Google Webmaster Tools. These are all old pages that no longer exist, but Google keeps detecting them. They belong to sections of the site that no longer exist, are not linked externally, and didn't provide much value even when they existed. What do you suggest we do: (a) nothing, (b) redirect all these URLs/folders to the homepage through a 301, or (c) block these pages through robots.txt? Are we wasting part of the crawl budget allotted by search engines by doing nothing? Thx
Intermediate & Advanced SEO | H-FARM