Max Amount Of HTML Pages In A Folder
-
What's the maximum number of HTML pages that one should put in a folder to get the best Googlebot crawl for SEO? I'm aware that there's a limit of 10,000 on most servers, but I was curious whether a smaller number of pages would be better for crawling and indexing purposes. I'm also curious about people's opinions on whether .jpg and .gif files should follow similar rules.
-
Thanks for all the input. Google does seem to crawl everything these days, so I've also come to the conclusion that if the files fit, they'll get crawled. Sitemaps, internal links and optimized images are all a must.
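For anyone who wants to cover the sitemap part for a large folder of static pages, here is a minimal sketch in Python. It assumes the pages are plain .html files on disk and that each file's path maps directly onto its URL; the function name build_sitemap and the example.com base URL are made up for illustration, so adjust them to your own setup.

```python
from pathlib import Path
from urllib.parse import quote
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50000  # per-file limit from the sitemaps.org protocol


def build_sitemap(root_dir, base_url):
    """Return sitemap XML (as a string) listing every .html file under root_dir.

    Assumption: a file at root_dir/a/b.html is served at base_url/a/b.html.
    """
    root = Path(root_dir)
    pages = sorted(p for p in root.rglob("*.html") if p.is_file())
    if len(pages) > MAX_URLS_PER_SITEMAP:
        raise ValueError("too many URLs for one sitemap; split it into several files")

    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        # Build the URL path with forward slashes and percent-encode it.
        rel = page.relative_to(root).as_posix()
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = base_url.rstrip("/") + "/" + quote(rel)
    return ET.tostring(urlset, encoding="unicode")
```

Calling build_sitemap("public_html", "https://www.example.com") returns the XML as a string, which you can save as sitemap.xml and submit. Images and other non-HTML files are skipped on purpose here.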
-
For images, you want to make sure they're optimized for the web: small file sizes for quick downloads, but still enough resolution to show the image clearly. Your graphic designer and a good graphic design program (Photoshop, GIMP, etc.) should help with this.
-
Hi,
As Ray-pp said, there isn't an optimal number of pages that is going to serve you better.
However, if you want to help Google discover more about your site and its most important pages, look to create a good internal linking strategy. This doesn't mean that you should just add footer or sidebar links, though. It means contextual links: text within a page that touches on a different subject and links to the appropriate page for it.
If you get this right, you can gain a lot in terms of Google understanding more about what you have to offer, and the links to primary pages can also lead to higher rankings in the SERPs for various phrases.
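If you want a rough check on which pages your internal links actually point to, something like the following Python sketch can help. It only counts how many distinct pages link to each internal URL; it does not judge whether a link is contextual or sits in a footer. The names LinkCollector and inbound_link_counts are made up for this example, and it assumes you already have each page's HTML loaded into a dict keyed by URL.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in one HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def inbound_link_counts(pages, site_host):
    """Count, for each internal URL, how many distinct pages link to it.

    pages maps a page URL to its HTML source; site_host is the hostname
    that counts as "internal" (for example "www.example.com").
    """
    counts = Counter()
    for page_url, html in pages.items():
        parser = LinkCollector()
        parser.feed(html)
        targets = set()
        for href in parser.hrefs:
            # Resolve relative links and drop any #fragment.
            target, _fragment = urldefrag(urljoin(page_url, href))
            if urlparse(target).netloc == site_host and target != page_url:
                targets.add(target)
        counts.update(targets)  # each linking page counts once per target
    return counts
```

Pages that you consider primary but that show up with few or no inbound links in the result are the ones to work into your content.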
-Andy
-
AFAIK there is no optimal number of files to include in a directory for maximum crawl effectiveness. If your folder legitimately warrants 5k HTML pages in a directory, then Google will crawl all the pages. Make sure to create value-added pages with high-quality content - Google will recognize them and crawl them as appropriate.
If you have the option, use your Google Webmaster Tools account to adjust crawl settings. Once your site reaches a certain size, Google will take over crawl rate settings for you.
Related Questions
-
Article page canonicalization
Hey there, A client rents all kinds of party articles, like plates, bowls, etc. Currently, all his article pages have canonicals to their parent category pages, supposedly to have any page value flow to these category pages (which are much more relevant for SEO). Is there anyone who agrees with this method? I think a noindex,follow would be a better measure to prevent Google from indexing all these 'low value' article pages. Besides, a canonical should indicate that page A and B are (almost) identical, which they most certainly are not in this case. What are your thoughts?
Intermediate & Advanced SEO | | Adriaan.Multiply0 -
How does Googlebot evaluate performance/page speed on Isomorphic/Single Page Applications?
I'm curious how Google evaluates pagespeed for SPAs. Initial payloads are inherently large (resulting in 5+ second load times), but subsequent requests are lightning fast, as these requests are handled by JS fetching data from the backend. Does Google evaluate pages on a URL-by-URL basis, looking at the initial payload (and "slow"-ish load time) for each? Or do they load the initial JS+HTML and then continue to crawl from there? Another way of putting it: is Googlebot essentially "refreshing" for each page and therefore associating each URL with a higher load time? Or will pages that are crawled after the initial payload benefit from the speedier load time? Any insight (or speculation) would be much appreciated.
Intermediate & Advanced SEO | | mothner1 -
Help! Website Page Structure.
Hi there, I have a cupcake website; www.cupcakesdelivered.com.au To date, we have sold only regular cupcakes. Moving forward, we are about to start selling lots of different sorts of cupcakes and want to categorise them - i.e.; sport cupcakes, corporate cupcakes, movie-themed cupcakes etc. I am looking for a recommendation on how best to structure this in terms of pages / domains / subdomains etc, so as to best support SEO. Your help would be greatly appreciated!! Thank you, Laura.
Intermediate & Advanced SEO | | cupcakesdelivered0 -
Does Google still not index hashtag links? No chance to get a search result that leads directly to a section of a page, or to one of numerous hashtag pages in a single HTML page?
Does Google still not index hashtag links? Is there no chance to get a search result that leads directly to a section of a page, or to one of numerous hashtag pages in a single HTML page? If I have 4 or 5 different hashtag link sections consolidated into one HTML page, is there no chance to get one of the hashtag sections to appear as a search result? For example, if under one single-page travel guide I have two essential sections, #Attractions and #Visa, is there no chance to direct search queries for Visa directly to the #Visa section? Thanks for any help
Intermediate & Advanced SEO | | Muhammad_Jabali0 -
Sudden Change In Indexed Pages
Every week I check the number of pages indexed by Google using the "site:" operator. I have set up a permanent redirect from all the non-www pages to www pages. When I used to run the query, the non-www pages (i.e. site:mysite.com) would have 12K results and the www pages (i.e. site:www.mysite.com) would have about 36K. The past few days, this has reversed! I get 12K for www pages, and 36K for non-www pages. Things I have changed: I have added canonical URL links in the header, all with www in the URL. My questions: Is this cause for concern? Can anyone explain this to me?
Intermediate & Advanced SEO | | inhouseseo0 -
Corporate pages and SEO help
We own and operate more than two dozen education-related sites. The business team is attempting to standardize some parts of our site hierarchy so that our sitemap.php, about.php, privacy.php and contact.php are all at the root directory. Our sitemap.php is generated by our sitemap.xml files, which are generated from our URLlist.txt files. I need to provide some feedback on this initiative. I'm worried about adding more stand-alone pages to our root directory, and as part of a separate optimization in the future I was planning to suggest we group the "privacy", "about" and "contact" pages in a separate folder. We generally try to put our most important pages/directories for SEO in the root, as our homepages pass a lot of link juice and have high authority. We do not invest SEO time into optimizing these pages as they're not pages we're trying to rank for, and I've already been looking into even no-following all links to them from our footer, sitemap, etc. I know that adding these "corporate" pages to a site is usually a standard part of the design process, but is there any SEO benefit to having them at the root? And along the same lines, is there any SEO harm in having unimportant pages at the root? What do you guys think out there in Moz land?
Intermediate & Advanced SEO | | Eric_edvisors0 -
Our site is receiving traffic for both .com/page and .com/page/ with the trailing slash.
Our site is receiving traffic for both .com/page and .com/page/ with the trailing slash. Should we rewrite to just the trailing slash, or to the version without it, because of duplicates? The other question is: if we do a rewrite, Google has indexed some pages with the slash and some without - I am assuming we will lose rank for one of them once we do the rewrite, correct?
Intermediate & Advanced SEO | | Profero0 -
How To 301 Redirect .html pages
I need to redirect a page/URL that is purely .html to a new location. I don't know how to do this. All the redirects I can find are for server-side code pages (.php, .aspx, etc.). From my understanding I can't put a server-side redirect in a .html file. I am hosting on a Microsoft server; however, the new page I am redirecting to is .php. I am running some WordPress (.php) files on the server. I need to make it redirect before the old page loads so visitors don't start reading something that is about to get redirected. Can someone please help me?
Intermediate & Advanced SEO | | MyNet0