For large sites, best practices for pages hidden behind internal search?
-
If a website has 1M+ pages, with most of them being hidden behind an internal search, what's the best way to get pages included in an engine's index?
Does a direct clickpath to those pages need to exist from the homepage or other major hub pages on the site?
Is submitting an XML sitemap enough?
-
Hello Vlevit,
You could do several things. I recommend giving Google your product feed, which should accomplish your goals. Another possible solution would be to make those search pages noindex,follow so they don't end up getting indexed, but Google can still use them for discovery.
Thanks for explaining the situation.
Below is more on submitting product feeds. It is for Google Product Search, but I would imagine the "link" field where you put the URL to your product detail page will help those pages get indexed in the standard results:
http://support.google.com/merchants/bin/answer.py?hl=en&answer=188494#US
Everett
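To illustrate the noindex,follow suggestion above, here is a minimal Python sketch of how a site might choose the robots directive per URL. It assumes internal search results live under a `/search` path, which is a hypothetical convention and not something stated in this thread. The function names are illustrative only.

```python
from urllib.parse import urlsplit

# Assumption: internal search results are served under this path prefix.
SEARCH_PREFIX = "/search"


def robots_directive(url: str) -> str:
    """Return the robots directive to use for a given URL."""
    path = urlsplit(url).path
    if path == SEARCH_PREFIX or path.startswith(SEARCH_PREFIX + "/"):
        # Keep search result pages out of the index,
        # but still let crawlers follow the links on them.
        return "noindex, follow"
    # Product and category pages stay indexable.
    return "index, follow"


def robots_meta_tag(url: str) -> str:
    """Render the directive as an HTML meta tag for the page <head>."""
    return f'<meta name="robots" content="{robots_directive(url)}">'
```

The same directive string could also be sent in an `X-Robots-Tag` HTTP response header instead of a meta tag, if that is easier to wire into the site.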
-
Everett, thanks for your reply. I understand the problems with showing internal search pages. I'm not looking to have internal search results indexed, just the pages that the results link to. We're in eCommerce.
I was under the impression that there was a clever way to have the individual product pages indexed without establishing a direct click path, but best practices recommend otherwise.
Question answered. Thanks all for your help.
-
Hello Vlevit,
If you can be more specific, we may be able to be of more help. Google doesn't want you to show internal search result pages, but if this is a different type of situation there may be an exception. Are these search result pages, product pages, category pages, content pages? Is it an eCommerce site, a community, a content site?
Generally speaking, 1M+ pages with no links going into them and content that is either sparse/thin or partially/fully duplicated on other similar pages (like a search for widgets and a search for green widgets showing overlapping content) is exactly the type of thing that will get you in hot water, and it could affect even the rankings of your home page.
Do you feel like your question has been answered or would you like to be more specific about your site and goals?
Cheers,
Everett
-
This is what I was assuming, but was wondering if there was a clever way around creating direct click paths to those pages, while still maintaining their importance to the site. Thanks for the info.
-
Make sure they are part of the actual structure of your website, not just part of search. Meaning, you have to have links pointing at them. You will also want to make sure that those pages have value.
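One way to check whether pages really are part of the site structure is to walk the internal link graph from the homepage and see which pages are never reached. The sketch below is only an illustration: it assumes you already have a crawl export as a mapping from each page to the pages it links to, and the names `click_depths` and `orphan_pages` are made up for this example.

```python
from collections import deque


def click_depths(links: dict[str, list[str]], home: str) -> dict[str, int]:
    """Breadth-first search from the homepage.

    Returns the minimum number of clicks needed to reach each linked page.
    """
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def orphan_pages(links: dict[str, list[str]], home: str, all_pages: set[str]) -> list[str]:
    """Pages that exist on the site but have no click path from the homepage."""
    reachable = click_depths(links, home)
    return sorted(page for page in all_pages if page not in reachable)
```

Pages returned by `orphan_pages` are the ones that only internal search can reach, so they are the candidates for new category or hub links.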
-
Hi vlevit,
The best practice is to have a direct click path from the index page. Something like: index -> category (filter) -> subcategory (filter) -> page/product. In some cases XML sitemaps can also help with indexing.
BUT beware of overly large XML sitemaps. Try to create more than one sitemap and group the URLs where possible.
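As a rough sketch of splitting one large URL list into several sitemaps, the Python below chunks URLs into files of at most 50,000 entries (the per-file limit in the sitemaps.org protocol) and builds a sitemap index that points at them. The file names and the `base_url` parameter are assumptions for illustration. It chunks by count only, so grouping by category would need the URLs sorted or partitioned first.

```python
from xml.sax.saxutils import escape

MAX_URLS_PER_SITEMAP = 50_000  # per-file limit in the sitemaps.org protocol
NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XML_HEADER = '<?xml version="1.0" encoding="UTF-8"?>\n'


def chunk(urls, size=MAX_URLS_PER_SITEMAP):
    """Yield consecutive slices of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def build_sitemap(urls):
    """Render one <urlset> document for a list of page URLs."""
    entries = "".join(f"  <url><loc>{escape(u)}</loc></url>\n" for u in urls)
    return f'{XML_HEADER}<urlset xmlns="{NS}">\n{entries}</urlset>\n'


def build_sitemap_index(sitemap_urls):
    """Render the <sitemapindex> document that lists every sitemap file."""
    entries = "".join(
        f"  <sitemap><loc>{escape(u)}</loc></sitemap>\n" for u in sitemap_urls
    )
    return f'{XML_HEADER}<sitemapindex xmlns="{NS}">\n{entries}</sitemapindex>\n'


def build_all(urls, base_url, size=MAX_URLS_PER_SITEMAP):
    """Return a dict of file name -> XML text, including the index file."""
    files = {}
    for i, part in enumerate(chunk(urls, size), start=1):
        files[f"sitemap-{i}.xml"] = build_sitemap(part)
    # Build the index from the sitemap files only, then add it to the result.
    index = build_sitemap_index([f"{base_url}/{name}" for name in files])
    files["sitemap-index.xml"] = index
    return files
```

Only the index file then needs to be submitted to the search engine, and each child sitemap can be regenerated independently when its group of pages changes.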
A few very good resources can be found under the next links:
http://www.seomoz.org/ugc/solving-new-content-indexation-issues-for-large-b2b-websites
http://www.seomoz.org/qa/view/29009/sitemaps-management-for-big-sites-tens-of-millions-of-pages
I hope it helps,
Istvan