What to do with extremely high number of URLs on your site?
-
Here is the situation:
The site has tons of business and personal profiles, the information needed to be categorized as such directories were created in an attempt to keep the URL structure clean - so for example:
www.abc.com/product/um/name-here/city-name/state/lastname:3458765
Each profile has a unique ID#, and for some reason there needed to be a category for a user in this case /um/ stands for user name.
Webmaster tool steps to resolve state to use an rel=canonical which can be done for that directory /um/ but I am concerned about the bot not being able to find the other pages beyond that directory, like the profile name, city, state associated. So I guess my ultimate question is if I use rel=canonical will the rest of the content not get crawled or indexed as well?
-
This is not what the canonical tag is intended for.
The personal profiles will most likely be very low content dupes of each other like these which are indexed and should not be:
if pages deeper in that folder are good content worthy of being indexed then:
a) add noindex,follow to these profile pages
b) add index, follow to the deeper pages
that will keep the bots crawling the profile pages to the deeper folders with content you want indexed.
You can also disallow the /un/ (user name) folder and allow the deeper folders with robots.txt commands. We were just discussing this:
http://www.seomoz.org/q/allow-or-disallow-first-in-robots-txt
-
Does everything need to be indexed? If not, perhaps the personal profiles could be noindexed. Let the search engines crawl all of your content, but only have them index pages that provide value to the SERPs.\
Only use rel=canonical if the content on different URLs is the exact same. Using it incorrectly will cause content to not be indexed.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
New SEO manager needs help! Currently only about 15% of our live sitemap (~4 million url e-commerce site) is actually indexed in Google. What are best practices sitemaps for big sites with a lot of changing content?
In Google Search console 4,218,017 URLs submitted 402,035 URLs indexed what is the best way to troubleshoot? What is best guidance for sitemap indexation of large sites with a lot of changing content? view?usp=sharing
Technical SEO | | Hamish_TM1 -
Which url should i use? Thanks!
I have a question regarding how to use my url, we are a Swedish-based website which have the url, http://interimslösning.se/ (that contains the Swedish letter “ö”) so the url can also be written as http://xn--interimslsning-3pb.se/. Which of the following url should I use for my backlinks, http://interimslösning.se/ or http://xn--interimslsning-3pb.se/ ? What is the difference between them regarding SEO? And is it good or bad to use letter like "ö" or other characters like that in your url? I was thinking that maybe it is good to use the letter "ö" for local search optimization in sweden, but i don't know.. Thanks in advance! Greetings,
Technical SEO | | Kiwibananlime
Paul Linderoth0 -
Does my "spam" site affect my other sites on the same IP?
I have a link directory called Liberty Resource Directory. It's the main site on my dedicated IP, all my other sites are Addon domains on top of it. While exploring the new MOZ spam ranking I saw that LRD (Liberty Resource Directory) has a spam score of 9/17 and that Google penalizes 71% of sites with a similar score. Fair enough, thin content, bunch of follow links (there's over 2,000 links by now), no problem. That site isn't for Google, it's for me. Question, does that site (and linking to my own sites on it) negatively affect my other sites on the same IP? If so, by how much? Does a simple noindex fix that potential issues? Bonus: How does one go about going through hundreds of pages with thousands of links, built with raw, plain text HTML to change things to nofollow? =/
Technical SEO | | eglove0 -
Friendly URLs for MultiLingual Site
Hi, We have a multilingual website with both latin and non-latin characters, We are working on creating a friendly URL structure for the site. For the Latin languages can we use translated version of the URLs within the language folders? For example - www.site/cars www.site/fr/voitures www.site/es/autos
Technical SEO | | theLotter0 -
Why my site is not indexing in google
In google webmaster i have updated my sitemap in Mar 6th..There is around 22000 links..But google fetched only 5300 links for long time...
Technical SEO | | Rajesh.Chandran
I waited for 1 month till no improvement in google index..So apr6th we have uploaded new sitemap (1200 links totally)..,But only 4 links indexed in google ..
why google not indexing my urls? Is this affect our ranking in SERP? How many links are advisable to submit in sitemap for a website?0 -
Canonical URL Issue
Hi Everyone, I'm fairly new here and I've been browsing around for a good answer for an issue that is driving me nuts here. I tried to put the canonical url for my website and on the first 5 or 6 pages I added the following script SEOMoz reported that there was a problem with it. I spoke to another friend and he said that it looks like it's right and there is nothing wrong but still I get the same error. For the URL http://www.cacaniqueis.com.br/video-caca-niqueis.html I used the following: <link rel="<a class="attribute-value">canonical</a>" href="http://www.cacaniqueis.com.br/video-caca-niqueis.html" /> Is there anything wrong with it? Many thanks in advance for the attention to my question.. 🙂 Alex
Technical SEO | | influxmedia0 -
How is this site doing this?
http://www.meccabingo.com It shows a splash / promotion page yet you check the cache and it's the real homepage, they are doing this so they don't lose rankings but how are they redirecting users to that but Google is caching the real homepage? is it friendly? thanks!!
Technical SEO | | AdiRste0 -
Site revision
our site has complete redesign including site architecture, page url and page content (except domain). It looks like a new site. The old site has been indexed about thirty thousand results by google. now what should i do first?
Technical SEO | | jallenyang0