Is it worth optimising Google's crawl?
-
Hello,
I have an ecommerce site, and all of its pages are crawled and indexed by Google.
However, some pages are reachable through multiple URLs, for example www.sitename.com/product-name.html and www.sitename.com/category/product-name.html
All of these pages carry a canonical tag pointing to the simplest URL, so Google indexes only one version. The duplicate URLs are not indexed, but Google still crawls them.
My question is: is there any benefit in stopping Google from crawling these pages?
My point is that Google crawls around 1,500 pages a day on my site, but there are only 800 real pages, and they are all indexed. There is no particular problem at the moment, so is it worth changing anything?
Thanks
-
Hi!
Have you noindexed the pages too? If the crawling concerns you, that may help, and it would at least give Google another signal not to crawl those pages.
Obviously it's not a catch-all, as there's only so much you can do to tell Google not to crawl a page. If the alternative URL is linked to internally (which it sounds like it is), Google will often crawl it anyway despite the canonical, because the internal links suggest the page is important to your site.
It may be worth testing on a few pages to see if it has an impact.
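To make the internal-linking point concrete, here is a minimal Python sketch (standard library only) that lists the internal links on a page that point at a non-preferred URL. The names `LinkCollector`, `links_to_duplicates` and `preferred_paths` are made up for illustration, and it assumes you can supply the set of preferred paths yourself.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def links_to_duplicates(html, page_url, preferred_paths):
    """Return internal links whose path is not in preferred_paths.

    preferred_paths is an assumption: a set of the URL paths you want
    Google to treat as the single version of each page.
    """
    collector = LinkCollector()
    collector.feed(html)
    site = urlparse(page_url).netloc
    flagged = []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        parts = urlparse(absolute)
        if parts.netloc == site and parts.path not in preferred_paths:
            flagged.append(absolute)
    return flagged
```

Running this over a handful of category pages should show whether the site's own navigation is what keeps sending Googlebot to the duplicate URLs.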
-
Hi there!
In my experience, the best results I was ever able to achieve for a client came when we consolidated all URLs into a single-URL solution. Canonicals are amazing, no doubt. But I've seen a canonical structure being ignored in cases where it wasn't 100% 'correct.'
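As a rough way to spot a canonical setup that isn't quite 'correct', here is a small Python sketch (standard library only) that checks a page's HTML for a missing, repeated, or unexpected canonical tag. `CanonicalCollector` and `canonical_problems` are hypothetical names, and the check is deliberately simple: it only looks at `<link rel="canonical">` in the HTML, not at HTTP headers.

```python
from html.parser import HTMLParser


class CanonicalCollector(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and (attrs.get("rel") or "").lower() == "canonical":
            self.canonicals.append(attrs.get("href"))


def canonical_problems(html, expected_url):
    """Return a list of human-readable problems; an empty list means OK."""
    collector = CanonicalCollector()
    collector.feed(html)
    found = collector.canonicals
    if not found:
        return ["no canonical tag"]
    problems = []
    if len(found) > 1:
        problems.append("multiple canonical tags")
    for href in found:
        if href != expected_url:
            problems.append(
                f"canonical points to {href!r}, expected {expected_url!r}"
            )
    return problems
```

You would feed it the HTML of each duplicate URL together with the preferred URL you expect it to declare.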
If there is a way that you can have your website navigation & internal/XML sitemap reinforce your preferred URL, that would certainly reduce the number of URLs Google would crawl. Then, if you permanently (301) redirect all the now non-navigable URLs to the single preferred URL, you should see a significant boost in traffic (from consolidating all of the authority into a single page, now reinforced throughout your entire website).
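For the redirect step, a minimal Python sketch of how a 301 map could be built is below. It assumes, based only on the example URLs in the question, that the preferred URL is the product file name directly under the site root; `preferred_url` and `redirect_map` are hypothetical helpers, and the resulting map would still need to be translated into whatever redirect rules your platform supports.

```python
import posixpath
from urllib.parse import urlparse, urlunparse


def preferred_url(url):
    """Map a product URL to its assumed preferred form.

    Assumption: product pages end in ".html" and the preferred URL is
    the file name placed directly under the site root. Anything else
    (category pages, the homepage) is returned unchanged.
    """
    parts = urlparse(url)
    filename = posixpath.basename(parts.path)
    if not filename.endswith(".html"):
        return url
    return urlunparse(parts._replace(path="/" + filename))


def redirect_map(urls):
    """Return {duplicate_url: preferred_url} for URLs that need a 301."""
    mapping = {}
    for url in urls:
        target = preferred_url(url)
        if target != url:
            mapping[url] = target
    return mapping
```

Only the duplicates end up in the map, so the preferred URLs themselves are never redirected.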
If that's not possible, and you have to keep multiple URLs within your site because of budget or platform constraints, then yes, let Google crawl them. Otherwise Google won't be able to see the canonical tags on them.
So in short: If you have a means to reduce the number of duplicates and redirect them - awesome. If you don't have a means to reduce duplicates, opening them up to Google is good, too.
For more information on making sure your canonical structure is set up properly, check out this Moz blog post: https://moz.com/blog/rel-confused-answers-to-your-rel-canonical-questions