Is it worth optimising Google's crawl?
-
Hello,
I have an ecommerce site with all pages crawled and indexed by Google.
However, some pages are reachable at multiple URLs, for example www.sitename.com/product-name.html and www.sitename.com/category/product-name.html
Each of these pages has a canonical tag pointing to the simplest URL, so Google indexes only one version. The duplicate URLs are not indexed, but Google still crawls them.
My question is: is there any benefit to stopping Google from crawling these pages?
Google crawls around 1,500 pages a day on my site, but there are only 800 real pages, and they are all indexed. There is no particular issue at the moment, so is it worth changing anything?
Thanks
-
Hi!
Have you noindexed the pages too? If the crawling concerns you, that may help; at the very least it gives Google another signal not to crawl those pages.
Obviously it's not a catch-all, as there's only so much you can do to tell Google not to crawl a page. If the alternative URL is linked to internally (which it sounds like it is), Google will sometimes crawl it anyway, even though it carries a canonical, because the internal links signal that the page is important to your site.
It may be worth testing this on a few pages to see if it has an impact.
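To see whether internal links are what keeps the duplicate URLs in Google's crawl, you could audit a page's links against your list of preferred URLs. This is only a minimal sketch: `LinkCollector` and `links_to_duplicates` are hypothetical names, and it assumes you already have the page HTML and a set of canonical URLs to hand.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collects the absolute URL of every <a href="..."> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, href))


def links_to_duplicates(html, base_url, canonical_urls):
    """Return internal links whose target is not one of the preferred URLs."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    host = urlsplit(base_url).netloc
    return [
        link
        for link in collector.links
        if urlsplit(link).netloc == host and link not in canonical_urls
    ]
```

Any URL this returns is an internal link you could repoint at the preferred version.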
-
Hi there!
In my experience, the best results I ever achieved for a client came when we consolidated all URLs into a single URL per page. Canonicals are amazing, no doubt. But I've seen canonicals ignored in cases where the canonical structure isn't 100% 'correct.'
If you can make your website navigation and your internal/XML sitemap reinforce your preferred URL, that would certainly reduce the number of URLs Google crawls. Then, if you permanently (301) redirect all the now non-navigable URLs to the single preferred URL, you should see a significant boost in traffic (from consolidating all of the authority into a single page, now reinforced throughout your entire website).
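As a rough illustration of the redirect step, here is a sketch that builds a 301 map from duplicate URLs to the preferred one. It assumes, as in the example from the question, that the preferred URL is simply the final path segment at the site root; `preferred_url` and `build_redirect_map` are hypothetical helper names, and your own URL scheme may need a different rule.

```python
from urllib.parse import urlsplit, urlunsplit


def preferred_url(url: str) -> str:
    """Drop any category prefix so only the final path segment remains."""
    parts = urlsplit(url)
    last_segment = parts.path.rstrip("/").rsplit("/", 1)[-1]
    return urlunsplit((parts.scheme, parts.netloc, "/" + last_segment, "", ""))


def build_redirect_map(urls):
    """Return {duplicate_url: preferred_url} for URLs that are not already preferred."""
    redirects = {}
    for url in urls:
        target = preferred_url(url)
        if target != url:
            redirects[url] = target
    return redirects
```

The resulting map could then be turned into permanent redirect rules in whatever form your server or platform accepts.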
If that's not possible, and you have to keep multiple URLs on your site because of budget or platform constraints, then yes, let Google crawl them. Otherwise the algorithm won't be able to see the canonical tag on them.
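If you go the canonical route, a quick check that robots.txt isn't hiding the duplicates from Googlebot can be sketched with Python's standard `urllib.robotparser`. The function name `blocked_for_googlebot` and the sample rules are made up for illustration, and Python's parser may not match Google's own robots.txt handling in every edge case.

```python
from urllib.robotparser import RobotFileParser


def blocked_for_googlebot(robots_txt: str, urls):
    """Return the URLs that the given robots.txt rules disallow for Googlebot."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch("Googlebot", url)]


# Hypothetical rules that would stop Google from ever seeing the canonical tag
# on the category-prefixed URLs.
sample_rules = "User-agent: *\nDisallow: /category/\n"
sample_urls = [
    "https://www.sitename.com/category/product-name.html",
    "https://www.sitename.com/product-name.html",
]
blocked = blocked_for_googlebot(sample_rules, sample_urls)
```

Any duplicate URL that shows up in `blocked` cannot pass its canonical signal to Google, because the page is never fetched.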
So in short: If you have a means to reduce the number of duplicates and redirect them - awesome. If you don't have a means to reduce duplicates, opening them up to Google is good, too.
For more information on making sure your canonical structure is set up properly, check out this Moz blog post: https://moz.com/blog/rel-confused-answers-to-your-rel-canonical-questions