Reason for robots.txt file blocking products on category pages?
-
Hi
I have a website with thousands of products. On the category pages, every product link includes the parameter “?cgid” in its URL. But “?cgid” is also blocked in the robots.txt file for some reason, so I think it's stopping all my products from getting crawled by Google.
Am I right here? Is there any reason why a website would want to block so many URLs? I've only been here a week and the site's getting great traffic, so I don't want to go breaking it!
Thanks
-
Thanks again AL123al!
I would be concerned about my internal linking because of this problem. I've always wanted to keep important pages within 3 clicks of the Homepage. My worry here is that while these products can get clicked by a user within 3 clicks of the Homepage, they're blocked to Googlebot.
So the product URLs are only getting crawled via the sitemap, which would be hugely inefficient? I think I have to decide whether opening up these pages will improve my linking structure for Google to crawl the product pages, and whether that is more important than the cost of increasing the number of pages it has to crawl and wasting crawl budget.
-
Hello,
The canonical product URLs will be getting crawled just fine, as they are not blocked in robots.txt. Without understanding your problem completely, I think the people before you were trying to stop all the duplicate URLs with parameters from being crawled, leaving Google to crawl only the canonicals, which is what you want.
If you remove the parameter from robots.txt, Google will crawl everything, including the parameter URLs. This will waste crawl budget, so it is better that Google only crawls the canonicals.
Regarding the sitemap: being present in the sitemap will help Googlebot decide what to prioritise crawling, but it won't stop Googlebot finding other URLs if there is good internal linking.
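If you want to confirm which URL a parameter page declares as its canonical, you can fetch the page yourself and read its rel="canonical" link. The sketch below is only an illustration using Python's standard library: the `CanonicalFinder` class, the `find_canonical` helper and the sample HTML are made up for this example, and your templates may expose the canonical differently (for example in an HTTP header).

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Only <link> tags matter, and only the first canonical is kept.
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical markup for a parameter URL such as
# www.example.com/product-category/ladies-shoes?cgid...
SAMPLE = """<html><head>
<link rel="canonical" href="https://www.example.com/product-category/ladies-shoes">
</head><body></body></html>"""

print(find_canonical(SAMPLE))
```

Running this over a handful of parameter URLs would show whether they all point back to the base URL, which is the assumption behind blocking them.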
-
Thanks AL123al! The base URLs (www.example.com/product-category/ladies-shoes) do seem to be getting crawled here and there, and some are ranking, which is great. But I think the only place they can get crawled from is the sitemap, which has over 28,000 URLs on one page (another thing I need to fix)!
So if Googlebot gets to the parameter URL through category pages (www.example.com/product-category/ladies-shoes?cgid...) and sees it's blocked, I'm guessing it can't see that the page is important to us (from the website hierarchy) and can't see the canonical tag either, so I'm presuming it's seriously damaging our power to get products ranked.
In Screaming Frog, 112,000 URLs get crawled and 68% are blocked by robots.txt. 17,000 are URLs which contain "?cgid", which I don't think is too many for Googlebot to crawl; the website has pretty good authority, so I think we get a pretty deep crawl.
So I suppose what I really want to know is: will removing "?cgid" from the robots.txt file really damage the site? In my opinion, it'll really help.
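To put those Screaming Frog figures side by side, here is a rough back-of-the-envelope calculation. It assumes that all 17,000 "?cgid" URLs are among the blocked ones (which follows if the rule covers every "?cgid" URL); the variable names are just for illustration.

```python
# Figures quoted from the Screaming Frog crawl above.
total_crawled = 112_000
blocked_share = 0.68
cgid_urls = 17_000

# URLs blocked by robots.txt overall.
blocked_urls = round(total_crawled * blocked_share)

# Assumption: every "?cgid" URL is blocked, so the rest of the blocked
# URLs must be caught by other robots.txt rules.
other_blocked = blocked_urls - cgid_urls

cgid_share_of_crawl = cgid_urls / total_crawled
cgid_share_of_blocked = cgid_urls / blocked_urls

print(f"Blocked URLs: {blocked_urls}")
print(f"Blocked by other rules: {other_blocked}")
print(f"?cgid share of crawl: {cgid_share_of_crawl:.1%}")
print(f"?cgid share of blocked URLs: {cgid_share_of_blocked:.1%}")
```

On these numbers the "?cgid" URLs are only about 15% of the crawl and under a quarter of what is blocked, so most of the blocked URLs come from other rules that would be worth reviewing too.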
-
It looks like the product URLs have the parameter ?cgid appended to them, and there may be other stuff attached to the end of each URL, like this:
e.g. www.example.com/product-category/ladies-shoes?cgid-product=19&controller=product etc
but the canonical URL is www.example.com/product-category/ladies-shoes
These products may have had a canonical to the base URL which means that there won't be any problem with duplicates being indexed. So all well and good.
Except... Google has to crawl each of these parameter URLs to find the canonical. On a huge website this means that crawl budget is being consumed by unnecessary crawling of these parameterised URLs.
You can tell Google not to crawl the parameter URLs in Search Console (at least in the old version you can). But you can also stop Google crawling these URLs unnecessarily by blocking them in robots.txt, if you are sure that the parameters are not changing how the page looks in search.
So, long story short, that is why you may see the URLs with parameters being blocked in robots.txt. The canonical URLs will be getting crawled just fine, since they don't have any parameters and hence are not blocked.
Hope that makes sense?
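As a rough illustration of the point above, here is a small Python sketch of how a wildcard Disallow rule separates the parameter URLs from the canonical ones. The rule `/*?cgid` is an assumption (the actual line in the site's robots.txt isn't shown in this thread), and `rule_to_regex` and `is_blocked` are hypothetical helpers, not part of any library. It only handles `*` and a trailing `$`, and ignores Allow rules and rule precedence, so treat it as a simplification of how Googlebot matches paths.

```python
import re


def rule_to_regex(rule):
    """Translate a robots.txt path rule into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape literal parts and join them with '.*' where '*' appeared.
    pattern = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(pattern + ("$" if anchored else ""))


def is_blocked(path, disallow_rules):
    """True if the path (including query string) matches any Disallow rule."""
    return any(rule_to_regex(rule).match(path) for rule in disallow_rules)


# Assumed rule; the real robots.txt entry may be written differently.
DISALLOW = ["/*?cgid"]

parameter_url = "/product-category/ladies-shoes?cgid-product=19&controller=product"
canonical_url = "/product-category/ladies-shoes"

print(is_blocked(parameter_url, DISALLOW))
print(is_blocked(canonical_url, DISALLOW))
```

The parameter URL matches the rule and the canonical does not, which is why the base URLs can still be crawled while the "?cgid" versions are skipped.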
-
Yes, it's in the robots.txt, that's the problem. Someone had to physically put it in there, but I've no idea why they would.
-
Did you check your robots.txt file? Or check whether a plugin is creating this problem.