Magento Hidden Products & Google Not Found Errors
-
We recently moved our website over to the Magento eCommerce platform. Magento has functionality to make certain items not visible individually, so you can, for example, take six products and combine them into one configurable product where a customer can choose their options. You then hide all the individual products, leaving only that one product visible on the site, which reduces duplicate-content issues.
We did this. It works great and the individual products don't show up in our site map, which is what we'd like. However, Google Webmaster Tools has all of these individual product URLs in its Not Found Crawl Errors.
For example:
White t-shirt URL: /white-t-shirt
Red t-shirt URL: /red-t-shirt
Blue t-shirt URL: /blue-t-shirt
All of those are not visible on the site and the URLs do not appear in our site map. But they are all showing up in Google Webmaster Tools.
Configurable t-shirt URL: /t-shirt
This product is the only one visible on the site, does appear in the site map, and shows up in Google Webmaster Tools as a valid URL.
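As a sanity check, here's a small sketch of how one could confirm the hidden URL keys really are absent from the sitemap. The sitemap fragment and URLs below are hypothetical stand-ins, not our real data:

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def sitemap_urls(xml_text):
    """Return the set of <loc> values listed in a sitemap document."""
    root = ET.fromstring(xml_text)
    return {loc.text.strip() for loc in root.findall(".//sm:loc", NS)}

# Hypothetical sitemap: only the configurable product is listed.
SITEMAP = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/t-shirt</loc></url>
</urlset>"""

urls = sitemap_urls(SITEMAP)
# Verify that none of the hidden simple-product URL keys appear.
for hidden in ("/white-t-shirt", "/red-t-shirt", "/blue-t-shirt"):
    assert not any(u.endswith(hidden) for u in urls)
print("hidden simple products absent from sitemap")
```

Running this against the real sitemap XML would confirm Google isn't discovering the simple products there.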
Do you know how Google found the individual products if they aren't in the site map and aren't visible on the website? And how important do you think it is that we fix all of these hundreds of Not Found errors to point to the single visible product on the site? I would think it is fairly important, but I don't want to spend a week of manpower on it if the returns would be minimal.
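For reference, the kind of bulk fix we're weighing would be a 301 redirect map from each hidden simple product to its configurable parent. A minimal sketch that emits Apache-style rules; the URL keys are the hypothetical t-shirt examples above, not our real catalog:

```python
# Map each hidden simple-product URL key to its visible configurable parent.
# These keys are illustrative; a real run would export them from Magento.
SIMPLE_TO_CONFIGURABLE = {
    "/white-t-shirt": "/t-shirt",
    "/red-t-shirt": "/t-shirt",
    "/blue-t-shirt": "/t-shirt",
}

def redirect_rules(mapping):
    """Emit one Apache 'Redirect 301' line per hidden simple product."""
    return [f"Redirect 301 {src} {dst}" for src, dst in sorted(mapping.items())]

for rule in redirect_rules(SIMPLE_TO_CONFIGURABLE):
    print(rule)
```

Generating the rules from a product export, rather than hand-editing hundreds of redirects, is what would keep this under a week of work.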
Thanks so much for any input!
-
I also have this same issue... looking for a solution.
-
Hi there,
I know this is an older question, but I wanted to follow up to see if you found a solution. We're currently experiencing the same issue, with simple / not-visible-individually product URLs appearing in WMT's crawl error report.
Thanks in advance for any help!
-
The problem I am currently facing is a site where the individual 'Simple Products' were not set to be invisible. They have all been indexed!
So I have the same question really.
-
Yes, the page is modified based on the selected options, and it is done via JavaScript.
-
OK. On a product page, the individual page is modified by a customer selecting options? If so, is that accomplished with JavaScript substitution?
-
So those individual pages never actually appear on the site. They are just created to allow Magento to pull inventory on those items from that configurable product, which is why I'm not sure how Google is finding them.
For example, if I go to mysite.com/white-t-shirt, I would get a 404 (and if I searched for it, nothing would come up) because as far as the world outside of Magento admin is concerned, that URL doesn't exist.
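If it helps, here's a rough sketch of how we could audit which of the hundreds of GWT Not Found URLs actually return a 404 and still need a redirect. The `statuses` dict is hypothetical; a real run would call `status_of` against each URL from the crawl-error export:

```python
import urllib.request
import urllib.error

def status_of(url, timeout=10):
    """Return the HTTP status code for url, including error statuses."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code

def needs_redirect(statuses):
    """Given {url: status}, return the URLs that 404 and still need a 301."""
    return sorted(url for url, code in statuses.items() if code == 404)

# Hypothetical statuses as they might come back from a crawl of the report:
statuses = {"/white-t-shirt": 404, "/red-t-shirt": 404, "/t-shirt": 200}
print(needs_redirect(statuses))
```

That would at least separate the URLs that genuinely 404 from any that have since been fixed or removed from the report.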
-
How do those pages appear? In response to an AJAX call? A site search? I've been hearing that Google has increased its ability to crawl pages that appear through dynamic calls.