Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
De-indexing product "quick view" pages
-
Hi there,
The e-commerce website I am working on seems to index all of the "quick view" pages (which normally occur as iframes on the category page) as their own unique pages, creating thousands of duplicate pages / overly-dynamic URLs. Each indexed "quick view" page has the following URL structure:
www.mydomain.com/catalog/includes/inc_productquickview.jsp?prodId=89514&catgId=cat140142&KeepThis=true&TB_iframe=true&height=475&width=700
where the only thing that changes is the product ID and category number.
Would using "disallow" in Robots.txt be the best way to de-indexing all of these URLs? If so, could someone help me identify how to best structure this disallow statement? Would it be:
Disallow: /catalog/includes/inc_productquickview.jsp?prodID=*
Thanks for your help.
-
Just to add, if you block URLs in robots.txt they wont actually get deindexed. They will be for all intents and purposes be blocked (wont cause duplicate content issues etc) but they will drop into the omitted results:
_In order to show you the most relevant results, we have omitted some entries very similar to the 13 already displayed._If you like, you can repeat the search with the omitted results included. And will look like this in the SERPS (see attachment).If you want them removed from the SERPs you will need to use the robots NOINDEX meta tag, or use GWMT as William advised.
The disallow entry you posted will block these pages, as long as they all start with that way. Although you don't actually need the trailing wild card as that gets ignored, you can just leave it open. Google robots.txt specs
-
Thanks William. I think I will stick with the Robots file in this case. I am nervous about using that parameter feature in case ?prodID is used in any other URL that should be indexed.
-
You can use that in your robots.txt, which should work on crawls.
Or
you can also go into WMT and setup your parameters, in this case would be ?prodID.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Topical keywords for product pages and blogs
Hi all, I have a question regarding keywords. Of course we all know that keyword research should be focused on a certain topic and on user intent (and thus on answering specific questions) instead of trying to put keywords in a page to make it rank. However, duplicate content is of course still an issue. So here's my question: A client that sells floor heating systems that you can install yourself, has a product page for this topic and blog pages for questions regarding this topic. So following pages are on the website: Product page about the floor heating systems the client sells Blog article with tips how to install a floor heating system yourself Blog article about how to choose the right floor heating system These pages all answer different questions and are written about different topics. However, inevatibly all these pages also talk about different aspects of floor heating systems so this broad term comes up on all pages naturally. You could say that a solution is to merge pages and redirect the blogs to the product page, so the product page would answer all questions. But that is not what a customer is looking for. The goal of a product page is to trigger a conversion: let a customer contact the company or ask for a price offer. If the content on a product page is not comprehensive enough, the goal gets lost. Moreover, it doesn't make sense to talk about tips and tricks on a product page. So how do you tackle this problem without creating duplicate content? In search results, the blog pages rank for the specific questions, but the product page doesn't rank for the generic term 'floor heating'. The internal link structure is ok: the product page has obviously more incoming links than the blogs. All on page SEO factors are taken care of as well. Any ideas on this? Thanks!
Intermediate & Advanced SEO | | Mat_C0 -
E-Commerce Site Collection Pages Not Being Indexed
Hello Everyone, So this is not really my strong suit but I’m going to do my best to explain the full scope of the issue and really hope someone has any insight. We have an e-commerce client (can't really share the domain) that uses Shopify; they have a large number of products categorized by Collections. The issue is when we do a site:search of our Collection Pages (site:Domain.com/Collections/) they don’t seem to be indexed. Also, not sure if it’s relevant but we also recently did an over-hall of our design. Because we haven’t been able to identify the issue here’s everything we know/have done so far: Moz Crawl Check and the Collection Pages came up. Checked Organic Landing Page Analytics (source/medium: Google) and the pages are getting traffic. Submitted the pages to Google Search Console. The URLs are listed on the sitemap.xml but when we tried to submit the Collections sitemap.xml to Google Search Console 99 were submitted but nothing came back as being indexed (like our other pages and products). We tested the URL in GSC’s robots.txt tester and it came up as being “allowed” but just in case below is the language used in our robots:
Intermediate & Advanced SEO | | Ben-R
User-agent: *
Disallow: /admin
Disallow: /cart
Disallow: /orders
Disallow: /checkout
Disallow: /9545580/checkouts
Disallow: /carts
Disallow: /account
Disallow: /collections/+
Disallow: /collections/%2B
Disallow: /collections/%2b
Disallow: /blogs/+
Disallow: /blogs/%2B
Disallow: /blogs/%2b
Disallow: /design_theme_id
Disallow: /preview_theme_id
Disallow: /preview_script_id
Disallow: /apple-app-site-association
Sitemap: https://domain.com/sitemap.xml A Google Cache:Search currently shows a collections/all page we have up that lists all of our products. Please let us know if there’s any other details we could provide that might help. Any insight or suggestions would be very much appreciated. Looking forward to hearing all of your thoughts! Thank you in advance. Best,0 -
Should I use noindex or robots to remove pages from the Google index?
I have a Magento site and just realized we have about 800 review pages indexed. The /review directory is disallowed in robots.txt but the pages are still indexed. From my understanding robots means it will not crawl the pages BUT if the pages are still indexed if they are linked from somewhere else. I can add the noindex tag to the review pages but they wont be crawled. https://www.seroundtable.com/google-do-not-use-noindex-in-robots-txt-20873.html Should I remove the robots.txt and add the noindex? Or just add the noindex to what I already have?
Intermediate & Advanced SEO | | Tylerj0 -
Home page suddenly dropped from index!!
A client's home page, which has always done very well, has just dropped out of Google's index overnight!
Intermediate & Advanced SEO | | Caro-O
Webmaster tools does not show any problem. The page doesn't even show up if we Google the company name. The Robot.txt contains: Default Flywheel robots file User-agent: * Disallow: /calendar/action:posterboard/
Disallow: /events/action~posterboard/ The only unusual thing I'm aware of is some A/B testing of the page done with 'Optimizely' - it redirects visitors to a test page, but it's not a 'real' redirect in that redirect checker tools still see the page as a 200. Also, other pages that are being tested this way are not having the same problem. Other recent activity over the last few weeks/months includes linking to the page from some of our blog posts using the page topic as anchor text. Any thoughts would be appreciated.
Caro0 -
My blog is indexing only the archive and category pages
Hi there MOZ community. I am new to the QandA and have a question. I have a blog Its been live for months - but I can not get the posts to rank in the serps. Oddly only the categories rank. The posts are crawled it seems - but seen as less important for a reason I don't understand. Can anyone here help with this? See here for what i mean. I have had several wp sites rank well in the serps - and the posts do much better. Than the categories or archives - super odd. Thanks to all for help!
Intermediate & Advanced SEO | | walletapp0 -
"noindex, follow" or "robots.txt" for thin content pages
Does anyone have any testing evidence what is better to use for pages with thin content, yet important pages to keep on a website? I am referring to content shared across multiple websites (such as e-commerce, real estate etc). Imagine a website with 300 high quality pages indexed and 5,000 thin product type pages, which are pages that would not generate relevant search traffic. Question goes: Does the interlinking value achieved by "noindex, follow" outweigh the negative of Google having to crawl all those "noindex" pages? With robots.txt one has Google's crawling focus on just the important pages that are indexed and that may give ranking a boost. Any experiments with insight to this would be great. I do get the story about "make the pages unique", "get customer reviews and comments" etc....but the above question is the important question here.
Intermediate & Advanced SEO | | khi50 -
Best practice for retiring old product pages
We’re a software company. Would someone be able to help me with a basic process for retiring old product pages and re-directing the SEO value to new pages. We are retiring some old products to focus on new products. The new software has much similar functionality to the old software, but has more features. How can we ensure that the new pages get the best start in life? Also, what is the best way of doing this for users? Our plan currently is to: Leave the old pages up initially with a message to the user that the old software has been retired. There will also be a message explaining that the user might be interested in one of our new products and a link to the new pages. When traffic to these pages reduces, then we will delete these pages and re-direct them to the homepage. Has anyone got any recommendations for how we could approach this differently? One idea that I’m considering is to immediately re-direct the old product pages to the new pages. I was wondering if we could then provide a message to the user explaining that the old product has been retired but that the new improved product is available. I’d also be interested in pointing the re-directs to the new product pages that are most relevant rather than the homepage, so that they get the value of the old links. I’ve found in the past that old retirement pages for products can outrank the new pages as until you 301 them then all the links and authority flow to these pages. Any help would be very much appreciated 🙂
Intermediate & Advanced SEO | | RG_SEO0 -
Is it a bad idea to have a "press" page and link to press mentions of our company?
We've recently been getting quite a bit of press. Would it be wise to create a "press" page and link to mentions of us or would this devalue the links on the press pages as Google may think they reciprocal?
Intermediate & Advanced SEO | | JenniferDacosta0