De-indexing product "quick view" pages
-
Hi there,
The e-commerce website I am working on seems to index all of the "quick view" pages (which normally occur as iframes on the category page) as their own unique pages, creating thousands of duplicate pages / overly-dynamic URLs. Each indexed "quick view" page has the following URL structure:
www.mydomain.com/catalog/includes/inc_productquickview.jsp?prodId=89514&catgId=cat140142&KeepThis=true&TB_iframe=true&height=475&width=700
where the only thing that changes is the product ID and category number.
Would using "disallow" in Robots.txt be the best way to de-indexing all of these URLs? If so, could someone help me identify how to best structure this disallow statement? Would it be:
Disallow: /catalog/includes/inc_productquickview.jsp?prodID=*
Thanks for your help.
-
Just to add, if you block URLs in robots.txt they wont actually get deindexed. They will be for all intents and purposes be blocked (wont cause duplicate content issues etc) but they will drop into the omitted results:
_In order to show you the most relevant results, we have omitted some entries very similar to the 13 already displayed._If you like, you can repeat the search with the omitted results included. And will look like this in the SERPS (see attachment).If you want them removed from the SERPs you will need to use the robots NOINDEX meta tag, or use GWMT as William advised.
The disallow entry you posted will block these pages, as long as they all start with that way. Although you don't actually need the trailing wild card as that gets ignored, you can just leave it open. Google robots.txt specs
-
Thanks William. I think I will stick with the Robots file in this case. I am nervous about using that parameter feature in case ?prodID is used in any other URL that should be indexed.
-
You can use that in your robots.txt, which should work on crawls.
Or
you can also go into WMT and setup your parameters, in this case would be ?prodID.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Index, follow on a paginated page with a different rel=canonical URL
Hello, I have a question about meta robots ="index, follow" and rel=canonical on category page pagination. Should the sorted page be <meta name="robots" content="index,follow"></meta name="robots" content="index,follow"> since the rel="canonical" is pointing to a separate page that is different from the URL? Any thoughts on this topic would be awesome. Thanks. Main Category Page
Intermediate & Advanced SEO | | Choice
https://www.site.com/category/
<meta name="robots" content="index,follow"><link rel="canonical" href="https: www.site.com="" category="" "=""></link rel="canonical" href="https:></meta name="robots" content="index,follow"> Sorted Page
https://www.site.com/category/?p=2&dir=asc&order=name
<meta name="robots" content="index, follow"=""><link rel="canonical" href="https: www.site.com="" category="" ?p="2""></link rel="canonical" href="https:></meta name="robots" content="index,> As you can see, the meta robots is telling Google to index https://www.site.com/category/?p=2&dir=asc&order=name , yet saying the canonical page is https://www.site.com/category/?p=2 .0 -
I still see the old page in index
Hello, I have done a redirect and still see in google index my old page after 3 weeks. My new page is there also Is it normal that the old page isn't dropped for the index yet ? Thank you,
Intermediate & Advanced SEO | | seoanalytics0 -
Fetch as Google -- Does not result in pages getting indexed
I run a exotic pet website which currently has several types of species of reptiles. It has done well in SERP for the first couple of types of reptiles, but I am continuing to add new species and for each of these comes the task of getting ranked and I need to figure out the best process. We just released our 4th species, "reticulated pythons", about 2 weeks ago, and I made these pages public and in Webmaster tools did a "Fetch as Google" and index page and child pages for this page: http://www.morphmarket.com/c/reptiles/pythons/reticulated-pythons/index While Google immediately indexed the index page, it did not really index the couple of dozen pages linked from this page despite me checking the option to crawl child pages. I know this by two ways: first, in Google Webmaster Tools, if I look at Search Analytics and Pages filtered by "retic", there are only 2 listed. This at least tells me it's not showing these pages to users. More directly though, if I look at Google search for "site:morphmarket.com/c/reptiles/pythons/reticulated-pythons" there are only 7 pages indexed. More details -- I've tested at least one of these URLs with the robot checker and they are not blocked. The canonical values look right. I have not monkeyed really with Crawl URL Parameters. I do NOT have these pages listed in my sitemap, but in my experience Google didn't care a lot about that -- I previously had about 100 pages there and google didn't index some of them for more than 1 year. Google has indexed "105k" pages from my site so it is very happy to do so, apparently just not the ones I want (this large value is due to permutations of search parameters, something I think I've since improved with canonical, robots, etc). I may have some nofollow links to the same URLs but NOT on this page, so assuming nofollow has only local effects, this shouldn't matter. Any advice on what could be going wrong here. I really want Google to index the top couple of links on this page (home, index, stores, calculator) as well as the couple dozen gene/tag links below.
Intermediate & Advanced SEO | | jplehmann0 -
Sort term product pages and fast indexing - XML sitemaps be updated daily, weekly, etc?
Hi everyone, I am currently working on a website that the XML sitemap is set to update weekly. Our client has requested that this be changed to daily. The real issue is that the website creates short term product pages (10-20 days) and then the product page URL's go 404. So the real problem is quick indexing not daily vs weekly sitemap. I suspect that daily vs weekly sitemaps may help solve the indexing time but does not completely solve the problem. So my question for you is how can I improve indexing time on this project? The real problem is how to get the product pages indexed and ranking before the 404 page shows u?. . Here are some of my initial thoughts and background on the project. Product pages are only available for 10 to 20 days (Auction site).Once the auction on the product ends the URL goes 404. If the pages only exist for 10 to 20 days (404 shows up when the auction is over), this sucks for SEO for several reasons (BTW I was called onto the project as the SEO specialist after the project and site were completed). Reason 1 - It is highly unlikely that the product pages will rank (positions 1 -5) since the site has a very low Domain Authority) and by the time Google indexes the link the auction is over therefore the user sees a 404. Possible solution 1 - all products have authorship from a "trustworthy" author therefore the indexing time improves. Possible solution 2 - Incorporate G+ posts for each product to improve indexing time. There is still a ranking issue here since the site has a low DA. The product might appear but at the bottom of page 2 or 1..etc. Any other ideas? From what I understand, even though sitemaps are fed to Google on a weekly or daily basis this does not mean that Google indexes them right away (please confirm). Best case scenario - Google indexes the links every day (totally unrealistic in my opinion), URL shows up on page 1 or 2 of Google and slowly start to move up. By the time the product ranks in the first 5 positions the auction is over and therefore the user sees a 404. I do think that a sitemap updated daily is better for this project than weekly but I would like to hear the communities opinion. Thanks
Intermediate & Advanced SEO | | Carla_Dawson0 -
How to find all indexed pages in Google?
Hi, We have an ecommerce site with around 4000 real pages. But our index count is at 47,000 pages in Google Webmaster Tools. How can I get a list of all pages indexed of our domain? trying to locate the duplicate content. Doing a "site:www.mydomain.com" only returns up to 676 results... Any ideas? Thanks, Ben
Intermediate & Advanced SEO | | bjs20100 -
Mystery: Ranking in Amazon for a product page?
My client has a product on Amazon that has more reviews and better rankings. However, their competitor with less reviews and lower ratings are ranking #1 for our primary keyword in Google. Our product page doesn't even rank on Google, but I'm assuming Google doesn't want to display two results from Amazon. The only difference is they have 1 link pointed to the product page that has a small PA of 10 and DA of 15. Do you think this link could be the only thing making a difference? Should we start building more links to this product page in addition to their website? Any other tips to help our Amazon page rank?
Intermediate & Advanced SEO | | Stryde0 -
Amount of pages indexed for classified (number of pages for the same query)
I've notice that classified usually has a lots of pages indexed and that's because for each query/kw they index the first 100 results pages, normally they have 10 results per page. As an example imagine the site www.classified.com, for the query/kw "house for rent new york" there is the page www.classified.com/houses/house-for-rent-new-york and the "index" is set for the first 100 SERP pages, so www.classified.com/houses/house-for-rent-new-york www.classified.com/houses/house-for-rent-new-york-1 www.classified.com/houses/house-for-rent-new-york-2 ...and so on. Wouldn't it better to index only the 1st result page? I mean in the first 100 pages lots of ads are very similar so why should Google be happy by indexing lots of similar pages? Could Google penalyze this behaviour? What's your suggestions? Many tahnks in advance for your help.
Intermediate & Advanced SEO | | nuroa-2467120 -
Google is indexing wordpress attachment pages
Hey, I have a bit of a problem/issue what is freaking me out a bit. I hope you can help me. If i do site:www.somesitename.com search in Google i see that Google is indexing my attachment pages. I want to redirect attachment URL's to parent post and stop google from indexing them. I have used different redirect plugins in hope that i can fix it myself but plugins don't work. I get a error:"too many redirects occurred trying to open www.somesitename.com/?attachment_id=1982 ". Do i need to change something in my attachment.php fail? Any idea what is causing this problem? get_header(); ?> /* Run the loop to output the attachment. * If you want to overload this in a child theme then include a file * called loop-attachment.php and that will be used instead. */ get_template_part( 'loop', 'attachment' ); ?>
Intermediate & Advanced SEO | | TauriU0