Magento site with many pages
-
I just finished a scan of a Magento site.
Of course, I am getting thousands of dynamic pages:
search pages and others.
Checking with the site: command on Google, I see 154,000 results.
Which pages is it recommended to block?
Some people talk about blocking the search pages, and some actually talk about allowing them.
Any answer on this?
Thanks
-
There was no effect on the rankings, no significant ups or downs. But now when I do the site:www.domain.com command in Google, I see just the pages that I want Google to index.
It can only help in the long run I guess.
-
Hi
Sorry for responding late.
What was the effect on your results when you blocked all those pages?
Thanks
-
Yes, my thought is blocking through robots.txt.
-
I've had similar problems with a few Magento sites. This is a standard list I use in my robots.txt files (below). I hope it helps.
You don't have to include all the Magento folders like 'app', 'lib', 'var' and 'admin', etc.; they are just there to be thorough.
I think you'll get the idea. I've brought the number of indexed pages down from half a million to just a few thousand using these.
Disallow: /*?
Disallow: /*.js$
Disallow: /*.css$
Disallow: /404/
Disallow: /admin/
Disallow: /api/
Disallow: /app/
Disallow: /catalog/category/view/
Disallow: /catalog/product/view/
Disallow: /catalog/product_compare/
Disallow: /catalogsearch/
Disallow: /catalogsearch/advanced/
Disallow: /catalogsearch/term/
Disallow: /catalogsearch/term/popular/
Disallow: /cgi-bin/
Disallow: /checkout/
Disallow: /checkout/cart/
Disallow: /contacts/
Disallow: /contacts/index/
Disallow: /contacts/index/post/
Disallow: /customer/
Disallow: /customer/account/
Disallow: /customer/account/login/
Disallow: /downloader/
Disallow: /install/
Disallow: /js/
Disallow: /lib/
Disallow: /magento/
Disallow: /newsletter/
Disallow: /pkginfo/
Disallow: /private/
Disallow: /poll/
Disallow: /report/
Disallow: /review/
Disallow: /sendfriend/
Disallow: /skin/
Disallow: /tag/
Disallow: /var/
Disallow: /wishlist/
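To get a feel for which URLs a list like this would catch before deploying it, here is a rough Python sketch. It assumes Google-style pattern matching, where `*` matches any run of characters and a trailing `$` anchors the end of the URL. The helper names (`rule_to_regex`, `is_blocked`) and the sample paths are made up for illustration, and it only looks at Disallow rules; it ignores Allow rules, rule precedence, and user-agent groups, so treat it as a sanity check rather than a replacement for a real robots.txt tester.

```python
import re


def rule_to_regex(pattern):
    """Turn one Disallow pattern into a compiled regex (hypothetical helper).

    Assumes Google-style wildcards: '*' matches any characters and a
    trailing '$' anchors the match at the end of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' where '*' was.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_blocked(path, disallow_rules):
    """Return True if any Disallow pattern matches the path (hypothetical helper)."""
    return any(rule_to_regex(rule).match(path) for rule in disallow_rules)


# A few of the rules from the list above.
RULES = ["/*?", "/*.js$", "/*.css$", "/catalogsearch/", "/checkout/", "/customer/"]

# Sample paths are invented for illustration only.
for path in [
    "/shoes.html",
    "/shoes.html?color=red",
    "/catalogsearch/result/",
    "/skin/site.css",
]:
    print(path, "blocked" if is_blocked(path, RULES) else "allowed")
```

Running your own list of crawled URLs through something like this shows quickly whether a broad rule such as `/*?` is blocking more than you intended.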
-
Hi there! When you say block, do you mean through your robots.txt?