Big problem with my new crawl report
-
I am the owner of a small OpenCart online store. I installed http://www.opencart.com/index.php?route=extension/extension/info&extension_id=6182&filter_search=seo. Today my new crawl report is awful: errors are up to 520 (30 before), warnings are up to 1,000 (120 before), and notices are up to 8,000 (1,000 before). I noticed that the problem is with search. There is a lot of duplicate content in search results only. What should I do?
-
Thank you again Alan.
Typo fixed.
-
I use the Bing search API.
By the way, you want to change from GET to POST, not the other way around.
-
Alan,
Thank you for the great advice. If one has enough control over the eCommerce system, or the internal site search product, to change from GET to POST so these pages act more like real dynamically generated "search pages" than an infinite number of "landing pages", I think that is a fantastic solution. It would keep merchandisers and others from linking to those pages - because we all know that they will continue to do it even if the SEO pleads on hands and knees for them to stop.
However, I have found it to be the case that most eCommerce businesses (from small mom-n-pop shops to Fortune 500 companies) do not have the ability to do this because the internal site search functionality they use is out of their hands. Site search vendors like Endeca and Celebros serving enterprise eCommerce businesses don't typically hand over the keys to the client.
If you know any site search vendors or solutions that allow one to do this it would make a great contribution to this thread if you could share a few of them. I'd definitely look into recommending them in the future!
Thanks again!
-
The problem with PageRank leaks is that they scale. If you are losing 10%, then when you get some quality links, 10% of their value will be wasted, and every effort you make in the future will be discounted by that same 10%.
There are ways to fix all these problems. For example, I would make the search form use POST instead of GET, so that links to search pages cannot be made and therefore search pages will not get indexed.
We work so hard to get good links, so why waste them once you have them?
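To illustrate the GET-versus-POST point above (a minimal sketch with hypothetical example.com URLs, not OpenCart's actual routing): a GET form serializes every query into a unique, linkable URL, while a POST form keeps the URL constant, so there is no distinct address for crawlers to discover or index.

```python
from urllib.parse import urlencode

# GET: the form fields become part of the URL, so every distinct search
# is a distinct address that can be linked to, crawled, and indexed.
get_url = "https://example.com/product/search?" + urlencode({"filter_tag": "rings"})
print(get_url)  # https://example.com/product/search?filter_tag=rings

# POST: the same fields travel in the request body instead. The visible
# URL never changes, so there is no unique "search page" to index.
post_url = "https://example.com/product/search"
post_body = urlencode({"filter_tag": "rings"})
print(post_url, "| body:", post_body)
```

Each distinct GET search mints a new crawlable URL; every POST search reuses the same one.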
-
I have tried different methods to fix this. First-hand experience tells me that oftentimes it is better to just block the paths (assuming there is better navigation on the site) from being crawled or indexed using robots.txt than to use a noindex,follow tag in order to save the pagerank you're sending via internal links. It is very easy for Google to get bogged down crawling around in the internal search results area.
Unless there are lots of links to search pages from top pages on the site, or a big list of search page links from every page (sitewide footer, for example) I really don't think the waste of internal pagerank is noticeable in the rankings, or worth salvaging if it risks sending spiders into a maze or a trap.
Yes, best practice is not to link to pages that you are blocking. In the real world, though, search pages can be very useful to visitors, and merchandisers who don't have the ability to create more targeted sub-sub-sub-categories will often use them, and link to them on the site, as landing pages for promotional purposes (emails, PPC, sales...).
Everyone has their own strategies, and all we can do is make recommendations based on our own experience and knowledge. Thanks for helping out with this question Alan. Feel free to elaborate so Anastas has more input to help guide his decision.
-
As long as no one is linking to the search pages, including internal links.
-
Hello Anastas,
I agree that you should block the search folder from being indexed. I'm going to assume that nobody is linking to your search pages and that you have other paths (e.g. SEO-friendly navigation, sitemaps...) for search engines to use to access your products.
I don't understand why you have formatted the disallow statement that way, however. Unless I'm missing something (and I could be, since I don't know what your site is), you only need to do this:
Disallow: /product/search*
And of course after doing this you should test it in GWT to make sure that A: You are blocking the pages you want to block, such as search pages with lots of parameters, and B: You are NOT blocking other pages you don't want to block, such as product pages. Here is more info on where to find the testing tool in GWT if you don't know: http://productforums.google.com/forum/#!topic/webmasters/tbikAxJiIZ4
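If you want to sanity-check a rule offline as well, Python's standard `urllib.robotparser` can simulate the A/B test above. This is a sketch with a hypothetical domain, and note one assumption: the stdlib parser only does simple prefix matching (it does not understand Google's `*` wildcard syntax), so it is fed the plain prefix rule rather than one with a wildcard.

```python
from urllib.robotparser import RobotFileParser

# Simulate a robots.txt with a simple prefix rule. The stdlib parser
# matches by prefix only, so "/product/search" also covers URLs like
# "/product/search&filter_tag=...".
rules = """\
User-agent: *
Disallow: /product/search
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# A: search URLs (with any parameters tacked on) should be blocked...
print(rp.can_fetch("*", "https://example.com/product/search&filter_tag=abc"))  # False

# B: ...while ordinary product and category pages stay crawlable.
print(rp.can_fetch("*", "https://example.com/category1/some-product"))  # True
```

This mirrors the two GWT checks: confirm the search URLs are blocked and the product URLs are not.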
Let us know how it goes. Good luck.
-
Please I need help
-
I am using OpenCart and I don't know what to do. Before, I had 50 errors; now there are more than 500 after installing this plug-in. The plug-in removed the previous errors, but now there are many different errors. I have two options:
1. Remove the plug-in
2. Do something about the new errors. The new errors are only because of search: I have duplicate page content because when you type PRODUCT NAME in the search box, the result has the same content as www.mydomain.com/category1/PRODUCT NAME
Maybe this plug-in removed the canonical URLs in search, or I don't know what.
In robots.txt there is this row:
Disallow: /*?route=product/search
The duplicate content is at mydomain.com/product/search&filter_tag=XXXXXX
Instead of XXXXXX there are many different values.
I decided to add another row in robots.txt:
Disallow: /*?route=product/search&filter_tag=/
Do you think it is correct, or should I remove the plug-in?
I hope you understand what the problem is.
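One thing worth checking here (a rough sketch, using a hand-rolled matcher to approximate Google's `*` wildcard handling, with the URL shapes from this thread): the existing `Disallow: /*?route=product/search` rule only matches URLs that actually contain `?route=`, so the rewritten `/product/search&filter_tag=...` URLs from the crawl report can slip past it entirely.

```python
import re

def robots_rule_matches(rule: str, path: str) -> bool:
    """Approximate Google-style robots.txt matching:
    '*' matches any run of characters, '$' anchors the end of the URL."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape regex metacharacters, then restore '*' as a wildcard.
    regex = ".*".join(re.escape(part) for part in rule.split("*"))
    if anchored:
        regex += "$"
    # Robots rules match from the start of the path.
    return re.match(regex, path) is not None

existing_rule = "/*?route=product/search"

# The old-style URL with ?route= is caught by the existing rule...
print(robots_rule_matches(existing_rule, "/index.php?route=product/search&filter_tag=abc"))  # True

# ...but the rewritten search URLs from the crawl report are not,
# because they contain no "?route=" at all.
print(robots_rule_matches(existing_rule, "/product/search&filter_tag=abc"))  # False

# A plain prefix rule covers the rewritten form:
print(robots_rule_matches("/product/search", "/product/search&filter_tag=abc"))  # True
```

So before removing the plug-in, it may be worth confirming in a robots.txt tester which of the duplicate URLs the current rules actually block.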
-
When you noindex a page, any links pointing to those pages pour link juice away from your indexed pages. You should never noindex pages, IMO.
I assume you are using a CMS or some sort of plug-in; this is a common cost when you do so. CMSs create very untidy code, which is not good for SEO.
-
The urls are: /product/search&filter_tag=%D0%B1%D0%B8%D0%B6%D1%83%D1%82%D0%B0
After the = there are a lot of combinations. Is it correct to put this in robots.txt?
Disallow: /*?route=product/search&filter_tag=/
-
Should I disallow search (in robots.txt)?