Big problem with my new crawl report
-
I am the owner of a small OpenCart online store. I installed http://www.opencart.com/index.php?route=extension/extension/info&extension_id=6182&filter_search=seo. Today my new crawl report is awful. Errors are up by 520 (30 before), warnings are up by 1000 (120 before), and notices are up by 8000 (1000 before). I noticed that the problem is with search: there is a lot of duplicate content, and only in search. What should I do?
-
Thank you again Alan.
Typo fixed.
-
I use the Bing search API.
By the way, you want to change from GET to POST, not the other way around.
-
Alan,
Thank you for the great advice. If one has enough control over the eCommerce system, or the internal site search product, to change from GET to POST so these pages act more like real dynamically generated "search pages" than an infinite number of "landing pages", I think that is a fantastic solution. It would keep merchandisers and others from linking to those pages, because we all know that they will continue to do it even if the SEO pleads on hands and knees for them to stop.
However, I have found that most eCommerce businesses (from small mom-n-pop shops to Fortune 500 companies) do not have the ability to do this because the internal site search functionality they use is out of their hands. Site search vendors like Endeca and Celebros serving enterprise eCommerce businesses don't typically hand over the keys to the client.
If you know any site search vendors or solutions that allow one to do this it would make a great contribution to this thread if you could share a few of them. I'd definitely look into recommending them in the future!
Thanks again!
-
The problem with PR leaks is that they scale. If you are losing 10% and then you get some quality links, 10% of them will be wasted; every effort you make in the future will be discounted by 10%.
There are ways to fix all these problems. For example, I would make search use POST rather than GET, so that links to search pages cannot be made and therefore search pages will not get indexed.
We work so hard to get good links, why waste them when you do?
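To illustrate the GET versus POST point, here is a minimal sketch using only Python's standard library. The endpoint `http://www.example.com/search` and the parameter name `q` are hypothetical, and nothing is actually sent over the network. It only shows that with GET the search term ends up in the URL (so anyone can copy and link to it), while with POST it travels in the request body and the URL stays the same for every search.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical search endpoint, used only for illustration.
SEARCH_URL = "http://www.example.com/search"


def build_get_search(term: str) -> Request:
    """Build (but do not send) a GET search: the term is part of the URL."""
    return Request(f"{SEARCH_URL}?{urlencode({'q': term})}")


def build_post_search(term: str) -> Request:
    """Build (but do not send) a POST search: the term is in the body."""
    body = urlencode({"q": term}).encode("ascii")
    return Request(SEARCH_URL, data=body)


get_req = build_get_search("red shoes")
post_req = build_post_search("red shoes")

# The GET URL carries the query, so every distinct search is a distinct,
# linkable (and crawlable) address.
print(get_req.get_method(), get_req.full_url)

# The POST URL is identical for every search; the query is only in the body.
print(post_req.get_method(), post_req.full_url, post_req.data)
```

Whether a given cart or site search product lets you switch its search form to POST is a separate question, and this sketch does not answer it.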
-
I have tried different methods to fix this. First-hand experience tells me that oftentimes it is better to just block the paths (assuming there is better navigation on the site) from being crawled or indexed using robots.txt than to use a noindex,follow tag in order to save the pagerank you're sending via internal links. It is very easy for Google to get bogged down crawling around in the internal search results area.
Unless there are lots of links to search pages from top pages on the site, or a big list of search page links from every page (sitewide footer, for example) I really don't think the waste of internal pagerank is noticeable in the rankings, or worth salvaging if it risks sending spiders into a maze or a trap.
Yes, best practice is not to link to pages that you are blocking. In the real world, though, search pages can be very useful to visitors, and merchandisers who don't have the ability to create more targeted sub-sub-sub categories will often use them, and link to them on the site, as landing pages for promotional purposes (emails, PPC, sales...).
Everyone has their own strategies, and all we can do is make recommendations based on our own experience and knowledge. Thanks for helping out with this question Alan. Feel free to elaborate so Anastas has more input to help guide his decision.
-
as long as no one is linking to the search pages including internal links.
-
Hello Anastas,
I agree that you should block the search folder from being indexed. I'm going to assume that nobody is linking to your search pages and that you have other paths (e.g. SEO-friendly navigation, sitemaps...) for search engines to use to access your products.
I don't understand why you have formatted the disallow statement that way, however. Unless I'm missing something (and could be since I don't know what your site is) you only need to do this:
Disallow: /product/search*
And of course after doing this you should test it in GWT to make sure that A: You are blocking the pages you want to block, such as search pages with lots of parameters, and B: You are NOT blocking other pages you don't want to block, such as product pages. Here is more info on where to find the testing tool in GWT if you don't know: http://productforums.google.com/forum/#!topic/webmasters/tbikAxJiIZ4
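Before going to the GWT tester, a quick local check can help. Below is a rough sketch in Python of Google-style robots.txt pattern matching (`*` matches any run of characters, a trailing `$` anchors the end, everything else is a prefix match). The function name `robots_pattern_matches` and the sample product path are made up for illustration, and the sketch ignores details such as percent-encoding normalization and Allow/Disallow precedence, so treat it as an approximation and not as how any crawler is actually implemented.

```python
import re


def robots_pattern_matches(pattern: str, path: str) -> bool:
    """Approximate Google-style matching of a robots.txt path pattern.

    '*' matches any sequence of characters, a trailing '$' anchors the
    end of the URL, and otherwise the pattern only has to match a prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start of the path, which gives prefix matching.
    return re.match(regex, path) is not None


rules = [
    "/product/search*",
    "/*?route=product/search&filter_tag=/",
]
paths = [
    "/product/search&filter_tag=%D0%B1%D0%B8%D0%B6%D1%83%D1%82%D0%B0",
    "/category1/product-name",  # hypothetical product page
]

for rule in rules:
    for path in paths:
        print(rule, path, robots_pattern_matches(rule, path))
```

Under these assumptions, the first rule covers the search URL and leaves the product page alone. The second rule does not cover the search URL, because that URL contains no `?route=` and has no `/` right after `filter_tag=`.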
Let us know how it goes. Good luck.
-
Please I need help
-
I am using OpenCart. I don't know what to do. Before, I had 50 errors; now there are more than 500 after installing this plugin. The plugin removed the previous errors, but now there are many different errors. I have 2 options:
1. Remove the plugin
2. Do something about the new errors. The new errors are only because of search: I have duplicate page content because when you type PRODUCT NAME in the search box, the result has the same content as www.mydomain.com/category1/PRODUCT NAME
Maybe this plugin removed the canonical URLs in search, or I don't know what.
In robots.txt there is this row:
Disallow: /*?route=product/search
The duplicate content is at mydomain.com/product/search&filter_tag=XXXXXX
In place of XXXXXX there are many different paths.
I decided to add another row in robots.txt:
Disallow: /*?route=product/search&filter_tag=/
Do you think it is correct, or should I remove the plugin?
I hope you understand what the problem is.
-
When you noindex a page, any links pointing to that page pour link juice away from your indexed pages. You should never noindex pages, IMO.
I assume you are using a CMS or some sort of plugin; this is a common cost when you do so. CMSs create very untidy code, which is not good for SEO.
-
The urls are: /product/search&filter_tag=%D0%B1%D0%B8%D0%B6%D1%83%D1%82%D0%B0
After the = there are a lot of combinations. Is it correct to put this in robots.txt?
Disallow: /*?route=product/search&filter_tag=/
-
Should I disallow search (in robots.txt)?