Need help with Robots.txt
-
I have an eCommerce site built with Modx CMS, and I found lots of auto-generated duplicate pages on it. Now I need to disallow some pages from one category. Here is what the actual product page URL looks like:
product_listing.php?cat=6857
And here is the auto-generated URL structure:
product_listing.php?cat=6857&cPath=dropship&size=19
Can anyone suggest how to disallow this specific category through robots.txt? I am not very familiar with Modx and this kind of link structure.
Your help will be appreciated.
Thanks
-
I would actually add a canonical tag and then handle these using the Parameters section of Search Console. That's why it's there, for exactly this type of site with exactly this issue.
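To make the canonical idea concrete, here is a minimal Python sketch of how a canonical tag could be derived for the URLs in the question. It assumes that only the `cat` parameter identifies the page and that `cPath` and `size` are just variants; the `example.com` host and the function names are made up for illustration, and on the real site this logic would live in the page template.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: only `cat` defines the page; every other parameter is a variant.
KEEP_PARAMS = {"cat"}


def canonical_url(url: str) -> str:
    """Drop every query parameter that is not listed in KEEP_PARAMS."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key in KEEP_PARAMS
    ]
    # Rebuild the URL without the variant parameters and without a fragment.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_tag(url: str) -> str:
    """Build the <link> element to place in the <head> of the variant page."""
    return f'<link rel="canonical" href="{escape(canonical_url(url), quote=True)}">'


# Hypothetical host; the path and parameters come from the question.
variant = "https://www.example.com/product_listing.php?cat=6857&cPath=dropship&size=19"
print(canonical_tag(variant))
```

Every auto-generated variant then points back to the plain `cat=6857` URL, so the duplicates can still be crawled but are consolidated onto one page.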
-
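Nahid, before you use the robots.txt file's disallow for those URLs, you may want to reconsider and use the canonical tag instead. Where you have different sizes, colors, etc., we typically recommend the canonical tag rather than a disallow in robots.txt.
Anyhow, if you'd like to use the disallow, you can use one of these. Note the * wildcard: robots.txt rules match from the start of the URL path, so a rule like Disallow: /? would only match URLs that begin with /? and would not match product_listing.php?cat=...
Disallow: /*?
or
Disallow: /*?cat=
Keep in mind that both of these also block the plain product_listing.php?cat=6857 page, not just the auto-generated variants.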
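A quick way to check which patterns actually match the URLs from the question is a small Python sketch like the one below. It assumes the wildcard convention that the major search engines follow (`*` matches any run of characters, a trailing `$` anchors the end, anything else is a prefix match). The `rule_matches` helper and the extra `/*&cPath=` rule are illustrations of mine, not something from the thread, and the site is assumed to live at the domain root.

```python
import re


def rule_matches(rule: str, url_path: str) -> bool:
    """Return True if a robots.txt path rule matches url_path.

    '*' matches any run of characters, a trailing '$' anchors the end,
    and everything else is a prefix match from the start of the path.
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    pattern = ".*".join(re.escape(part) for part in rule.split("*"))
    if anchored:
        pattern += "$"
    return re.match(pattern, url_path) is not None


# URLs from the question, assuming the site sits at the domain root.
urls = [
    "/product_listing.php?cat=6857",
    "/product_listing.php?cat=6857&cPath=dropship&size=19",
]

# "/*&cPath=" is a hypothetical rule that targets only the generated variants.
rules = ["/?", "/*?", "/*?cat=", "/*&cPath="]

for rule in rules:
    for url in urls:
        verdict = "blocked" if rule_matches(rule, url) else "allowed"
        print(f"Disallow: {rule:<10} {url} -> {verdict}")
```

In this sketch the rule without a wildcard matches neither URL, the two wildcard rules block both URLs, and the hypothetical `/*&cPath=` rule blocks only the auto-generated one. Test any rule in the search engine's own robots.txt tester before relying on it.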
Related Questions
-
What happens to crawled URLs subsequently blocked by robots.txt?
We have a very large store with 278,146 individual product pages. Since these are all various sizes and packaging quantities of fewer than 200 product categories, my feeling is that Google would be better off making sure our category pages are indexed. I would like to block all product pages via robots.txt until we are sure all category pages are indexed, then unblock them. Our product pages rarely change and have no ratings or product reviews, so there is little reason for a search engine to revisit a product page. The sales team is afraid that blocking a previously indexed product page will result in it being removed from the Google index and would prefer to submit the categories by hand, 10 per day via requested crawling. Which is the better practice?
Intermediate & Advanced SEO | | AspenFasteners1 -
The "webmaster" disallowed all ROBOTS to fight spam! Help!!
One of the companies I do work for has a Magento site. I am simply the SEO guy, and they work the website through some developers who hold access to their systems VERY tightly. Using Google Webmaster Tools I saw that the robots.txt file was blocking ALL robots. I immediately e-mailed them and received a long reply about foreign robots and scrapers slowing down the website. They told me I would have to provide a list of only the good robots to allow in robots.txt. Please correct me if I'm wrong, but isn't robots.txt optional? Won't a bad scraper or bot still bog down the site? Shouldn't that be handled in .htaccess or something different? I'm not new to SEO, but I'm sure some of you who have been around longer have run into something like this and could provide some suggestions or resources I could use to plead my case! If I'm wrong, please help me understand how we can meet both needs of allowing bots to visit the site but preventing the 'bad' ones. Their claim is that the site is bombarded by tons and tons of bots that have slowed down performance. Thanks in advance for your help!
Intermediate & Advanced SEO | | JoshuaLindley0 -
Need help for improving SEO App?
Please refer to our guru99 app: https://play.google.com/store/apps/details?id=com.vector.guru99&hl=en We have tons of free materials related to SAP in the app. The problem we are facing is that we do not rank for terms like SAP or SAP tutorial, and discovery is an issue. I have searched over the internet and found no concrete solution. Can you experts help?
Intermediate & Advanced SEO | | Chirag7530 -
Effect duration of robots.txt file.
On my website there is a demo site that is also indexed in Google, but I no longer need it. So I created a robots.txt file and uploaded it to the server yesterday. In the demo folder there are some HTML files, and I want to remove all of them from Google. But Webmaster Tools is still showing:
User-agent: *
Disallow: /demo/
How long will this take to remove from Google? And is there any alternative way of doing that?
Intermediate & Advanced SEO | | innofidelity0 -
Emergency Help...
Hello All, I'm trying to get a better handle on this, but any help would be hugely appreciated. Per my Pro account, I just found out that the keyword I was seriously trying to rank for, "Boston Wedding Phot*grapher", just declined by over 40 positions. Just last week I was in the #3 position. Needless to say, this is extremely bad. I feel sick from it. This is my livelihood. I recently hired a so-called SEO expert to look at it, but I'm having my doubts. I'm using a PHP-based site with a WordPress blog. He added a bunch of 301 redirects to my .htaccess file for pages that the crawler was complaining about. He also installed the following plugins: Link Juice Keeper NoFollow Free The SEO Rich Snippets Udinra All Image Sitemap WP Robots Txt WP-PageNavi Add Meta Tags These are essentially the only changes made. Does anyone see anything glaring and/or obvious? I could really use some help. My blog link is: http://www.symbolphoto.com/blog/ I'm assuming it's the blog because that's where most of my site content is located. Any advice is hugely appreciated. TIA.
Intermediate & Advanced SEO | | symbolphoto0 -
Need bullet points for a new website on what to do for SEO
Hello, My company just launched a new website, and it looks like it's a competitive market. It's for moving boxes and moving supplies. They want a bullet point list (nothing too specific) of what I will be doing for SEO for the new website. I have been out of the loop with SEO for more than a year, so I'm not sure what the best things to do first are. Any help would be great. Thanks John
Intermediate & Advanced SEO | | maximumrank0 -
Robots.txt is blocking Wordpress Pages from Googlebot?
I have a robots.txt file on my server, which I did not develop; it was done by the web designer at the company before me. Then there is a WordPress plugin that generates a robots.txt file. How do I unblock all the WordPress pages from Googlebot?
Intermediate & Advanced SEO | | ENSO0 -
Reciprocal Links and nofollow/noindex/robots.txt
Hypothetical Situations: You get a guest post on another blog and it offers a great link back to your website. You want to tell your readers about it, but linking the post will turn that link into a reciprocal link instead of a one way link, which presumably has more value. Should you nofollow your link to the guest post? My intuition here, and the answer that I expect, is that if it's good for users, the link belongs there, and as such there is no trouble with linking to the post. Is this the right way to think about it? Would grey hats agree? You're working for a small local business and you want to explore some reciprocal link opportunities with other companies in your niche using a "links" page you created on your domain. You decide to get sneaky and either noindex your links page, block the links page with robots.txt, or nofollow the links on the page. What is the best practice? My intuition here, and the answer that I expect, is that this would be a sneaky practice, and could lead to bad blood with the people you're exchanging links with. Would these tactics even be effective in turning a reciprocal link into a one-way link if you could overlook the potential immorality of the practice? Would grey hats agree?
Intermediate & Advanced SEO | | AnthonyMangia0