Need help with Robots.txt
-
An eCommerce site built with Modx CMS. I found lots of auto generated duplicate page issue on that site. Now I need to disallow some pages from that category. Here is the actual product page url looks like
product_listing.php?cat=6857And here is the auto generated url structure
product_listing.php?cat=6857&cPath=dropship&size=19Can any one suggest how to disallow this specific category through robots.txt. I am not so familiar with Modx and this kind of link structure.
Your help will be appreciated.
Thanks
-
I would actually add a canonical tag and then handle these using the Parameters section of Search Console. That's why it's there, for exactly this type of site with exactly this issue.
-
Nahid, before you use the robots.txt file's disallow for those URLs, you may want to reconsider. You may want to use the canonical tag instead. In the case where you have different sizes, colors, etc. we typically recommend using the Canonical Tag and not the disallow in robots.txt.
Anyhow, if you'd like to use the disallow you can use one of these:
Disallow: /?
or
Disallow: /?cat=
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Disallowed "Search" results with robots.txt and Sessions dropped
Hi
Intermediate & Advanced SEO | | Frankie-BTDublin
I've started working on our website and I've found millions of "Search" URL's which I don't think should be getting crawled & indexed (e.g. .../search/?q=brown&prefn1=brand&prefv1=C.P. COMPANY|AERIN|NIKE|Vintage Playing Cards|BIALETTI|EMMA PAKE|QUILTS OF DENMARK|JOHN ATKINSON|STANCE|ISABEL MARANT ÉTOILE|AMIRI|CLOON KEEN|SAMSONITE|MCQ|DANSE LENTE|GAYNOR|EZCARAY|ARGOSY|BIANCA|CRAFTHOUSE|ETON). I tried to disallow them on the Robots.txt file, but our Sessions dropped about 10% and our Average Position on Search Console dropped 4-5 positions over 1 week. Looks like over 50 Million URL's have been blocked, and all of them look like all of them are like the example above and aren't getting any traffic to the site. I've allowed them again, and we're starting to recover. We've been fixing problems with getting the site crawled properly (Sitemaps weren't added correctly, products blocked from spiders on Categories pages, canonical pages being blocked from Crawlers in robots.txt) and I'm thinking Google were doing us a favour and using these pages to crawl the product pages as it was the best/only way of accessing them. Should I be blocking these "Search" URL's, or is there a better way about going about it??? I can't see any value from these pages except Google using them to crawl the site.0 -
Merging Two Sites: Need Help!
I have two existing e-commerce sites. The older one, is built on the Yahoo platform and had limitations as far as user experience. The new site is built on the Magento 2 platform. We are going to be using SLI search for our search and navigation on the new Magento platform. SLI wants us to 301 all of our categories to the hosted category pages they will create, that will have a URL structure akin to site.com/shop/category-name.html. The issue is: If I want to merge the two sites, I will have to do a 301 to the category pages of the new site, which will have 301s going to the category pages hosted by SLI. I hope this makes sense! The way I see it, I have two options: Do a 301 from the old domain to categories of the new domain, and have the new domain's categories 301 to the SLI categories; or, I can do my 301s directly to the SLI hosted category pages. The downside of #1 is that I will be doing two 301s, and I know I will lose more link juice as a result. The upside of #1, is that if decide not to use SLI in the future, it is one less thing to worry about. The downside of #2, is that I will be directing all the category pages from the old site to a site I do not ultimately control. I appreciate any feedback.
Intermediate & Advanced SEO | | KH20171 -
Our parent company has included their sitemap links in our robots.txt file - will that have an impact on the way our site is crawled?
Our parent company has included their sitemap links in our robots.txt file. All of their sitemap links are on a different domain and I'm wondering if this will have any impact on our searchability or potential rankings.
Intermediate & Advanced SEO | | tsmith1310 -
Robots.txt Blocked Most Site URLs Because of Canonical
Had a bit of a "Gotcha" in Magento. We had Yoast Canonical Links extension which worked well , but then we installed Mageworx SEO Suite.. which broke Canonical Links. Unfortunately it started putting www.mysite.com/catalog/product/view/id/516/ as the Canonical Link - and all URLs with /catalog/productview/* is blocked in Robots.txt So unfortunately We told Google that the correct page is also a blocked page. they haven't been removed as far as I can see but traffic has certainly dropped. We have also , at the same time had some Site changes grouping some pages & having 301 redirects. Resubmitted site map & did a fetch as google. Any other ideas? And Idea how long it will take to become unblocked?
Intermediate & Advanced SEO | | s_EOgi_Bear0 -
Help with domain configuration in international markets
We have just taken on a new client who’s core business is based in the UK. They are currently developing a market in Australia and are hoping to develop future markets over the next few years. The website is currently hosted in the UK, under a .co.uk domain, we are redeveloping the website and are wondering if we should move to the .com as the company is now catering to an international market. I was also wondering what the best way of establishing a presence in Australia, using the main website, rather than creating a new one. Thanks Fraser
Intermediate & Advanced SEO | | fraserhannah0 -
Reciprocal Links and nofollow/noindex/robots.txt
Hypothetical Situations: You get a guest post on another blog and it offers a great link back to your website. You want to tell your readers about it, but linking the post will turn that link into a reciprocal link instead of a one way link, which presumably has more value. Should you nofollow your link to the guest post? My intuition here, and the answer that I expect, is that if it's good for users, the link belongs there, and as such there is no trouble with linking to the post. Is this the right way to think about it? Would grey hats agree? You're working for a small local business and you want to explore some reciprocal link opportunities with other companies in your niche using a "links" page you created on your domain. You decide to get sneaky and either noindex your links page, block the links page with robots.txt, or nofollow the links on the page. What is the best practice? My intuition here, and the answer that I expect, is that this would be a sneaky practice, and could lead to bad blood with the people you're exchanging links with. Would these tactics even be effective in turning a reciprocal link into a one-way link if you could overlook the potential immorality of the practice? Would grey hats agree?
Intermediate & Advanced SEO | | AnthonyMangia0 -
Google +one button - help needed
I have installed Google plus button on my homepage. My GA report is being displayed like this - 7 total events were recorded Total Events Unique Events 7 2 Event action Total Events on 4 off 3 My questions are - 1 ) Is someone clicking on Google plus button considered an Event ? 2 ) What is meant by 2 Unique Events 3 ) Why does GA report shows on 4 and off 3
Intermediate & Advanced SEO | | seoug_20050 -
Category Pages - Canonical, Robots.txt, Changing Page Attributes
A site has category pages as such: www.domain.com/category.html, www.domain.com/category-page2.html, etc... This is producing duplicate meta descriptions (page titles have page numbers in them so they are not duplicate). Below are the options that we've been thinking about: a. Keep meta descriptions the same except for adding a page number (this would keep internal juice flowing to products that are listed on subsequent pages). All pages have unique product listings. b. Use canonical tags on subsequent pages and point them back to the main category page. c. Robots.txt on subsequent pages. d. ? Options b and c will orphan or french fry some of our product pages. Any help on this would be much appreciated. Thank you.
Intermediate & Advanced SEO | | Troyville0