Robots.txt file - How to block thousands of pages when you don't have a folder path
-
Hello.
Just wondering if anyone has come across this and can tell me whether it worked.
Goal:
To block review pages.
Challenge:
The URLs aren't constructed using folders; they look like this:
www.website.com/default.aspx?z=review&PG1234
www.website.com/default.aspx?z=review&PG1235
www.website.com/default.aspx?z=review&PG1236
So the first part of the URL is the same (i.e. /default.aspx?z=review) and the unique part comes immediately after it, not as a folder. Google's recommendations only show examples for blocking 'folder directories' and 'individual pages'.
Question:
If I add the following to the robots.txt file, will it block all review pages?
User-agent: *
Disallow: /default.aspx?z=review
Many thanks,
Davinia
-
Also remember that blocking in robots.txt doesn't prevent Google from indexing those URLs. If the URLs are already indexed, or if they are linked to internally or externally, they may still appear in the index with limited snippet information. If so, you'll need to add a noindex meta tag to those pages. Keep in mind that Google can only see a noindex tag on pages it is allowed to crawl, so the tag won't take effect while the URL is disallowed in robots.txt.
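If it helps, here is a rough Python sketch for checking whether a page's HTML already carries a robots noindex meta tag, i.e. something like `<meta name="robots" content="noindex">`. The class and function names (`RobotsMetaParser`, `has_noindex`) are made up for this example, and it only inspects an HTML string you pass in; fetching the review pages is left out.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Records whether a <meta name="robots"> tag contains the noindex directive."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # attrs is a list of (name, value) pairs; value can be None.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            directives = [d.strip().lower() for d in attr.get("content", "").split(",")]
            if "noindex" in directives:
                self.noindex = True


def has_noindex(html: str) -> bool:
    """Hypothetical helper: True if the HTML has a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.noindex


page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(has_noindex(page))
```

This only looks at the generic `robots` meta name; crawler-specific names such as `googlebot` are not handled here.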
-
An * added to the end! Great, thank you!
-
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=156449
Head down to the pattern matching section.
I think
User-agent: *
Disallow: /default.aspx?z=review*
should do the trick though.
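For anyone who wants to sanity-check a rule before deploying it, here is a small Python sketch of the pattern matching Google documents for robots.txt: a rule matches from the start of the path, `*` matches any run of characters, and a trailing `$` anchors the end. It is an approximation written for this thread, not Google's actual matcher, and `rule_matches` is a made-up helper name. Under those rules the trailing `*` is redundant, so both versions of the Disallow line should cover the review URLs.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches the start of the URL path."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape regex metacharacters such as '?' and '.', then treat '*' as a wildcard.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match only matches at the beginning of the string, giving prefix semantics.
    return re.match(regex, path) is not None


review_paths = [
    "/default.aspx?z=review&PG1234",
    "/default.aspx?z=review&PG1235",
    "/default.aspx?z=review&PG1236",
]

for rule in ("/default.aspx?z=review", "/default.aspx?z=review*"):
    blocked = all(rule_matches(rule, path) for path in review_paths)
    print(rule, "blocks all review pages:", blocked)
```

It is still worth confirming the live file with the robots.txt testing tools your search engine provides, since this sketch ignores things like Allow rules and rule precedence.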