Robots.txt file - How to block thousands of pages when you don't have a folder path

Unity

Hello.
Just wondering if anyone has come across this and can tell me if it worked or not.

Goal:
To block review pages

Challenge:
The URLs aren't constructed using folders, they look like this:
www.website.com/default.aspx?z=review&PG1234
www.website.com/default.aspx?z=review&PG1235
www.website.com/default.aspx?z=review&PG1236

So the first part of the URL is always the same (i.e. /default.aspx?z=review) and the unique part comes immediately after it, in the query string rather than as a folder. Google's recommendations only show examples for blocking 'folder directories' and 'individual pages'.

Question:
If I add the following to the Robots.txt file will it block all review pages?

User-agent: *
Disallow: /default.aspx?z=review

Much thanks,
Davinia

Klarke

Also remember that blocking in robots.txt doesn't prevent Google from indexing those URLs. If the URLs are already indexed, or if they are linked to either internally or externally, they may still appear in the index with limited snippet information. If so, you'll need to add a noindex meta tag to those pages. Keep in mind that Google has to be able to crawl a page to see that tag, so the pages can't be blocked in robots.txt while you wait for them to drop out.
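
If it helps, here is a rough Python sketch for checking whether a page's HTML actually carries a robots noindex meta tag. The class name, function name and sample markup are made up for illustration, and it only looks at the meta tag, not at HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # Directives are comma separated, e.g. "noindex, follow".
        for directive in content.split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def has_noindex(html: str) -> bool:
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical review page markup, used only as sample input.
SAMPLE_PAGE = (
    '<html><head><meta name="robots" content="noindex, follow"></head>'
    "<body>Review</body></html>"
)

print(has_noindex(SAMPLE_PAGE))
```

You would feed it the HTML of one of the review pages after the template has been updated.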

Unity

An * added to the end! Great thank you!

Klarke

http://support.google.com/webmasters/bin/answer.py?hl=en&answer=156449

Head down to the pattern matching section.

I think

User-agent: *
Disallow: /default.aspx?z=review*

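should do the trick though. As far as I know, Google treats a Disallow value as a prefix match anyway, so the trailing * is optional and the rule without it should block the same URLs.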
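
If you want to sanity check a rule before deploying it, here is a minimal Python sketch of Google-style pattern matching as I understand it: rules match from the start of the path, * stands for any run of characters, and a trailing $ anchors the end. The function name rule_matches is my own, and this is an approximation for testing, not Google's actual parser.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches a URL path plus query."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with ".*" so that
    # every "*" in the rule matches any run of characters.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match only matches at the start of the string, which gives the
    # prefix behaviour of a Disallow rule.
    return re.match(regex, path) is not None


# The review URLs from the question, as path plus query string.
review_paths = [
    "/default.aspx?z=review&PG1234",
    "/default.aspx?z=review&PG1235",
    "/default.aspx?z=review&PG1236",
]

for rule in ("/default.aspx?z=review", "/default.aspx?z=review*"):
    blocked = all(rule_matches(rule, p) for p in review_paths)
    print(rule, "blocks all review pages:", blocked)
```

Google's robots.txt testing tool in Webmaster Tools is still the authority, so treat this as a quick check only.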
